3D perspective-matched video overlay

Hi. I would like to take a video clip and layer it semi-transparently on top of another video clip, maybe with chroma key, 3D-transformed to match the bottom clip’s perspective, to create an augmented-reality type of effect. What do I have to do to get this effect in Shotcut? Could HTML5 filters help here in some way? Can HTML5 or WebGL filters be used for actual video processing with video input and output, not just for rendering text and graphics? I couldn’t find good documentation on this. The bundled filters didn’t work on my Mac mini running OS X 10.8, and dragging and dropping WebGL pages onto Shotcut either produced no sensible output or crashed the whole application. I probably did something wrong with them.

The thing I’m trying to do is to layer a MIDI keyboard visualization on top of video footage of an actual piano or other keyboard instrument being played. Usually, piano tutorial videos show the two views separately: camera view in one and MIDI based visualization in another area of the video window. I think that’s not a particularly good visualization, and it shouldn’t be necessary to do it like that with all the graphics processing muscle we have.

Another thing: I can use a Mac mini i7 that doesn’t really have a GPU worth mentioning, or a Linux PC that has a nicely powerful Nvidia GTX-670 GPU. I’ve tried using Shotcut on the Mac, and I haven’t really got any sensible results from the HTML5 filters with it, so does it really need a more powerful GPU and/or Linux? Mostly on the Mac (OSX 10.8), the HTML5 filters just don’t seem to produce more than a black or static screen, or a crash. I wouldn’t like to start using the Linux machine for video editing, but if that’s actually what’s needed, then I’ll have to.

What I’m looking for is basically the two video sources in frame buffers, and a way to transform the overlay video frames and mix them with the base video frames. The actual transformation shouldn’t be too hard to do once the “infrastructure” is in place. What would be the proper way to do it: HTML5? WebGL+GLSL? Native x64 coding? Or just some existing filter with the right parameters?

Yes, a WebGL-based HTML5 filter could do this. The Rutt-Etra filter is an example of actually accessing pixel byte data in JavaScript, but one can also simply use the frame of video as a GL texture.

Now, the existing Rotate filter is actually a limited version of a general affine transform filter. A filter has two components: the non-UI core processing in MLT (or one of its dependencies), and the UI in Shotcut. The Shotcut Rotate UI filter corresponds to the MLT general affine transform filter. You can easily make a copy of Rotate and add more parameters. Here are the docs on the MLT affine:
https://www.mltframework.org/bin/view/MLT/TransitionAffine

It sounds like your Mac is not good enough to drive the HTML5 filters for whatever reason. They work for me on the two Macs I have: a hackintosh running 10.8 and a MacBook Pro 13 running 10.11.

So, I would try making the custom affine filter. Simply go into Shotcut.app/Contents/MacOS/share/shotcut/qml/filters. Copy the rotate folder to “affine.” Edit affine/meta.qml with a text editor and change “objectName” and “name”. Next, open the affine/ui.qml file in a text editor. Copy and paste blocks of Label+SliderSpinner+UndoButton to map those to the additional MLT parameters available in the doc link I provided above.

I would think about cranking something out for this for the next release, but I really want to give this more thought with respect to an interactive video overlay UI instead of a bunch of sliders. Give it a shot! There is hardly anyone hacking these easily modifiable filter UIs! You can learn more about QML from qt.io.

Thanks, I was able to get the Rotation filter working, and following your advice I made a more general “AffineGeneric” filter, with parameters RotationX, RotationY, RotationZ, ShearX, ShearY, ShearZ, ScaleX, ScaleY, X offset, Y offset. The first time, the UI didn’t show up at all, because I had made a copy/paste error and had duplicate SliderSpinner ids for ScaleX and ScaleY, or something like that. Once I fixed that, the UI worked and I was able to play with the affine transform parameters.

One little problem still remains, namely that perspective mapping is not an affine transform. :) The rectangle would have to be divided into a relatively dense polygon mesh to be able to use affine texture mapping for the individual polygons without looking obviously wrong.
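
To make the affine-versus-perspective point concrete, here is a minimal Python sketch, not code from Shotcut or MLT. It solves for the 3x3 projective matrix that maps the four corners of a rectangle onto four arbitrary corners, then maps the rectangle's centre. The function names (`solve_linear`, `homography`, `project`) and the example coordinates are assumptions made up for this illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 matrix (bottom-right entry fixed to 1) mapping
    four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, x, y):
    """Apply the matrix, including the divide by w that an affine map lacks."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical example: a unit square seen as a trapezoid.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
trapezoid = [(0, 0), (4, 0), (3, 2), (1, 2)]
H = homography(square, trapezoid)
centre = project(H, 0.5, 0.5)
print(centre)
```

The centre of the square lands where the trapezoid's diagonals cross, which is not the average of the four corners. An affine map always sends the centre to that average, so no single affine transform can reproduce this mapping.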

Anyway, it was an encouraging experience. I’ll look into the MLT library and try hacking some more. Maybe I’ll have to use the Linux machine, but I’d like to avoid that option, because I’ll be doing the audio+MIDI+screen capture recording on the Mac. I’d have to use both computers and move files over the network… Maybe it wouldn’t be that bad. I’m planning to record the live camera footage with a Canon DSLR and use Canon’s Mac-based remote-control software for it.

Here’s the general affine filter I made with your instructions: http://www.kameli.net/~yzi/affine.zip

OK, for that you are probably better off with WebGL + three.js or Blender. Shotcut is not exactly aiming for that level yet. I was hoping this would work for you without too much work since HTML filters are giving you trouble on your OS X machine.

I won’t give up just yet! :) I downloaded the MLT framework’s sources and managed to compile it on the Mac (I just had to configure it with frei0r, jackrack and SDL support disabled). The modules/plus/transition_affine.c thing looks like it should be pretty straightforward to turn into a more general texture mapper. Instead of using a textbook affine mapping with a matrix and everything, I’ll supply the parameters as two sets of vertex coordinates, and do the texture mapping like it’s done in good old scanline-rendering 3D engines (which I used to write for fun in the 1990s, until 3D accelerators made them redundant).
It’ll take a while, because I’ll have to learn the MLT framework, how the plugins work, how to use the melt command line application for testing, and maybe copy the modified libmltplus.dylib + metadata into Shotcut.app etc. Thanks for your help!
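
For readers unfamiliar with the scanline approach mentioned above, here is a small Python sketch of perspective-correct interpolation along one scanline. It is only an illustration of the general technique and does not reflect the MLT code. The function names and the sample depth values are assumptions. The idea is that u/z and 1/z vary linearly in screen space, so those are interpolated and divided per pixel, whereas affine mapping interpolates u directly.

```python
def affine_span(u0, u1, n):
    """Interpolate the texture coordinate linearly in screen space."""
    return [u0 + (u1 - u0) * i / (n - 1) for i in range(n)]


def perspective_span(u0, z0, u1, z1, n):
    """Interpolate u/z and 1/z linearly across the span, then divide per pixel."""
    out = []
    for i in range(n):
        t = i / (n - 1)
        u_over_z = (1 - t) * u0 / z0 + t * u1 / z1
        one_over_z = (1 - t) / z0 + t / z1
        out.append(u_over_z / one_over_z)
    return out


# Hypothetical span: the left end is at depth 1, the right end at depth 3.
affine = affine_span(0.0, 1.0, 3)
correct = perspective_span(0.0, 1.0, 1.0, 3.0, 3)
print(affine)
print(correct)
```

At the middle pixel the affine version samples the texture at 0.5, while the perspective-correct version samples at 0.25, because the nearer part of the surface covers more of the screen. Both agree at the endpoints, which is why a dense enough mesh hides the error.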

Ok, I took the Affine transition and generalized it a bit. Here’s the first working test


I extended the affine transform struct by adding a 3x1 translation vector, and now it’s more like a proper 3D transform. I’ll upload the source somewhere once I get it nice and tidy.

Thank you for the update. If this changes the behavior of the existing transition, then please make it a new one, use a different set of properties for the new behavior, or provide a new property to trigger the new behavior. I cannot accept a change that will break the fidelity of results for people’s old projects or existing scripts. Of course, I could also accept your contribution and change it to conform with these rules, but it will expedite inclusion if it is all ready.

Hello! Did you succeed in creating settings for the Affine transform? I’ve found a nice feature in Kdenlive, and I think it could be used in Shotcut as well.