Clarification on what "GPU acceleration" means

Before I jump into filter design with both feet, could I get clarification on what “GPU acceleration” means within Shotcut? I can imagine four stages:

  1. Decoding from source video. ffmpeg can sometimes do hardware decoding when conditions are right. Is Shotcut using ffmpeg in a way that would realize hardware decoding benefits? Or does ffmpeg always decode purely on the CPU? If ffmpeg used hardware, would that explain why some people get great performance and others do not?

  2. Filter processing. This is the obvious GPU acceleration space, I suppose.

  3. Compositing and preview. Can track compositing and preview scaling be done by GPU? Are there conditions that will sometimes prevent this from being possible? Oh wait, I think Dan already said that preview scaling was done as a fragment shader. I’m unclear on compositing, though. All I could find was that it worked in 16-bit space, which sounds kinda OpenGL-y.

  4. Encoding the output video. Pretty straightforward… hardware encoding is already a feature on all platforms.

I’ve glanced through the source code and seen things like only four CPU threads allocated for preview. My CPU usage also suggests that ffmpeg is decoding in software. This has me wondering… What performance gains could we really expect from GPU filters if there happens to be a large bottleneck in decoding and compositing?

My understanding (possibly wrong) is that moving a frame from CPU to GPU and back is very expensive, and that getting DaVinci Resolve-like responsiveness requires an all-in approach to using the GPU through the entire pipeline. In that light, can anyone provide guidance on when it’s worth the effort to design a Shotcut filter in GPU, and when it just won’t matter due to other bottlenecks?
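
To put rough numbers on that (my own back-of-envelope, so treat it as an estimate): an 8-bit RGBA 1080p frame is 1920 × 1080 × 4 ≈ 8.3 MB, so uploading and downloading every frame at 30 fps moves only about 500 MB/s, well within PCIe bandwidth. That suggests the real cost of the round trip is pipeline stalls (a readback like glReadPixels blocks until the GPU finishes) rather than raw bandwidth.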

I’m also assuming that decoding is the biggest bottleneck in the chain. If that’s true, is it accurate to say that with common filters that are already GPU-accelerated, Shotcut may already be operating as fast as the hardware allows, and it would take a hardware H.264 decoder (as an example) to go any faster? If these bottlenecks are real, and CPU-GPU-CPU movement is that big a deal, then is GPU processing even an important roadmap item?

Disclaimer: My GPU experience is very limited, which is why I’m asking. I never use the GPU option because I like the widest cross-platform compatibility I can get. Create on Windows, then export on a Linux farm. And I get better exports with libx264 than I can get with hardware.

Thanks in advance.

Your numbered list already covers most of the areas. However, there is one more area, and some reorganization:

  1. Decoding could be GPU-accelerated, but it currently is not in Shotcut. It is difficult when dealing with multiple video tracks, but I want to eventually bring it to the Source player and the bottom video track. Another difficulty is converting the decoded output to an OpenGL texture for subsequent processing without making the trip through RAM, and then doing this for the variety of APIs across vendors on each OS (see the decoding sketch after this list). Decoding is actually not a huge bottleneck in the overall pipeline thanks to heavy optimization within FFmpeg and CPUs (SIMD).

  2. Image Processing. This covers filters, transitions, and compositing/blending. This is the biggest bottleneck because most of the CPU code lacks much optimization. As a result, instead of optimizing each filter individually, we have been throwing multi-threading at it via multiple slices of an image and multiple frames at a time (the so-called Parallel processing in Export); see the slice-threading sketch after this list. GPU acceleration can currently be done in a few ways: OpenGL via Movit, WebGL via WebVfx, and possibly something in FFmpeg libavfilter with OpenCL, although that is not currently in Shotcut. When we write “GPU Effects” in the context of the hidden Shotcut Settings menu entry, we mean Movit. Movit uses 16-bit floating point per color component, which is roughly equivalent to 10-bit integer. It is also able to chain filters together into a combined shader instead of rendering each effect to a texture, let alone moving the video data between CPU and GPU RAM. Thus, it is very good and the best approach. However, it is not user-extensible like WebGL, and it is currently impossible to bring the same benefits of Movit to WebGL. It could be possible to define a filtering framework within a single WebGL filter, but that would not interact with the transitions and compositing.

  3. Preview. Shotcut uses custom OpenGL to display video, both as a cross-platform API and to offload a last-step colorspace conversion and scaling to the display viewport (including zoom); see the shader sketch after this list.

  4. Encoding. You got it.

  5. The user interface, specifically the Timeline, Keyframes, and Filters panels. These use Qt Quick (QML), which is an OpenGL scene graph API. Basically, each UI object renders to an OpenGL texture, and these are then arranged and composited. This is where Settings > Drawing Method comes into play on Windows. The Chrome team found WebGL so unreliable on many OpenGL implementations that they created middleware (ANGLE) to convert OpenGL to Direct3D. macOS avoids this problem because it has relied heavily on OpenGL itself for many years. Linux desktops can often have this problem, but there is nothing like Direct3D to fall back to except Mesa-based software rendering (now also available in Shotcut for Windows as of v19.04). Shotcut also uses this technology to overlay a UI on the preview video, something we call VUI: think of the rectangle control for Size and Position, although you can do other things.
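
To make item 1 concrete, here is a minimal sketch of how hardware decoding is typically wired up with the FFmpeg C API. The VAAPI device type and the helper names are my illustrative assumptions; none of this is current Shotcut/MLT code.

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>
}

// Attach a hardware device to a decoder context. The device type varies
// per vendor and OS (VAAPI, D3D11VA, VideoToolbox, ...), which is part of
// the portability problem described in item 1.
bool attach_hw_device(AVCodecContext *ctx)
{
    AVBufferRef *hw_device_ctx = nullptr;
    if (av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_VAAPI,
                               nullptr, nullptr, 0) < 0)
        return false;
    ctx->hw_device_ctx = av_buffer_ref(hw_device_ctx);
    av_buffer_unref(&hw_device_ctx);
    return true;
}

// After avcodec_receive_frame(), the pixels live in GPU memory. Getting
// them back for CPU filters costs a copy through RAM -- the trip through
// RAM mentioned in item 1.
bool download_frame(AVFrame *hw_frame, AVFrame *sw_frame)
{
    return av_hwframe_transfer_data(sw_frame, hw_frame, 0) >= 0;
}
```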
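
And here is roughly what the slice-based multi-threading in item 2 looks like, using a toy invert "filter"; the function names are made up for illustration and are not Shotcut/MLT API.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

// Run the per-pixel work on rows [y0, y1) only.
static void invert_slice(uint8_t *rgba, int width, int y0, int y1)
{
    for (int y = y0; y < y1; ++y)
        for (int x = 0; x < width * 4; ++x)      // 4 bytes per RGBA pixel
            rgba[y * width * 4 + x] = 255 - rgba[y * width * 4 + x];
}

// Split the frame into horizontal bands, one per hardware thread, so the
// unoptimized per-pixel code still scales across CPU cores.
void invert_parallel(uint8_t *rgba, int width, int height)
{
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    int rows = (height + int(n) - 1) / int(n);   // rows per slice, rounded up
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n; ++i) {
        int y0 = int(i) * rows;
        int y1 = std::min(height, y0 + rows);
        if (y0 >= y1) break;
        workers.emplace_back(invert_slice, rgba, width, y0, y1);
    }
    for (auto &t : workers) t.join();
}
```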
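
Finally, a sketch of the kind of last-step fragment shader meant in item 3: the GPU converts YUV to RGB while the rasterizer scales to the viewport for free. The BT.709 video-range coefficients are standard, but the shader itself is my illustration, not the one Shotcut ships.

```cpp
static const char *kYuvToRgbFragmentShader = R"(
    uniform sampler2D yTex;   // luma plane
    uniform sampler2D uTex;   // chroma planes (half resolution for 4:2:0)
    uniform sampler2D vTex;
    varying vec2 texCoord;    // interpolated by the rasterizer = free scaling

    void main() {
        float y = texture2D(yTex, texCoord).r - 0.0625;  // 16/256 black level
        float u = texture2D(uTex, texCoord).r - 0.5;
        float v = texture2D(vTex, texCoord).r - 0.5;
        gl_FragColor = vec4(1.164 * y + 1.793 * v,
                            1.164 * y - 0.213 * u - 0.533 * v,
                            1.164 * y + 2.112 * u,
                            1.0);
    }
)";
```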

It remains to be seen where all this dependence upon OpenGL will go as Vulkan and Metal take over.


Adding to this question, are any or all of the filters/effects in Shotcut currently Frei0r? I heard that Frei0r effects are only CPU-based and do not support the GPU. Is that true? If so, then for future filters/effects that are created for Shotcut, Frei0r should not be an option.

Some.

I heard that Frei0r effects are only CPU-based and do not support the GPU. Is that true?

Correct
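
For a bit of context on why: the frei0r plugin API hands an effect plain pixel buffers in system RAM; there is no GPU texture, context, or device anywhere in the interface. A minimal, hypothetical pass-through plugin shows the whole contract (only the names from frei0r.h are real; the plugin itself is illustrative):

```cpp
#include <frei0r.h>
#include <cstring>

struct Instance { unsigned width, height; };

extern "C" {

int  f0r_init() { return 1; }
void f0r_deinit() {}

void f0r_get_plugin_info(f0r_plugin_info_t *info)
{
    info->name = "passthrough";                  // hypothetical example plugin
    info->author = "example";
    info->plugin_type = F0R_PLUGIN_TYPE_FILTER;
    info->color_model = F0R_COLOR_MODEL_RGBA8888;
    info->frei0r_version = FREI0R_MAJOR_VERSION;
    info->major_version = 0;
    info->minor_version = 1;
    info->num_params = 0;
    info->explanation = "Copies input to output";
}

// No parameters in this example, so these are empty stubs.
void f0r_get_param_info(f0r_param_info_t *, int) {}
void f0r_set_param_value(f0r_instance_t, f0r_param_t, int) {}
void f0r_get_param_value(f0r_instance_t, f0r_param_t, int) {}

f0r_instance_t f0r_construct(unsigned int width, unsigned int height)
{
    return new Instance{ width, height };
}

void f0r_destruct(f0r_instance_t instance)
{
    delete static_cast<Instance *>(instance);
}

// The whole contract is here: read one buffer in system RAM, write another.
// Nothing can stay resident on the GPU between filters.
void f0r_update(f0r_instance_t instance, double /*time*/,
                const uint32_t *inframe, uint32_t *outframe)
{
    Instance *self = static_cast<Instance *>(instance);
    std::memcpy(outframe, inframe,
                size_t(self->width) * self->height * sizeof(uint32_t));
}

} // extern "C"
```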

If so, then for future filters/effects that are created for Shotcut, Frei0r should not be an option.

No, this is not your decision. I notice you seem to be trying to act as the product or project manager with some of your comments like this, and you are not.

:face_with_raised_eyebrow:

Uh… no. I am simply offering suggestions. Your pessimism is making you paranoid.

I also received that comment negatively at first. The frei0r library and the existing built-in features of the MLT framework are valuable assets, and I am eager to continue to make them available and improve them. Using the GPU more is a great aspiration, but it requires a skill set that I don’t currently possess, and I don’t have a lot of motivation to go learn it right now because I can add more value to Shotcut by continuing to build new features that use the CPU.

Thanks for taking the time to provide a very detailed explanation. That helps a lot. Much respect for the way you’ve leveraged so many diverse technologies to make Shotcut as fast as it is.

I plan to always create CPU versions of filters because 1) it’s easier and 2) it allows Shotcut to continue being THE gold standard for low-cost video editing. If a souped-up computer with a beefy video card is required for Shotcut to work, I’m now in the same expense category as Resolve and I may as well use it. My curiosity was whether it would be worth the effort to also create GPU versions of filters. I realize that the decision would be very workload-dependent. But the more I understand the plumbing, the easier the decision would be to make.

Shotcut had “GPU Effects” as an option. If GPU acceleration support were fully implemented, “GPU Effects” could still be kept as an option so that a beefy video card is not a requirement; someone with a low-end computer could just turn it off.
