Is rendering on OSX GPU-accelerated?

Is there any reason the h264_videotoolbox codec is not using the NVIDIA GPU (at all) on my Mac?

On a similar note, is there a reason libx264 is not fully utilizing my CPU on Mac?

Handbrake uses the same libx264, and it fully loads my Macs, while Shotcut loads about half the CPU and runs about 50% slower. It almost looks like a number-of-cores vs. number-of-threads kind of thing.

On Windows libx264 fully loads my CPU, same as Handbrake.

It depends on what your project is doing. You can try enabling parallel processing in Export > Video if you are using any effects or even simply scaling. If you want to make the simplest test to compare with something like Handbrake, set your Video Mode to Automatic, open a clip, do not add it to the Playlist or Timeline, and simply export it without changing anything in Export > Video. Effects are rendered on the CPU - sometimes with parallelism, but not always - and are often going to be a bottleneck for a hardware encoder.

What model gpu is it? unless it’s a mac pro with a quadro(or a hackintosh) it might be too old they haven’t used nvidia cards in a long time.

VideoToolbox works on my MacBook with Intel graphics. I do not know which device is chosen if you have both a Quick Sync capable Intel GPU and an NVIDIA one. You might be able to see it in the export job log. One thing is for sure: if you have a hardware encoder turned on and it is not working, you will either get no video in the export or the job may fail. If you turn it on and you get video, it is working.
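If you want to double-check what actually landed in the export, a minimal sketch using ffprobe (which ships alongside ffmpeg; the filename is just an example):

    # prints the video stream's codec and dimensions, or nothing if there is no video
    ffprobe -v error -select_streams v:0 \
      -show_entries stream=codec_name,width,height out.mp4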

I only tried it on a Hackintosh with a GT 640, and Activity Monitor did not show any GPU being used, neither Intel nor NVIDIA.

Well, I did try h264_videotoolbox with my “kosher” MBP with a GT 750M too, but to be honest I don’t remember if I looked at Activity Monitor for GPU utilization or not, so I won’t make a claim just yet.

So what’s everyone’s experience with h264_videotoolbox overall vs. libx264?

Is h264_videotoolbox acceleration for Intel GPUs only? What about NVENC, like on Windows, for OSX? Or Metal or OpenGL?

I did get encodes with some poor results, so I am pretty sure it went through h264_videotoolbox. I just didn’t see GPU utilization during the encodes. I believe Activity Monitor with some sub-windows turned on is the proper way to monitor GPU utilization on OSX.

So the GT 640 is a GK107 chip (maybe GK208, depending on the variant - good job, NVIDIA…) and only supports H.264 (AVCHD) YUV 4:2:0 per NVIDIA’s spec here: https://developer.nvidia.com/video-encode-decode-gpu-support-matrix. I can’t speak much to the Apple side of things specifically, but as far as the GPU is concerned, odds are you’re better off with just pure CPU in this case.
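If you want to sanity-check what a given NVENC generation will accept without hunting for source material, a minimal sketch using ffmpeg’s built-in test source (assumes an ffmpeg build with h264_nvenc enabled; the parameters are just example values):

    # 5-second synthetic 1080p30 encode; errors out if the GPU rejects the job
    ffmpeg -f lavfi -i testsrc2=size=1920x1080:rate=30 -t 5 \
      -c:v h264_nvenc -f null -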

No, VideoToolbox is a macOS framework that provides an abstraction over different hardware.
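(In FFmpeg-based tools such as Shotcut, that framework is exposed as the h264_videotoolbox encoder, and VideoToolbox itself decides which hardware does the work. A minimal sketch of driving it directly - the filenames and bitrate are just examples:)

    # the caller picks the codec; VideoToolbox picks the hardware behind it
    ffmpeg -i input.mov -c:v h264_videotoolbox -b:v 12M -c:a aac output.mp4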

What about NVENC, like on Windows, for OSX?

NVIDIA says NVENC is for Windows and Linux only, and Apple and NVIDIA had a falling out. I do not know the state of it. Handbrake’s page seems to confirm this.

I am pretty sure it went through h264_videotoolbox

Maybe there is some software fallback in macOS.
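One way to test that theory: ffmpeg’s VideoToolbox encoders expose allow_sw and require_sw options (assuming a reasonably recent ffmpeg build; by default, hardware is required). A minimal sketch:

    # the default allow_sw=0 means VideoToolbox must use the hardware encoder,
    # so if this succeeds, the encode is genuinely hardware accelerated
    ffmpeg -i input.mov -c:v h264_videotoolbox -b:v 12M out_hw.mp4

    # force the software path for comparison
    ffmpeg -i input.mov -c:v h264_videotoolbox -require_sw 1 -b:v 12M out_sw.mp4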

This is really bizarre!

But first of all, I am doing just Stabilization on 1080i MTS files.

On some timelines I can see my GT 640 being utilized at about 25% while the CPU goes to about 35%. On other timelines the NVIDIA GPU is not being used at all and the CPU is at about 30%…

So I figure it comes down to the following question: “how do I max out my GPU?”

Off-topic question - how do I run Shotcut headless (/Applications/Shotcut.app/Contents/MacOS/melt) to encode an MLT project to a certain H.264 output? And how do I time that?

[I want to time the difference between the timelines that use the GPU and the ones that do not.]

All my 1080i clips are shot on the same old camera, BTW.

Read the FAQ

Link please!

The best I found is this, but then I am lost as to what to add after melt my.mlt -consumer avformat:output.mp4 ???

How do I time it, too?
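A minimal sketch of what such an invocation could look like, assuming the consumer properties after avformat: are passed straight through to the encoder, and using the shell’s time built-in for the timing (the codec settings are just example values):

    # headless export of a project to H.264, timed
    time /Applications/Shotcut.app/Contents/MacOS/melt my.mlt \
      -consumer avformat:output.mp4 vcodec=libx264 crf=23 preset=fast acodec=aac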

Over the last week I’ve done quite a few encodes on different computers, and on the same computers, comparing OSX (h264_videotoolbox) vs. Windows (NVENC).

My conclusion: h264_videotoolbox is NOT using the GPU, and the same encodes on Windows are about 150% faster on the GT 640 and GT 7xxM.

On Windows I am getting from 30% to 100% GPU utilization with NVENC on my GT 640.
On the Hackintosh, on the same computer, I am getting 0% GPU utilization with h264_videotoolbox (Adobe Premiere/Media Encoder uses my GPU at 30-60% depending on the encode, on either Windows or OSX).

On the “kosher” MacBook Pro with a GT 7xxM I am getting the same 0% GPU utilization with h264_videotoolbox (Adobe Premiere/ME uses that GPU at about the same 60%).

For me, libx264 is about 125% slower than h264_videotoolbox, and in the 175%-200% range slower than NVENC on Windows.

I suspect Adobe is using CUDA and not VideoToolbox. On my MacBook I do get some GPU utilization with h264_videotoolbox on 1080p60:

Notice how much the CPU is used: even though there is no scaling, the H.264 decoding happens on the CPU. If I switch to hevc_videotoolbox, there is almost no GPU utilization:

But it is much faster than x265 (the first job is the same input with hevc_videotoolbox):

I am not confident macOS Activity Monitor accurately reports GPU utilization with Intel hardware encoding. See how much faster hevc_videotoolbox is compared with x265, and how different their CPU utilization is? And x265 is not slow for a software HEVC encoder. It is just not really possible to get that kind of HEVC performance in software unless it is severely neutered. My conclusion: the hardware utilization is not accurately reported.
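For anyone who wants to reproduce that comparison outside Shotcut, a minimal sketch (filenames, bitrate, and crf are just example values; libx265 is quality-driven while VideoToolbox wants a target bitrate):

    # hardware HEVC via VideoToolbox (hvc1 tag for QuickTime compatibility)
    ffmpeg -i input.mov -c:v hevc_videotoolbox -b:v 10M -tag:v hvc1 out_hw.mp4

    # software HEVC via x265 for comparison
    ffmpeg -i input.mov -c:v libx265 -crf 26 out_sw.mp4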

I believe Adobe uses straight OpenCL to accelerate its encoding. Anecdotal evidence: the first time you launch Adobe Media Encoder, it runs a compiler for a while to build the OpenCL stuff.

Adobe used to use CUDA, circa the CS era (2006 or so), but I believe they are now straight OpenCL across platforms - not much difference between OSX and Windows on a Boot Camped Mac…

Do you mean GPU instead of CPU?

To me this looks like h264_videotoolbox is decoding your 1080p60 on the GPU?.. Or maybe you are just playing some video or something else on the GPU while h264_videotoolbox is not using it?

Boot Camp your Mac into Windows and see how QSV (or whatever the name of the Intel GPU acceleration in Shotcut is) loads up the poor Intel GPU way higher and, more importantly, keeps the load consistent through the encode.

Windows’ Task Manager is more detailed in my experience, but I think the OSX GPU utilization window shows it “alright” (based on when my MacBook fans spin up and spin down)…

Those sporadic spikes of GPU utilization in your screenshots are nothing like the consistent GPU utilization on Windows…

No. There are 2 monitors: one says GPU and the other CPU.

To me this looks like h264_videotoolbox is decoding your 1080p60 on the GPU?

No, that is not implemented. Read the FAQ. You asked for a link. It is at the top of this page when you scroll up all the way.

That puzzles me too!

VideoToolbox is faster than x264… by about 125%, and it uses less CPU. Yet it is slower than the NVENC/Intel GPU codecs on Windows… and way slower!

Adobe uses both CUDA and OpenCL depending on what plugins are used; some third-party plugins are still CUDA-only.

https://helpx.adobe.com/after-effects/using/basics-gpu-after-effects.html

That’s After Effects.

Adobe Media Encoder has not been using CUDA since the 2018 release, for sure… It uses OpenCL or Metal on OSX and OpenCL on Windows.

After a few more renders here is what I found:

VideoToolbox does use the GPU, but only occasionally. By “occasionally” I mean it depends a lot on the material and the filters.

What do I mean by “occasionally” and “it depends”, and how did I arrive at this conclusion? Let me share my experience.

I’ve been doing encodes on the same Hackintosh/Windows machine and on a MacBook Pro. Both computers have QSV and NVIDIA. On Windows I noticed that encodes of my 1080i material were failing a lot - it appears that NVENC doesn’t like 1080i material and errors out. Bizarrely enough, the same timeline could encode with NVENC on Windows at, say, 12 Mbps VBR but fail at, say, 40 Mbps VBR, while another timeline could encode at 40 Mbps but fail at 12 Mbps - it is very inconsistent in my experience, but that is a topic for another thread.

Based on my experience on Windows, I concluded that maybe NVIDIA just doesn’t like 1080i, even though I export it as 1080p30. BTW, exporting 1080i as 1080i just doesn’t work, nor do I care for it to work. So I started playing with 1080p60 material, and on Windows, 1080p60 utilizes the GPU about twice as much for both QSV and NVENC, i.e. in the 30%-40% range vs. 15%-20% for 1080i, when the 1080i source works at all.
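As an aside, one way to take the interlacing out of the equation before the hardware encoder ever sees it - a minimal sketch, assuming an ffmpeg build with h264_nvenc (filenames and bitrate are just examples):

    # yadif=0 emits one progressive frame per input frame (1080i -> 1080p at the same rate)
    ffmpeg -i input.mts -vf yadif=0 -c:v h264_nvenc -b:v 12M -c:a aac out.mp4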

So with 1080p60 I could see h264_videotoolbox utilizing the GPU (sometimes both GPUs, yay!) - but barely!

With no filters at all, my GPUs are barely utilized on 1080p60 and CPU utilization is in the 60%-80% range. With HQDN3D + Color Correction + Levels + Saturation, my CPU is maxed out and the GPU is barely utilized. Adding Stabilization to the same timeline (i.e. on top of the mentioned filters) slows things down 2 to 5 times, and my CPU stays at or below 50% utilization. The GPU is so barely utilized that it does not even show up in macOS Activity Monitor.

Conclusions:

  1. h264_videotoolbox is faster than libx264, but it barely uses the GPU.
  2. h264_videotoolbox is slower than either NVENC or QSV on Windows.
  3. Filters are a bigger bottleneck than encoding on the GPU.
  4. Somehow, neither NVENC nor h264_videotoolbox likes 1080i material.

Questions:

  1. Are all filters processed on the CPU only?
  2. Why is the Stabilization filter not threading enough (CPU utilization drops below 50%), and how could I make it use more threads?
  3. Why doesn’t NVENC like 1080i material?