White balance performance

I recently ran a few projects with the White balance adjusted(my camera got reset and was on the wrong temperature) and I’m just trying to understand some performance changes i’m seeing here, a normal project for my typical workflow is roughly half realtime on export (17 minutes or so on a 10 minute project I did yesterday) however this particlar filter brings things down to 4 hours on a 20 minute project(that takes roughly 35 without the white balance adjusted) I understand I can use hardware encoding to accelerate the process(it looks like i’m estimating an hour and 50 minutes using the gpu) but I normally leave it off(if I export projects from different systems I get slightly different results using the hardware encoders but the cpu is more consistent)

Is the white balance filter just that hard on the cpu? Or is it one of the single threaded filters? Or am I missing something else enterily from the performance question.

This filter and its integration in MLT is missing an important optimization that prevents it from recomputing its internal LUT twice on every frame!

MLT calls this f0r_set_param_value() function every frame because the values may have changed due to user interactivity or keyframes. This optimization is easy to make in frei0r for the next version.

1 Like

On my 8-core Linux workstation, a 57sec 4K@29.97 fps video took 1:43 to encode using hardware encoder (trying to leave out competition for CPU resources).
Adding the white balance filter without parallel processing increased it to 2:26.
Enabling parallel processing reduced it to 1:40.
Adding the optimization did not change it from 1:40.

On my 8-core/16-thread Windows machine, the same clip took 0:40 to encoder using hardware encoder. (The NVENC encoder is faster than the Intel one I had on Linux.)
Adding the filter without parallel processing increased it to 3:25. I guess the older CPU and memory I have on Windows made it slower. Also, it was reading from a network drive.
Enabling parallel processing reduced it to 1:52.
I confirmed it is using the SSE2-optimized path on Windows.

1 Like

God damn I must have unchecked that when I was testing something for someone else, that brought the estimated time on my export to 1:50(one hour 50 minutes) with it loading my cpu at 50%(so all 12 real cores HT isn’t helping) still a huge increase in time but not nearly as terrible as before(shotcut did gobble up 10Gb of ram which puts a smile on my face too though) I’ll tinker around with nvenc later and see if that improves things further.

Was working on a video yesterday and my main issue was editing with the whitebalance on. It was skipping two or three frames so I turned it off and figured I would enable afterwards. Didn’t even think about it and had some transitions in place and rather than messing trying to setup changes for them as well I just said forget it. Worked on my second video which was just under 9 minutes today and had a much better understanding of it all since I am new to ShotCut. Got everything put together and added whitebalance with a few other things like Hue/Saturation, Brightness, and Contrast. Anyway with NVENC the encode time was about 7 minutes 30 seconds on my system. CPU was about 50% utilization and GPU was about 15% utilization during encoding.

Specs:
Ryzen 1600 with 4.1Ghz all core OC
32GB DDR4 at 3000Mhz
Gigabyte Aorus GTX1080 Core Clock 2050Mhz
1TB NVME drive - swap disabled, 1TB 5400RPM HDD for temp files, swap enabled, 14GB RAM drive for temp and swap. All files edited were held on FreeNAS pool served with 10Gb fiber-optic, pool 7 X 4TB RaidZ3 total user space 14TB with approx 3.5 TB freespace.
Windows 10 OS

Nice system!

Thanks, I built it after my wife passed away with some cash left over from her burial policy. Every time I sit down at it I get to think of her.

The plan is to jump to the 3900 or something similar in the 4000 series when that drops if they do actually support the X470 boards.