I am currently using a notebook with an Intel Core i7-8750H. It has 6 cores, and they are apparently fully utilized when exporting videos. I would like to get a desktop PC with up to 16 cores (AMD?). Would all 16 cores be fully utilized, or does Shotcut have a limitation?
According to multicore benchmark tests, these CPUs are up to 5x faster.
I hope the export will also be 5x faster.
There are a lot of variables.
If using a CPU-heavy encoder like SVT-AV1, or libx265 on the veryslow preset, then it is likely all cores will be used by the encoder.
But if using a hardware encoder (which causes minimal CPU overhead), then using all cores is very dependent on the number and type of filters being used in Shotcut. Some filters scale to many cores very well. Other filters are single-threaded and will bottleneck the entire frame-building process, leaving the other cores idle. Parallel processing can help mitigate this.
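To put rough numbers on the single-threaded bottleneck described above, here is a minimal sketch using Amdahl's law. The parallel fractions are hypothetical values chosen for illustration; they are not measurements of Shotcut or of any particular filter.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions, for illustration only.
for fraction in (0.50, 0.90, 0.99):
    print(
        f"{fraction:.0%} parallel: "
        f"6 cores -> {amdahl_speedup(fraction, 6):.2f}x, "
        f"16 cores -> {amdahl_speedup(fraction, 16):.2f}x"
    )
```

The larger the single-threaded share of the frame-building work, the less the jump from 6 to 16 cores helps.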
If you provide details of your specific setup, we can estimate what might happen.
Which specific configuration do you mean?
I don’t use hardware encoders because the resulting files are larger.
I use libx264 and libx265 with best settings.
As filters I use video and audio fade in and out and GPS graphics.
Perfect. The newer processor with more cores should in theory benefit you significantly. It is very likely that all 16 cores would be used, especially if parallel processing is enabled on the Export panel.
It’s a bit complicated. What’s the difference between h.265 and libx265?
This is H.265 or HEVC.
libx265 is an encoding library for creating H.265/High Efficiency Video Coding (HEVC) video streams.
Another factor to consider is whether you have GPU effects enabled. Obviously, that shifts some of the load to the GPU, but it also introduces a potential bottleneck as Shotcut has to shuttle data between main system memory and the graphics card’s memory.
The results will vary depending on both hardware and what you’re doing (how many layers, what filters you’re using, etc.), but for what I do, which includes using hardware encoding, I find that my 8-core CPU is generally highly utilized if I have GPU effects disabled, while CPU utilization is much lower if I have GPU effects enabled - so for me, adding more cores might be useful in the former scenario but not likely in the latter.
Why did they give me the option to choose a library for the codec? Will the codec work with another library?
I have an AMD Ryzen 5 with six cores. Nothing I do while editing puts a real load on it. Only exporting with parallel processing gets it to about 70% load; I haven’t tried that mode without the hardware encoder.
Could the hard drive be the weak point? Sometimes nothing appears to be under load, yet everything slows down.
There is some kind of warning about GPU mode; I haven’t tried to enable it.
I know your original question only talks about export performance. But I also want to mention preview performance. Many people complain about poor preview performance which results in skipping frames or distorted audio while previewing.
The #1 way to improve preview performance is to have the fastest disk available. Typically NVMe.
The #2 way to improve preview performance is to have the fastest single core performance available.
When looking at CPU performance benchmarks, it is tempting to only compare the multicore performance. But the single core performance makes the biggest difference for preview in Shotcut. The CPU with the best single core performance may not have the highest multicore performance.
Thank you for listening to my Ted Talk
I currently use a Samsung 980 Pro and can only recommend it. I also always use proxy files with 360p for my 4K GoPro videos. The quality is absolutely acceptable and you can work well with it. But I currently need around 3 hours of export time for a 10-minute video.
I actually wanted an AMD CPU (AMD Ryzen 9 7900 ?), but I will also take another look at the Intel CPUs.
I would like a CPU that is powerful (single-core and multi-core), energy efficient, and cheap.
I tried but it got worse. Was this a joke?
I switched to Windows 10, which shows the graph better. It seems that my hard drive is still the problem. In this project, though, I worked with large files to get smooth fps: I encoded the intermediate files with the DNxHD codec. If I encoded them in H.264 instead, would the file size and the load on the hard drive decrease while the load on the processor increases? And what should I do to preserve the quality and still work normally?
It does not mean to limit the OS or application to only 1 core. It means the clock frequency is more important than CPU count, but there is a trade-off. It is good to have several cores but not so good to have very many low-speed ones. Do not expect the best result by focusing on CPU thread count.
I once read on Wikipedia that clock cycles are not the same from one processor to another (they differ in length), so an old processor with a higher clock frequency will not necessarily be faster than a new one with a lower clock frequency. AMD and Intel processors also have different clock cycles.
I think Wikipedia was referring to instruction sets, not clock cycles. An advanced instruction set is only faster if the application is coded to use it, or compiler-optimized to take advantage of it.
The general guidance about higher clock speed being better is still accurate.
Here is an example to demonstrate what I mean. Consider these two processors:
The Xeon processor appears to have better performance because it has twice as many cores and a much higher CPU Mark (which is a multicore metric). However, I would choose the Core i7 because it has a far superior Single Thread rating. Shotcut preview will work better with the i7 than the Xeon because the preview cannot take advantage of all the cores, but it can take advantage of the higher single-core performance.
Thank you very much!
Which website did you use?
I am currently using this website:
Here the multicore benchmark shows +260%, i.e. 3.6 times better.
According to this comparison, I am hoping for an approx. 3.6-fold reduction in export time.
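As a quick check of that arithmetic, here is a small sketch that converts a percentage gain into a multiplier and applies it to the roughly 3 hours of export time mentioned earlier in the thread. It assumes export time scales perfectly with the multicore benchmark score, which is a best case and not a prediction.

```python
def percent_gain_to_factor(percent_gain: float) -> float:
    """A gain of +260% means 1 + 2.6 = 3.6 times the original score."""
    return 1.0 + percent_gain / 100.0


def best_case_export_minutes(current_minutes: float, percent_gain: float) -> float:
    """Export time if it scaled perfectly with the benchmark (an assumption)."""
    return current_minutes / percent_gain_to_factor(percent_gain)


factor = percent_gain_to_factor(260)
minutes = best_case_export_minutes(180, 260)  # 180 minutes = 3 hours
print(f"+260% is a factor of {factor:.1f}")
print(f"3 hours would shrink to about {minutes:.0f} minutes at best")
```

Decoding, filters, and disk access do not all scale the same way as the benchmark, so the real export time would likely land somewhere above this figure.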
But in your example you are comparing a server CPU to a desktop CPU and according to my website the server CPU has better single core performance?
https://technical.city/en/cpu/Xeon-Gold-6312U-vs-Core-i7-12700K
Which CPU would you recommend?
I am currently more in favor of an AMD CPU as they are supposed to be more energy efficient for their performance.
No, it says exactly what I wrote, I know about instruction sets. Server processors have a minimal number of them and there are more real cores. I didn’t find that Wikipedia article, but here’s another one that says about the same thing.
" The clock rate alone is generally considered to be an inaccurate measure of performance when comparing different CPUs families. Software benchmarks are more useful. Clock rates can sometimes be misleading since the amount of work different CPUs can do in one cycle varies. For example, superscalar processors can execute more than one instruction per cycle (on average), yet it is not uncommon for them to do “less” in a clock cycle. In addition, subscalar CPUs or use of parallelism can also affect the performance of the computer regardless of clock rate."
Personally, I don’t develop processors, so I can’t recommend anything.
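The point of the quoted passage can be sketched with a toy model in which throughput is clock rate multiplied by the average number of instructions completed per cycle. Both processors below are hypothetical, and real CPUs depend on many more factors (caches, memory, instruction mix).

```python
def instructions_per_second(clock_ghz: float, instructions_per_cycle: float) -> float:
    """Toy throughput model: clock rate times average work done per cycle."""
    return clock_ghz * 1e9 * instructions_per_cycle


# Hypothetical processors, not real products.
older = instructions_per_second(clock_ghz=4.0, instructions_per_cycle=1.0)
newer = instructions_per_second(clock_ghz=3.0, instructions_per_cycle=2.0)

print(f"newer/older throughput ratio: {newer / older:.2f}")
```

In this model the lower-clocked processor still comes out ahead because it does more per cycle, which is why benchmarks are a better basis for comparison than clock rate alone.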
My processor has 14 cores and 20 threads.
I did a little experiment and here is the result:
libx264 (software codec). Few cores are loaded.
libx265 (software codec). All cores and threads are loaded.
hevc_qsv (Intel hardware codec). The processor uses about two cores; the main load is on the video card. It is the fastest of the three but also produces a large file.
Draw your own conclusion whether you need a lot of cores or not.
Thank you for sharing your test results. Readers, please be aware this test is really only exercising the encoding stage of export. There is also decoding and processing. Most of the video decoding and processing is multi-threaded. However, some filters may not be multi-threaded and can slow down the pipeline. In that case, turning on Export > Video > Parallel processing can help because then processing can work on up to 4 frames at the same time. Why only 4? Because that process is not optimal, especially with respect to memory cache. My testing revealed that beyond 4 there is little benefit, and decoding and encoding start to degrade.
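The decode, process, and encode stages described above can be pictured as a pipeline whose throughput is set by its slowest stage. The sketch below is a simplified model, not Shotcut's actual scheduler: the stage rates are hypothetical, and it assumes filter throughput scales linearly with the number of frames in flight, which the post above indicates is optimistic.

```python
MAX_PARALLEL_FRAMES = 4  # cap mentioned in the post above


def estimated_export_fps(decode_fps: float,
                         filter_fps_single_frame: float,
                         encode_fps: float,
                         parallel_processing: bool) -> float:
    """Throughput of a three-stage pipeline is limited by its slowest stage."""
    frames_in_flight = MAX_PARALLEL_FRAMES if parallel_processing else 1
    # Idealized assumption: filter throughput scales linearly with frames in flight.
    filter_fps = filter_fps_single_frame * frames_in_flight
    return min(decode_fps, filter_fps, encode_fps)


# Hypothetical stage rates in frames per second.
for enabled in (False, True):
    fps = estimated_export_fps(120.0, 10.0, 60.0, parallel_processing=enabled)
    print(f"parallel processing {'on' if enabled else 'off'}: about {fps:.0f} fps")
```

When the filter stage is the bottleneck, parallel processing lifts it; once another stage (here the encoder) becomes the slowest, more frames in flight no longer help.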
What is the preset= value in the Advanced > Other tab? I’m guessing fast? If it were veryslow, I would assume more cores could be used. Since the OP uses libx264 at “best quality settings”, I would expect veryslow or similar for their use case.