Hi all, I’ve posted before about some possible performance issues, but I’m really not sure how that can be the case since I have a pretty high-end machine.
I’m running an AMD system with
5900x
32GB DDR4-3600, with usage so far averaging between 30 and 45% and Shotcut using about 17%
RTX 3080
2TB 980 pro PCIe gen 4 NVMe SSD
360mm AIO; while Shotcut definitely pushes my temps higher than even stress tests do, they still max out in the low 70s, which is odd since the total load is only about 25% and Shotcut is using about 18%
Shotcut version 21.10.31
I’ve had problems where clips won’t resize, but now I’m having an issue with a short video with two tracks: if I try to play it with a filter on one of the tracks (I’m trying to do side-by-side), it plays like crap.
Having a high-end machine does not guarantee high-end performance, for many reasons. See this thread regarding export performance. For preview performance, use proxies and preview scaling to lower the total amount of math the processor has to do. If the source videos are slow to decode, like H.265, they may need to be converted to an edit-friendly format as well.
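If you’re curious what “edit-friendly” means in practice, here is a rough sketch of that kind of conversion done by calling FFmpeg from Python. The file names and exact encoder settings are my assumptions for illustration; Shotcut’s own converter may choose different parameters:

```python
import subprocess

# Hypothetical file names for illustration; Shotcut's "Convert to
# Edit-friendly" dialog does this kind of conversion for you.
SRC = "clip_4k60.mp4"
DST = "clip_4k60 - converted.mp4"

# -g 1 makes every frame an I-frame (keyframe), so the editor can seek
# to any frame without decoding a run of prior frames. -crf 15 keeps
# quality high at the cost of very large files. Assumed settings, not
# necessarily Shotcut's exact ones.
subprocess.run([
    "ffmpeg", "-i", SRC,
    "-c:v", "libx264",   # H.264: widely supported and fast to decode
    "-g", "1",           # GOP size 1 = I-frame only
    "-crf", "15",        # near-lossless quality
    "-c:a", "aac",
    DST,
], check=True)
```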
I can’t even imagine a 12-core processor having problems with such a little amount of editing. Are the position and crop filters single-threaded? Would this also cause the problem I was having simply resizing my split clips? I tried preview scaling but didn’t see a difference; the documentation implies there is overhead there as well, so on its own it might not do much. I know I was editing on a Phenom II with 8GB of RAM without having too many problems with 1080p 60fps. I’m also already using h.264 with I-frame only. Now I’m having a problem with it allowing me to move an audio clip from one track to another, so batting a thousand here.
Is this 4K footage? What codec? I see they have “converted” in the name; did you use one of the higher-quality options on the slider? The read speed of your HDD might be too low, as the edit-friendly converted files are huge if they’re 4K. I see the issue appears when you have the two videos side by side, so that’s my best guess.
Show us the Media Info for the files, then open Task Manager and check the HDD/SSD/(network drive?) active time and read speed while the files are being read.
Another thing: you should try changing the Settings → Display method and check them all out. I have a worse CPU than you (Ryzen 7 2700X) and an old GPU (GTX 970) and can have two 4K30 videos side by side with better playback than what you’re showing (of course it’s not smooth; I’d say 5-10 fps).
Also, make sure Settings → Realtime (frame dropping) is enabled. You probably don’t want this disabled (unless you have a very good reason to).
It is 4K 60fps, h.264 I-frame only (which is the “good” conversion option). At first I thought that was bad, but someone on here educated me on I-frame only making it a good codec to edit with.
As for my SSD, I have a 2TB 980 Pro, which is PCIe Gen 4 NVMe; it doesn’t get much faster than that.
I have DirectX for the display method, which I believe was the default and what I found in the forums seemed to indicate that is what I should be using with my setup.
Other way around. Disable real-time frame dropping. Also check that the Settings > Interpolation mode is bilinear instead of bicubic or Lanczos. Those are slow scalers for preview.
4K60 is simply a lot of pixels. Using Preview Scaling on its own might actually help, where the overhead is worth it. There’s also the proxy option. Also, we can reasonably assume that the Size Position & Rotate filter isn’t going to scale much beyond eight threads, if that.
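To put a rough number on the preview-scaling idea (simple arithmetic, not a benchmark):

```python
# Pixels per frame at full project resolution vs. a 720p preview.
full_frame = 3840 * 2160     # 8,294,400 pixels
preview_720 = 1280 * 720     # 921,600 pixels
print(full_frame / preview_720)  # 9.0 -> 1/9th the pixels to filter, composite, scale
```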
OK, I will disable realtime, but since the audio was choppy I’m not sure how much it will help…
…after writing that sentence, I went to the documentation page for realtime, and it does mention that OFF should be used when you want to see if multiple threads will make things better, which is unintuitive to me, as I don’t understand why realtime would NOT use multiple threads. I’ve had realtime on this whole time because a lot of tutorials out there mention it as a solution for choppy playback. The documentation specifically says Shotcut will use up to 4 background threads (not sure why only 4) to render 4K video when realtime is OFF; I’m assuming that means it is single-threaded when realtime is on, but again, I’m not sure why that would be the case.
From a technical/software side this is getting quite interesting…
Anyway to continue…
Bilinear is already enabled, and I will start my next video with preview scaling set to 720, since I’m working with 4K source on a 1440p ultrawide (assuming I’m understanding the documentation correctly).
“scale much beyond 8 threads”: I would hope 8 threads would be enough, but clearly from a technical side I’m working off best guesses and have a lot to learn. I’m curious how you arrived at the value of 8?
Proxy is still on the table, but I admit I’ve been trying to avoid it; again, I’ve just assumed from my higher-end PC, past experience, and what others have said that I shouldn’t be having too much trouble working with my unaltered (beyond VFR-to-CFR conversion) source. But also, if I’m reading correctly, I will have to go through two conversion steps: one to convert to edit-friendly, then another to create a proxy. And then make sure I don’t screw up the export and just export the proxy.
But it is not unaltered. The Size Position & Rotate filter is heavy, and when it is done, the video from the two tracks must be composited. The compositor does not understand that it could simply draw half of one video on top of the side of the other. Rather, the effects being used are very flexible to allow much more expression, and there is no way to tell them to do the simpler, faster thing.
Not trying to be condescending at all here, but this is not a little amount of editing. There are two tracks, which means a compositor must be invoked. There is an SPR filter involved, which has limits to how much it can thread before the overhead is counterproductive, and scaling is a pretty slow operation in the first place. And you’re trying to do all of this at 4K60. Multiply decoding overhead × 3840 × 2160 × 3 pixel components × 60fps × YUV/RGB conversions × 2 tracks × compositing overhead × scaling overhead × another scale to preview resolution, and you can easily see that the number of math operations will exceed what your processor can do per second. It’s even worse if any part of the chain has to be single-threaded.
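As a rough illustration of that multiplication (only the resolution, component, frame-rate, and track numbers are real; all the per-pixel overhead factors are ignored, so the real workload is several times larger):

```python
# Minimum pixel data the CPU must touch per second for this timeline,
# ignoring decode, YUV<->RGB, filter, compositing, and scaling costs,
# each of which touches every pixel again.
width, height = 3840, 2160
components = 3   # e.g. Y, U, V per pixel
fps = 60
tracks = 2

bytes_per_second = width * height * components * fps * tracks
print(f"{bytes_per_second / 1e9:.1f} GB/s")  # ~3.0 GB/s of raw pixel data
```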
The reason Resolve and other editors can do real-time 4K60 is because the CPU is only responsible for decoding the video files. Then the frames are moved into the GPU for effects to be applied, where there are many more specialized cores available for those processes. But Shotcut is not using GPU for effects. This means the CPU has to do all the math. And that’s going to be slower than a GPU-based workflow.
However, a CPU-based editor (especially if using proxies) has far lower system requirements compared to the expensive GPUs that Resolve requires to even start up, meaning people can get into editing at a much lower price point in exchange for patience. This is the trade-off between the two designs.
You’ve answered me a few times, and I don’t find you condescending at all. I’m just surprised my decade-old PC could handle 1080p that much better than a new high-end one can handle 4K. The scaling hasn’t matched my expectations, but your explanation makes sense, and I appreciate you actually taking the time to understand my thinking in order to properly correct it. It doesn’t answer my resizing issue, though, as that was a single track with no filters. So basically I’ve been looking for something to be wrong in my settings somewhere.
That being said, I tried disabling realtime, which didn’t fix anything. Does preview scaling not really help that much because of its own cost? Setting it to 720 didn’t make a difference either.
I see there is an option for a hardware encoder, but it is under proxy, so I assume it has nothing to do with the player.
Has there ever been any talk of Shotcut having an option to bring the GPU in, or is the framework just not there?
Quoting myself to correct this: I retested with two 4K60 videos, and indeed I get basically the same playback speed as your recording. It’s probably too much for the CPU to process fast enough.
My guess is limited multithreading support, as I don’t see my CPU go to 100%; if only (let’s say) 4 cores are doing work, it doesn’t matter whether it’s 4 out of 16 or 4 out of 24. Core frequency is probably a bit in your favour, but something like 10-20% extra power just means +1, maybe +2 frames per second here, so if you also factor in the original video differences and other random stuff (temperature/screen recording), we end up with similar playback rates in this scenario.
If the old PC was processing 1080p 30fps footage, then it was doing only 1/8th the work of 4K60. That’s a big difference.
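That 1/8th figure is just the ratio of pixels per second between the two workloads:

```python
# Pixel throughput: 4K60 vs. 1080p30
uhd60 = 3840 * 2160 * 60   # 497,664,000 pixels/s
fhd30 = 1920 * 1080 * 30   #  62,208,000 pixels/s
print(uhd60 / fhd30)       # 8.0
```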
I’m also guessing there isn’t a huge clock frequency difference between your old PC and new PC, that it’s mainly the core count that changed. For single-threaded or low-threading parts of code, that means the old and new PCs are progressing at the same speed because the clocks are (nearly) the same. The core count advantage means little if Shotcut can’t actually thread due to a filter bottleneck.
I’m not sure I understand this part. Where’s the resizing problem if there are no filters? Do you mean no filters besides a single SPR filter?
The effectiveness of this setting can be influenced by many things, such as hardware type and source video codec. However, is there still no change after toggling this option then restarting Shotcut?
What about even smaller? On a 1440p screen, the preview window may not even be 720 pixels tall, so there’s no reason to build frames that are larger than what can be seen. Maybe 540 would be faster? 540 sounds only a little shorter than 720, but it’s 44% fewer total pixels.
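The 44% comes from pixel count falling with the square of the scale factor:

```python
# 1280x720 vs. 960x540 preview frames (16:9 in both cases)
p720 = 1280 * 720       # 921,600 pixels
p540 = 960 * 540        # 518,400 pixels
print(1 - p540 / p720)  # 0.4375 -> ~44% fewer pixels per frame
```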
This has been discussed in the past. (I am not a developer; I’m just relaying past conversations in the forum.) The main issue is that once a frame enters the GPU, all processing from then on needs to stay in GPU because it is expensive to move a frame back and forth between CPU and GPU. This effectively means the entire framework needs to support GPU if going that route. A major problem here is that the bulk of effects processing is actually done by FFmpeg, but FFmpeg’s filters are not written for GPU. Somebody would have to go through and rewrite them as GPU equivalents, which is a huge labor effort. Then comes the issue of which GPU hardware gets supported and which hardware doesn’t, and the increased system requirements on everybody. It’s a complex topic.
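To give a feel for why those round-trips are expensive, consider the raw size of the frames that would have to cross the bus (a sketch assuming uncompressed 8-bit RGBA frames; real pipelines vary):

```python
# One uncompressed 4K frame at 4 bytes per pixel (8-bit RGBA)
frame_bytes = 3840 * 2160 * 4      # ~33 MB per frame
stream = frame_bytes * 60          # at 60 fps
print(f"{stream / 1e9:.1f} GB/s")  # ~2.0 GB/s each way, per track
```

Every filter that bounced a frame from GPU back to CPU and out again would pay that cost twice, which is why the processing has to stay on one side once it starts there.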