Built-in proxy generation

I completely forgot about that! You are using the UT codec at very low resolution. I’ve been messing around with the UT codec in Shotcut as full-scale lossless exports! :rofl: It does indeed make a difference in file size. I just tested a 1080p file I have that’s 12 gigs and exported it out of Shotcut with the UT Lossless preset, changing the resolution to 480x270 as you suggested and switching the audio codec from the pcm_s24le that the preset is set for to AC-3 at 128k bitrate, as Dan suggested. You’re right that the image and colors still hold up even at such a low resolution, but the file is still big: it came out to 40 gigs or so. That’s still much smaller than it would’ve been at full scale, although I have to say that the results at full scale are very interesting. I don’t remember if it was made clear in the other thread, but UT Video is also mathematically lossless like FFV1 and HuffYUV, right?
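For anyone who wants to reproduce that export outside Shotcut, a rough command-line equivalent might look like this (just a sketch; input.mp4 is a placeholder, and Ut Video usually goes into an MKV or AVI container):

# hypothetical equivalent of the Shotcut export described above
ffmpeg -i input.mp4 -vf scale=480:270 -c:v utvideo -c:a ac3 -b:a 128k proxy.mkv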

I imagine that if this proxy generator thing gets going, the best approach would probably be to offer two options: one for a regular proxy file that is very small in file size, and one for a lossless proxy using the UT codec. The choice should be there in case disk space is a concern.

I’ll test some more stuff out with the UT codec and share my thoughts in a day or so. I’m hardly a technical expert on this, so bear with me. :slight_smile: And thanks so much for taking the time out to explain all of this stuff. I’m learning a lot!

I think the idea of two proxy presets is brilliant. Ut Video is mathematically lossless, and using it at 480x270 could be a useful preset for people stacking a lot of tracks, which requires low resolution to composite everything in real time. (My personal stack record at 480x270 is 18 tracks.) The second preset could be 960x540 using H.264 All-Intra to save on disk space. I haven’t tested the lossy codecs as thoroughly for this second preset, so another one may turn out to be better, but H.264 All-I would be my starting point for research.
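As a rough sketch of what those two presets might boil down to in ffmpeg terms (the file names and exact settings here are illustrative, not what Shotcut would necessarily ship):

# preset 1: lossless Ut Video at 270p (sketch)
ffmpeg -i input.mp4 -vf scale=-2:270 -c:v utvideo -c:a ac3 -b:a 128k proxy_270.mkv
# preset 2: H.264 All-Intra at 540p (-g 1 makes every frame an I-frame)
ffmpeg -i input.mp4 -vf scale=-2:540 -c:v libx264 -g 1 -crf 15 -c:a ac3 -b:a 128k proxy_540.mp4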

(To be super technical, all that really matters is the height when deciding the proxy size. My ffmpeg scripts use -vf scale=-1:270, which means “video filter: scale to whatever width corresponds to a height of 270 pixels while preserving the aspect ratio”. This way, you always have 270 or 540 lines available to your preview window, but the width is free to jump around in case somebody brought in a video of a digitized 4:3 VHS tape or something. Not every source is guaranteed to be 16:9, so only the height is constant among proxies.)
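For instance, a minimal sketch (clip.mp4 is a placeholder; -2 instead of -1 rounds the width to the nearest even number, which encoders like libx264 require):

# height fixed at 270, width computed from the source aspect ratio
ffmpeg -i clip.mp4 -vf scale=-2:270 -c:v utvideo -an proxy.mkv

A 1920x1080 source comes out 480x270, while a 1440x1080 (4:3) source comes out 360x270.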

As a general principle, there isn’t much benefit from a 1080p proxy, especially lossless. The phrase “full-resolution proxy” is literally an oxymoron. :slight_smile: A full-resolution transcode, if lossless, is an intermediate rather than a proxy. A full-resolution transcode, if lossy, is not perfect anyway, so why overkill the resolution on such imperfection?

960x540 is a bit of a sweet spot for people editing on 1080p monitors (which is probably the majority of people at the moment). At 960x540, the video is slightly bigger than the Shotcut default preview window, which makes for a fast and sharp preview. It is also half of the monitor’s resolution when going to full-screen, so it will scale perfectly and still look decently sharp for editing work at full-screen. If the proxy was 1080p, the preview window would have to scale the video down to fit (which takes CPU), the compositor and all filters would have four times as many pixels to smash together (more CPU), and the full-screen preview would provide a level of detail that gains you nothing in terms of functionality over 540p. There is also the extra encoding time and disk space required to make the 1080p proxy. It’s extra work for no added benefit if using a 1080p monitor. Proxies by nature are supposed to be imperfect so they can be fast. Sensitive work like color grading can be done after switching back to the originals.

Granted, everyone’s preferences and unique project requirements are different, but that’s the general principle and a pretty good starting point.

Oh, almost forgot… using Ut Video…

4K source to 480x270 proxy = proxy at 40% disk space of source. The start resolution is massive and the proxy resolution is tiny, so the relative space percentage is small.

1080p source (25% of 4K) to 1080p proxy = proxy that takes more space than there are atoms in the known universe. The start resolution is small, the proxy resolution is high for a proxy, and it’s lossless, so the relative percentage is going to be well over 100% (as in, proxies will be significantly larger than the sources).
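To put rough numbers on that (a back-of-envelope estimate, assuming 8-bit 4:2:0 at 30 fps and a typical ~2:1 lossless compression ratio):

raw 1080p 4:2:0 = 1920 x 1080 x 1.5 bytes x 30 fps ≈ 93 MB/s ≈ 750 Mbps
lossless Ut Video ≈ half of raw ≈ 370 Mbps
typical 1080p H.264 source ≈ 10-20 Mbps

So a lossless 1080p “proxy” can easily run 20-30x the bitrate of its source.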

BTW, you qualify as a technical person if you can make sense out of any of the mumbo jumbo going on in here. :joy:

We may want to give the user a little more control over it than that, since some projects could still be 4:3 instead of 16:9, but I’m sure it would be possible to build a small table and give users a choice between “High quality” (H.264 All-Intra or something like ProRes) and “Lossless” (HuffYUV or Ut Video), depending on final details.
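Something like this, just as a sketch of the idea (codecs and resolutions borrowed from the suggestions earlier in the thread, not final):

Preset         Codec                         Resolution   Trade-off
High quality   H.264 All-Intra (or ProRes)   960x540      small files, visually clean
Lossless       HuffYUV or Ut Video           480x270      exact pixels, much bigger files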

If I wanted to do lossless editing of a 1080p or 4K video but save as much space as I could, does this make sense to you: make an FFV1 of the 1080p or 4K source, then use that very FFV1 to create a Ut 480x270 proxy? Sure, there’ll be lots of CPU usage because of FFV1, but if disk space was a concern, could that workflow be good for lossless editing of 1080p or 4K?

The 4:3 possibility was covered in the second paragraph. All proxies would be created in the same aspect ratios as their originals because the width is left variable and calculated at encoding time. The project timeline has no bearing on proxy resolution. A proxy’s job is to mimic its source, including aspect ratio. So using our current example for lack of a better idea, the proxy preset names could be something like “Proxy - 540p 29.97 fps” which could use H.264 All-I, and “Proxy - 270p 29.97 fps” which could use Ut Video. Neither preset name specifies a width because it’s both unknown and irrelevant.

If I’m tracking this right, you want to edit on 270p proxies. If that’s true, your edit is fast because you’re using proxies. So when it comes to your final render, why not use the original video? Why do you need a lossless FFV1 copy of your original? Converting to FFV1 will not bring any magic quality gains to your original video. The only time I can see that making sense is if you’re converting variable frame rate cell phone video into constant frame rate and you want it to be lossless. That would make sense. Although even in that scenario, a phone sensor is so bad that it may not be worth lossless preservation anyway. H.264 at CRF 18 or so would be less compression than the phone already put on that video.
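For that VFR-to-CFR case, a lossless conversion could look something like this (a sketch, assuming a 30 fps target and an input named phone.mp4; each kept frame is stored losslessly, though frames get duplicated or dropped to hit the constant rate):

# force constant 30 fps output, FFV1 version 3 video, audio passed through untouched
ffmpeg -i phone.mp4 -r 30 -c:v ffv1 -level 3 -c:a copy phone_cfr.mkv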

Doesn’t exporting from a source always go down a generation in quality? That’s why I was asking about lossless editing: making a lossless file, then exporting the finished work from that so as not to lose any quality.

If you’re working with proxies, they’re just a “placeholder”; ideally you still go back to the source when you export.

You are correct, exporting from a source does go down a generation in quality unless you’re exporting to a mathematically lossless format, or you’re exporting to a visually lossless format and you’re okay losing the details that only bumblebees can see. Visually lossless formats like ProRes are designed to pass through many generations without visible loss.

Just to make sure we’re not skipping any steps (I apologize if this is overbearing), there is no generation loss in reading the original video file. Reading it produces the same bitstream every single time. The loss happens when writing (exporting) a new video file into a lossy format. It’s the new file that has generation loss, not the rendering pipeline that led up to the new file. As in, the compositor and filters and color grades and previews would have access to all of the details that were in the original video files with no loss. The editing and computation phase is lossless. It’s the content of a lossy final file (if a lossy export format is chosen) where the breakdown happens.

It’s generally okay for the final render to be in a “good enough” lossy format because no further editing will be done and there’s no point burning disk and CPU on details the eye can’t see anyway. YouTube and Netflix want to crush out everything that doesn’t matter in order to save network bandwidth on streaming. Studios crush the final render to fit longer movies and more bonus content on a single Blu-ray disc and avoid needing a second disc.

So here is where I’m not tracking yet. When you make a lossless file from your source (basically an intermediate) and render your final against the lossless, how is that any higher quality than rendering against the source video directly? How would the lossless file produce color information that the original video file couldn’t? Without that, what does the lossless file bring to the party (outside of converting VFR to CFR)? It doesn’t provide editing speed because editing is against the proxies. It doesn’t provide quality any higher than the original (it’s not going to create more pixels or better colors unless you add filters), so why add the extra step? The lossless file itself was an “export” from the original video and there was no loss. So it’s equally okay to render direct against your source and the only loss that would happen is whatever is inherent to the export format you write to (or none if you export the final in a lossless format).

If I’m reading it all wrong and you’re talking about the final export going to a lossless file rather than each source video, then yes, exporting the final render to a lossless file to prevent generation loss makes total sense.

A shorter version since my last one was too complicated. :slight_smile: My apologies for a double post:

If the goal is to convert the source video from variable frame rate to constant frame rate (or any other necessary conversion because the source is unusable as-is), then yes, using an FFV1 intermediate to avoid loss yet save disk space is a perfect solution.

If the original source file is usable as-is, then adding an FFV1 intermediate buys you nothing extra. You can render directly against the original file and get the same level of quality. After all, the lossless intermediate got its quality from the original; so the final render can do the same.

@Austin, I was experimenting more with Ut and thought about bringing the resolution even lower than 480x270 to see what happens. I have done several exports with Ut at 360x204, both with 1080p and 4K sources. I found that the video still holds up surprisingly well. I love it because the file sizes are much smaller than 480x270 Ut. Unless I am missing something, the 360x204 resolution is the one I will use.

Have you tried it out at 360x204? If not, please do and let me know what you think. :slight_smile:

Since “proxy generation and editing” is in the road map, do you have any pointers as to when it will be implemented?

I agree proxy generation would be useful, plus a slider for LUT intensity.

Agree as well, even if it can be done quickly manually:

For %f In (*) Do ffmpeg.exe -i "%f" -vf scale=-1:270 -c:v libx264 -profile:v high -crf 12 -intra -tune film -preset veryfast -c:a aac -b:a 384k "..\Proxy\%f.mp4"
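A rough bash equivalent of that Windows one-liner, for anyone on Linux or macOS (same settings, except -2 instead of -1 so the scaled width always comes out even, and -g 1 in place of the deprecated -intra flag):

# assumes a sibling Proxy directory, as in the Windows version
for f in *; do ffmpeg -i "$f" -vf scale=-2:270 -c:v libx264 -profile:v high -crf 12 -g 1 -tune film -preset veryfast -c:a aac -b:a 384k "../Proxy/$f.mp4"; done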

There are dozens of tests I did while writing the “auto proxy creation” tool, and I reached a balance among speed, size, and quality for a “lossy proxy”. What I found:

- H.264 is the most stable codec. Although it is sometimes not the very fastest, it works well in all environments.
- 360p is a better choice. 270p misses too many details; for example, when doing colour grading you can’t see whether small items in dark areas show up or not. And 360p is very close to the actual preview window size on most 1080p monitors.
- Tuning with “fastdecode” eases playback CPU usage and makes the proxy a tiny bit smoother on lower-end computers. (Tuning with “film” increases video blocking, which drops the proxy quality.)
- GOP = 1 is pretty important for smooth forward/backward playback and instant jumps when clicking around the timeline.
- It is also said that AC-3 is a bit better than AAC.

My code was:

ffmpeg -i "%f" -vf scale=-1:360 -c:v libx264 -threads 0 -preset ultrafast -tune fastdecode -crf 15 -g 1 -c:a ac3 -y "..\Proxy\%f.mp4"

Tested in production and it worked well. If file size doesn’t bother you, you can reduce the CRF to 12 or use 450p.

Why use AC-3 instead of WAV?

Dan in an earlier post in this thread recommended AC-3 over AAC for audio proxies. Here’s the comment:

WAV files are great, but part of the aim of KKnBB’s proxy tool is to make good but small proxy files. WAV files, being uncompressed, will make the proxy files bigger.

Ahh, it’s been so long I forgot that small was part of his criteria. Big files are usually fine for me.

I’m curious, do you do all of your editing with HuffYUV in Shotcut? Does your computer have all the space for all the HuffYUV usage? I use an external hard drive for Ut Video and the like.

Not all of it (I’ve thrown some quick MP4s together on occasion without any serious editing), but yes, I do any serious editing as HuffYUV at 4K with WAV files for audio. As for space, it’s a big dual-socket workstation with a ton of drives in the system, including a 1TB NVMe just for current projects (alongside 2x10TB (RAID 1) spinners, 2x2TB (RAID 0) SSDs for fast storage, and a 500GB OS SSD). I typically edit the HuffYUV on the NVMe drive and export to the dual-SSD RAID 0 (as both FFV1 and YouTube format), with archived files ending up on the spinners (I don’t keep the HuffYUV, but I do keep the originals as well as the FFV1).