Suggested codecs/file formats for editing (speed, reduced lagging)

Hello Folks.

As far as I understand, Shotcut does not work with (reduced-resolution) proxy files for editing. With this in mind, I would like to ask a general question (further below) to which I could not yet find an answer.

In my case I want to edit two cameras (different angles) on two video tracks (1080p, 30/60 fps), plus one audio track.

Currently the preview (playing the videos in the editor) is very laggy. I am on Windows 8.1 64-bit with 8 GB RAM and an i7-4500U CPU, using Shotcut version 19.06.15.
My source files are H.264, and I am considering re-encoding them to another format before editing.
I have been using VideoStudio, which seems to handle these files better performance-wise, but VideoStudio is not an option for me at all: I have had way too many issues for paid software, and therefore I want to participate in the open source software movement, if possible.

So the general question: which file formats (audio and video) should I convert my source clips to before editing, to get the best performance and as little lag (CPU usage) as possible during editing?
Is ProRes suggested for video, or something else?

Am I maybe missing some settings that could increase performance while editing and playing the clips?

What are your codec suggestions for video and audio if the focus is on best performance (least CPU usage)?

See this Shotcut tutorial for working with proxy-files for editing.

Hi GIS, I’ve studied this question quite a bit lately. I tend to not use ProRes in ffmpeg-based programs because of encode/decode speed and because of file size. ProRes is a 10-bit format but Shotcut is a mostly 8-bit editor. So there is some disk space lost representing colors that may never be seen. The prores_ks encoder is also slower than other encoders by a factor of 3 on my computer.

These are the four codecs I would recommend, each for a different reason:

  1. HuffYUV (4:2:2 8-bit only) in an MKV container. This format has the least aggressive compression, meaning it decodes the fastest. Its subsampling is also Shotcut-native 4:2:2, meaning there is no CPU spent on conversion. The trade-off for this speed is massive file sizes. This is the format for ultimate performance and perfect color.

  2. DNxHR HQ (4:2:2 8-bit only) in a MOV container. This is Avid’s competitor to ProRes. It is also a very fast format that holds color very well. There is currently not a DNxHR export profile in Shotcut, but you can encode directly with ffmpeg by using -c:v dnxhd -profile:v dnxhr_hq and put the result in a MOV container. This format takes around 30% less space than HuffYUV with color fidelity that is visually (although not mathematically) identical. The 4:2:2 is good for holding edge sharpness at proxy resolutions if you are just editing and want to verify footage is in focus.

  3. Ut Video (using 4:2:0 8-bit mode) in an AVI container. Similar to HuffYUV. This setup gives you mathematically perfect color fidelity, but in a 4:2:0 mode. Combined with its more aggressive compression, it yields files that take around 30% less space than HuffYUV. It loses a little performance due to the conversion to 4:2:2 that is required, but it’s normally not noticeable. Since this format holds color with zero loss, it is useful for color grading on proxies and gives a very true representation of what the final grade will look like.

  4. H.264 All-Intra at CRF 12 (4:2:0 8-bit) for proxy resolutions 640x360 and less in an MP4 container. Higher proxy resolutions can use CRF 14. This format will obviously have compression artefacts and dancing noise and color problems, but the chosen CRFs keep those very much at bay for simple editing work. File sizes can be as little as 10% of an equivalent HuffYUV file, which makes this a great format for internal laptop hard drives.
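
As a rough illustration of options 2 and 4, here is a minimal Python sketch that only assembles ffmpeg argument lists (it does not run anything). The function names are made up for this example, audio options are left out on purpose, and picking the CRF by pixel count is my reading of "640x360 and less", not something the list spells out.

```python
from typing import List


def dnxhr_hq_args(src: str, dst: str) -> List[str]:
    """Argument list for a DNxHR HQ (4:2:2 8-bit) transcode; dst should be a .mov file."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "dnxhd", "-profile:v", "dnxhr_hq",  # encoder and profile named in option 2
        "-pix_fmt", "yuv422p",                      # 8-bit 4:2:2
        dst,
    ]


def h264_intra_proxy_args(src: str, dst: str,
                          width: int = 640, height: int = 360) -> List[str]:
    """Argument list for an all-intra H.264 proxy; dst should be a .mp4 file."""
    # Assumption: "640x360 and less" is judged by total pixel count.
    crf = 12 if width * height <= 640 * 360 else 14
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale={width}:{height}",
        "-c:v", "libx264",
        "-g", "1",                # GOP length 1: every frame is a keyframe (all-intra)
        "-crf", str(crf),
        "-pix_fmt", "yuv420p",    # 8-bit 4:2:0
        dst,
    ]


print(" ".join(dnxhr_hq_args("input.mp4", "proxy.mov")))
print(" ".join(h264_intra_proxy_args("input.mp4", "proxy.mp4")))
```

The returned lists can be handed to `subprocess.run(args, check=True)` once the printed command looks right.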

The container is important for all of these. If a video does not specify its colorspace, Shotcut has a heuristic that assumes BT.601 colors for any video whose width multiplied by height is less than 750,000 total pixels (which is essentially anything under 720p). You might have a 4K source video in BT.709, but if you make a 640x360 proxy with BT.709 colors, it will still get interpreted as BT.601 colors (due to being less than 720p) unless the stream or the container has a way of explicitly flagging the colors as BT.709. In particular, AVI does not have colorspace flags. The only reason Ut Video survives in an AVI is because it has a different FourCC code for each colorspace, and the FourCC itself flags the colors. For the other codecs, they rely on the container to hold the color flags. In the case of Matroska, you can get better performance if you encode through ffmpeg yourself and set -write_crc32 false. I’m not sure if a Shotcut export can do this directly.
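
The size heuristic described above is easy to check for a given proxy resolution. This is only a sketch of the rule as stated in this post (fewer than 750,000 pixels means BT.601), not Shotcut's actual code; the function and constant names are invented for the example.

```python
BT601_PIXEL_CUTOFF = 750_000  # threshold quoted in the post above


def assumed_matrix(width: int, height: int) -> str:
    """Guess the color matrix for a clip that carries no colorspace flag."""
    return "bt601" if width * height < BT601_PIXEL_CUTOFF else "bt709"


for w, h in [(640, 360), (1024, 576), (1280, 720), (3840, 2160)]:
    print(f"{w}x{h}: {w * h:>9,} pixels -> {assumed_matrix(w, h)}")
```

A 640x360 proxy lands on the BT.601 side of the cutoff, which is why an untagged BT.709 proxy at that size needs an explicit flag.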

TL;DR … Personally, I like the Ut Video format the best for its balance of file size and performance while keeping perfect color. That’s because I do lots of color grading. If you don’t grade, then DNxHR would be perfect for you. If you want it all and you’re above the 750,000-pixel cutoff, then you can do HuffYUV in an AVI (rather than the slower MKV) and have blazing performance if you have the disk space for it.

Hi @GIS and welcome to the forum!

I work with a lot of different file formats in Shotcut, and since I started filming with a new camera that saves its clips as “.mts”, I have noticed that this format is the quickest to edit.

Forgot to mention audio, sorry about that.

There are only three formats that I personally would bother considering. WAVE (pcm_s24le), Apple Lossless (alac), and AC-3 (ac3). Most of the other formats have codec delay. I also don’t like compressed formats like AAC and MP3 because they don’t always seek accurately and don’t seek as fast.

Personally, I use pcm_s24le for everything. Audio produces small files compared to video, so I don’t worry about the space it takes. The reason I use pcm_s24le even for proxies is because I may export all my speech tracks to another program for audio editing, and I want the flexibility to export from the “full resolution side” or the “proxy side” with equal sound quality. If that doesn’t sound like something you’ll be doing, then yes, you could save some space using ALAC or AC-3.
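
To put a number on "small files", here is a back-of-the-envelope calculation for uncompressed PCM. The 48 kHz rate matches the `-ar 48000` used later in this thread; two channels is an assumption, and the function name is just for this example.

```python
def pcm_bytes_per_second(sample_rate: int = 48_000, bits: int = 24,
                         channels: int = 2) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate * (bits // 8) * channels


rate = pcm_bytes_per_second()
print(f"{rate:,} bytes per second")
print(f"{rate * 3600 / 1e9:.2f} GB per hour")
```

That works out to roughly 1 GB per hour of stereo audio, which is small next to an hour of HuffYUV or DNxHR video.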

It might be good to update some of the formats/codecs in Shotcut’s Convert to Edit-friendly feature - at least for the truly lossless option, which is currently FFV1 with ALAC in MKV.
DNxHD was always too limited when it comes to supported resolutions, but I just tested 365x360 with dnxhr_hq, and it worked for me.
Do you have any command line option recommendations for Ut Video? What about color_range? We have to set that for the export preset, but if we need this on the ffmpeg command line, it needs to be adaptive.

I agree 10,000% and never used it for that very reason. Avid picked up on the industry discontent with its limitations and made the trait “resolution independent” a major selling point for DNxHR. The pitch works for me at least, because ProRes begins falling apart when asked to encode resolutions that are far outside its optimized targets. It’s sad that during testing I had to use ProRes 422 HQ for my tiny proxies because the Standard profile had artefacts trying to compress at resolutions it wasn’t designed for. DNxHR finally gets a lot of things right. And being 8-bit, it fits Shotcut perfectly (as well as most consumer camera codecs).

Yeah, this one’s a little funky. ffmpeg has a colorspace signaling bug that we have to work around. However, the workaround works 100% in all my testing. For the format I specified earlier, the command line is basically…

ffmpeg -loglevel verbose -i input.mp4 -map 0 -ignore_unknown -filter:v? idet,yadif=deint=interlaced:mode=send_frame,scale=out_color_matrix=bt709:out_range=mpeg:flags=accurate_rnd+full_chroma_inp+full_chroma_int:sws_dither=ed,format=yuv420p -filter:a? aresample=async=1:min_comp=0.001:min_hard_comp=0.1:first_pts=0 -pix_fmt yuv420p -color_range mpeg -colorspace 1 -color_primaries 1 -color_trc 1 -vsync cfr -codec:v? utvideo -pred left -codec:a? pcm_s24le -ar 48000 -sn -dn -f avi -max_muxing_queue_size 99999 -y output.avi

The -colorspace trio is what forces signaling and gets around the bug. Without it, ffmpeg always defaults colors to BT.601.

EDIT: I forgot that I recently added the “idet,yadif” part for proxies because it reduces the frame rate by half for interlaced material, which is a huge performance boost in proxy mode. It is also necessary for output codecs that do not support interlacing. Obviously, I would avoid “idet,yadif” when possible for intermediates, as those should mimic the originals as closely as possible.

I’m not saying this is the right or best solution, but I finally decided to force all transcodes to use whatever color range the output codec prefers. ProRes in a MOV prefers mpeg range. DNxHR can be either range. Ut Video in an AVI has to be mpeg range because there is no FourCC or metadata flags to indicate full range to a player. [Previous three sentences edited to be more specific.] So the upside is that I didn’t need adaptive color range in my ffmpeg commands at all, and conversion between everything has been silky smooth because everything is properly tagged and stored in a supported way.

EDIT: My proxy scripts, while not adaptive to the color range, are adaptive to the input color matrix. If the input file doesn’t specify a color matrix, it uses the same heuristic as Shotcut to keep colors in sync. If the input is specified as 601 or 709, then that is used in the transcode command.
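
For anyone who wants to do something similar, this is a hedged sketch of that selection logic, not my actual script. It assumes the input was probed with `ffprobe -v error -show_streams -of json input.mp4` and reads the `color_space`, `width` and `height` fields of the first video stream. The function name is invented, and treating any other tag (e.g. BT.2020) like an untagged file is a simplification.

```python
import json

BT601_PIXEL_CUTOFF = 750_000            # same cutoff as the Shotcut heuristic
BT601_NAMES = {"smpte170m", "bt470bg"}  # ffmpeg's names for the BT.601 matrices


def transcode_matrix(ffprobe_json: str) -> str:
    """Pick 'bt601' or 'bt709' for the first video stream in ffprobe's JSON output."""
    streams = json.loads(ffprobe_json)["streams"]
    video = next(s for s in streams if s.get("codec_type") == "video")
    tagged = video.get("color_space", "unknown")
    if tagged == "bt709":
        return "bt709"
    if tagged in BT601_NAMES:
        return "bt601"
    # Untagged (or anything else, which is a simplification): guess from frame size.
    pixels = video["width"] * video["height"]
    return "bt601" if pixels < BT601_PIXEL_CUTOFF else "bt709"
```

The result then decides which matrix goes into the scale filter and the tagging options of the transcode command.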

Wow, guys.

What a bunch of (great) information. Thank you. I got work to do…

While studying and evaluating these options to change the Convert and Reverse outputs, I am not comfortable with the full range handling with 8-bit DNxHR and UtVideo. Ideally, I want to preserve full range, and use 10-bit where not possible. Today, good MP4 preserves full range, ProRes is 10-bit, and FFV1 downscales to limited, which is really bad for claiming best.

I can make DNxHR output yuv422p10le if full range and yuv422p otherwise. I do like the speed of that for both encoding and decoding/seeking. For the best (true lossless) option, I do not find any ability to do full range or 10-bit for UtVideo or HuffYUV. I might switch to lossless x264 for that. On the other hand, I know some people would like an option that uses royalty-free codecs leading me towards 10-bit FFV1 if full range.
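
One way to see why 10-bit can stand in for full range: count the luma code values each format offers. This is plain arithmetic on the standard limited range bounds (16 to 235 at 8 bits, scaled up for higher bit depths), and the function name is only for illustration.

```python
def code_values(bits: int, full_range: bool) -> int:
    """Number of distinct luma code values available at a given bit depth."""
    if full_range:
        return 2 ** bits
    scale = 2 ** (bits - 8)        # limited range is defined at 8 bits and scaled up
    return (235 - 16) * scale + 1  # 16..235 at 8 bits, 64..940 at 10 bits


print(code_values(8, full_range=True))    # 8-bit full range
print(code_values(8, full_range=False))   # 8-bit limited range
print(code_values(10, full_range=False))  # 10-bit limited range
```

Limited range 10-bit has more code values than full range 8-bit, so a full range 8-bit source can be rescaled into it without two input values landing on the same output value.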

Is any of this going to mess up Shotcut’s color fidelity?

Shotcut’s color fidelity is almost perfect now.

This is more an issue of how ffmpeg/avconv handles “yuvj420p” than of utvideo or ffv1 per se.

Lossless codecs tend not to support “yuvj420p” (or any of the “j” derivatives); x264 and x265 are the exceptions. So a yuvj420p => yuv420p conversion means ffmpeg will compress YCbCr 0-255 => 16-235.
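
In arithmetic terms, the range conversion described above is a linear rescale rather than a hard clip. The sketch below uses the idealized 8-bit formulas (luma to 16-235, chroma to 16-240); ffmpeg's scaler uses its own fixed-point math, so treat this as an illustration of why the step is lossy, not as a model of its exact output.

```python
def full_to_limited(value: int, chroma: bool = False) -> int:
    """Map an 8-bit full range sample (0-255) into limited range."""
    span = 224 if chroma else 219  # 16-240 for Cb/Cr, 16-235 for Y
    return round(16 + value * span / 255)


def limited_to_full(value: int, chroma: bool = False) -> int:
    """Inverse mapping, clipped to the valid 8-bit range."""
    span = 224 if chroma else 219
    return min(255, max(0, round((value - 16) * 255 / span)))


# 256 input codes are squeezed into 220 output codes, so some must collide.
distinct = len({full_to_limited(v) for v in range(256)})
survivors = sum(limited_to_full(full_to_limited(v)) == v for v in range(256))
print(distinct, survivors)
```

Any input code that does not survive the round trip is where a "lossless" encode of full range material stops being lossless.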

If you have full range YUV input, flagged as full range or as “yuvj420p”, and you are trying to encode to ffv1 or utvideo, you need two things.

To tell ffmpeg full range in/out:

-vf scale=in_range=full:out_range=full 

To set the flags:

-color_range pc

e.g. (same for utvideo)

ffmpeg -i full_range_input.mp4 -vf scale=in_range=full:out_range=full -colorspace bt709 -color_range pc -c:v ffv1 -an -g 1 ffv1.mkv

The ffmpeg output is confirmed lossless, full range data, when compared in other programs

Range and matrix flags are set properly according to mediainfo, and ffmpeg/ffprobe reads it as full range 709 too, but Shotcut does not automatically set the range property to “full” upon importing; you have to set this manually in the properties panel. I suspect this is because the pixel format is “yuv420p”, not “yuvj420p”. Perhaps you can modify Shotcut’s import behaviour based on the other full range flag metadata if you decide to include ffv1 / utvideo as options.
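
To illustrate the suggestion, here is a small sketch of a detection rule that looks at the range flag first and only falls back to the pixel format name. It works on a dict shaped like one entry of ffprobe's `-show_streams -of json` output (where `color_range` is reported as `pc` or `tv`); it is not Shotcut or MLT code, and the function name is invented.

```python
def is_full_range(stream: dict) -> bool:
    """Decide whether a video stream (ffprobe-style dict) is full range."""
    color_range = stream.get("color_range")
    if color_range == "pc":   # explicit full range flag
        return True
    if color_range == "tv":   # explicit limited range flag
        return False
    # No usable flag: fall back to the "j" pixel formats.
    return stream.get("pix_fmt", "").startswith("yuvj")


print(is_full_range({"pix_fmt": "yuv420p", "color_range": "pc"}))
print(is_full_range({"pix_fmt": "yuvj420p"}))
print(is_full_range({"pix_fmt": "yuv420p"}))
```

The first case is the ffv1/utvideo file from the example above: flagged full range, but with a plain yuv420p pixel format.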

In light of your full range requirements…

HuffYUV as a codec simply compresses whatever numbers are thrown at it. It doesn’t do any clamping on its own. If the filter graph can just get full range data to it, it will store full range data verbatim. The other trick is the metadata, which means an MKV container will probably be required to carry the colorspace and color range flags. I echo what @pdr already said about using the full in+out scale filter.

DNxHR HQ looks like it can do this trick too, although I’m not sure if it’s “officially supported” by Avid or not.

Ut Video can do the same trick, but it is not a good candidate for full range video because its FourCC is specifically tagged for limited range. Decoding it should flag the contained video as limited range even if there are full range bits inside. Ut Video does have a 10-bit mode using the official codec suite since version 14, but the lagging ffmpeg implementation does not support 10-bit. As a possible fallback, what about using RGB mode to simulate full range?

There is one other option that I was hesitant to include because it isn’t widely recognized by other programs the way HuffYUV and Ut Video are. If you want support for every pixel format under the sun including 10-bit and alpha channels, ffvhuff may be your weapon of choice. (Check out ffmpeg -h encoder=ffvhuff.) It is a fast lossless Huffman coder just like HuffYUV and Ut Video. It has a context flag to set whether the Huffman tables are set up once for the whole video (fast but inefficient compression), or reconfigure for every frame (slightly slower but better compression). Compressing 4:2:2 with ffvhuff -context 0 is bit-for-bit identical to HuffYUV except the FourCC code tagged to the stream. Compressing with -context 1 is not bit-for-bit identical to Ut Video, but the resulting video is essentially the same file size as Ut Video because the codecs are using the same technique. So ffvhuff gets you the best of everything except widespread support in other video editors (which isn’t needed if staying inside the Shotcut/ffmpeg ecosystem). You can force full range video into it using the same HuffYUV hack described above. An MKV will probably be necessary for it too, and performance of MKV can be improved with -write_crc32 false.
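
A minimal sketch of how such an ffvhuff command could be put together, written as a Python helper that only builds the argument list. The helper name and its parameters are invented for the example; the ffmpeg options themselves are the ones mentioned in this thread, and the encoder options can be double-checked with ffmpeg -h encoder=ffvhuff.

```python
from typing import List


def ffvhuff_args(src: str, dst: str, per_frame_tables: bool = True,
                 full_range: bool = False) -> List[str]:
    """Argument list for an ffvhuff transcode; dst should be a .mkv file."""
    args = ["ffmpeg", "-i", src]
    if full_range:
        # Same full range pass-through used for the other lossless codecs here.
        args += ["-vf", "scale=in_range=full:out_range=full", "-color_range", "pc"]
    args += [
        "-c:v", "ffvhuff",
        "-context", "1" if per_frame_tables else "0",  # 1 = new Huffman tables per frame
        "-c:a", "pcm_s24le",
        "-write_crc32", "false",  # Matroska muxer option mentioned above
        dst,
    ]
    return args


print(" ".join(ffvhuff_args("input.mp4", "intermediate.mkv", full_range=True)))
```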

FFV1 can also be coerced into full range and yield noticeably smaller file sizes, but it is nowhere near as fast at decoding as the previous options. I can’t get real-time 4K playback from it even with a media player (meaning I can’t check an export for correctness), so it’s pretty useless to me at higher resolutions.

For my own education, how would a 10-bit file help with color range? Wouldn’t a limited range 10-bit file map to the limited range of an 8-bit file?

[quote]

Ut Video can do the same trick, but it is not a good candidate for full range video because its FourCC is specifically tagged for limited range. [/quote]

http://umezawa.dyndns.info/archive/utvideo/utvideo-20.0.0-readme.en.html
If you’re referring to the “readme”:
It’s not really a “limited range FourCC”. What that “internal format” category indicates is that if you use RGB input, it will convert to limited range YUV using BT.601 or BT.709 according to the chart. (You could argue that you shouldn’t rely on the codec to do the conversion; you should feed it the proper format in the first place.) Ideally you should use whatever your source is (4:2:0, 4:2:2, 4:4:4, etc.), since this is in the context of “lossless” compression, assuming the host program handles it properly.

For lossless codecs, the fastest decoding speed would be MagicYUV, but it’s not open source, and the ffmpeg variant is relatively new (read: possible bugs, hasn’t been “battle tested”). Next fastest would be Ut Video, which has been in use for many years: fast, stable.

HuffYUV comes in several variants, but the common ones do not support 4:2:0, which is still a common input format. Technically, upsampling 4:2:0 to 4:2:2 is not lossless unless you use a nearest neighbor algorithm to upsample and to downsample back to 4:2:0. HuffYUV is also slightly slower than the “newer” lossless codecs like Ut Video and MagicYUV, because the code base is quite old (~18 years). It does not use the SIMD instruction sets of newer processors, e.g. AVX2, AVX512, etc.
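
The nearest neighbor point can be shown on a toy chroma plane. This is a deliberately simplified sketch (vertical direction only, small integers, no chroma siting), and the interpolating filter here is a plain two-row average rather than anything ffmpeg actually uses; all names are invented for the example.

```python
from typing import List

Plane = List[List[int]]


def upsample_nearest(chroma: Plane) -> Plane:
    """4:2:0 -> 4:2:2: double the chroma height by repeating every row."""
    return [list(row) for row in chroma for _ in range(2)]


def downsample_nearest(chroma: Plane) -> Plane:
    """4:2:2 -> 4:2:0: keep every other row."""
    return [list(row) for row in chroma[::2]]


def upsample_linear(chroma: Plane) -> Plane:
    """4:2:0 -> 4:2:2: insert the average of each row and the row below it."""
    out: Plane = []
    for i, row in enumerate(chroma):
        below = chroma[min(i + 1, len(chroma) - 1)]
        out.append(list(row))
        out.append([(a + b) // 2 for a, b in zip(row, below)])
    return out


def downsample_average(chroma: Plane) -> Plane:
    """4:2:2 -> 4:2:0: average each pair of rows."""
    return [[(a + b) // 2 for a, b in zip(chroma[i], chroma[i + 1])]
            for i in range(0, len(chroma), 2)]


original = [[10, 200], [90, 40]]
print(downsample_nearest(upsample_nearest(original)) == original)
print(downsample_average(upsample_linear(original)) == original)
```

The nearest neighbor pair returns the original plane untouched, while the interpolating pair changes the first row.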

x264 in MP4 is another option, and it’s the only one that gets treated properly in other programs like Premiere. (Other YUV “lossless” codecs are not treated as lossless, because they get converted to RGB.) But even with --tune fastdecode --keyint 1, the decoding speed and latency are worse than with Ut Video, HuffYUV or MagicYUV.

In the context of my changes to Convert to Edit-friendly and Reverse presets, I do not want to change or coerce colorspace.

-vf scale=in_range=full:out_range=full -color_range pc

This is working in Matroska, but I was deceived by a bug in MLT/Shotcut not reading it (it only sees the yuvj pix_fmt currently). I will fix this. The 19.07 beta is already prepared, but I may still put the fix in the release.

I had considered it, but I want to give users an unencumbered option.

8-bit RGB < 8-bit YUV

https://software.intel.com/en-us/ipp-dev-reference-color-models#FIG6-4

Instead of explicitly declaring -colorspace bt709, I’ve had good luck with -colorspace 1 to simply force a tag in whatever the current colorspace of the filter graph output is.

That is the same thing as bt709. This option takes the following enum by number or name:

Details from FFmpeg source code
enum AVColorSpace {
    AVCOL_SPC_RGB         = 0,  ///< order of coefficients is actually GBR, also IEC 61966-2-1 (sRGB)
    AVCOL_SPC_BT709       = 1,  ///< also ITU-R BT1361 / IEC 61966-2-4 xvYCC709 / SMPTE RP177 Annex B
    AVCOL_SPC_UNSPECIFIED = 2,
    AVCOL_SPC_RESERVED    = 3,
    AVCOL_SPC_FCC         = 4,  ///< FCC Title 47 Code of Federal Regulations 73.682 (a)(20)
    AVCOL_SPC_BT470BG     = 5,  ///< also ITU-R BT601-6 625 / ITU-R BT1358 625 / ITU-R BT1700 625 PAL & SECAM / IEC 61966-2-4 xvYCC601
    AVCOL_SPC_SMPTE170M   = 6,  ///< also ITU-R BT601-6 525 / ITU-R BT1358 525 / ITU-R BT1700 NTSC
    AVCOL_SPC_SMPTE240M   = 7,  ///< functionally identical to above
    AVCOL_SPC_YCGCO       = 8,  ///< Used by Dirac / VC-2 and H.264 FRext, see ITU-T SG16
    AVCOL_SPC_YCOCG       = AVCOL_SPC_YCGCO,
    AVCOL_SPC_BT2020_NCL  = 9,  ///< ITU-R BT2020 non-constant luminance system
    AVCOL_SPC_BT2020_CL   = 10, ///< ITU-R BT2020 constant luminance system
    AVCOL_SPC_SMPTE2085   = 11, ///< SMPTE 2085, Y'D'zD'x
    AVCOL_SPC_CHROMA_DERIVED_NCL = 12, ///< Chromaticity-derived non-constant luminance system
    AVCOL_SPC_CHROMA_DERIVED_CL = 13, ///< Chromaticity-derived constant luminance system
    AVCOL_SPC_ICTCP       = 14, ///< ITU-R BT.2100-0, ICtCp
    AVCOL_SPC_NB                ///< Not part of ABI
};

static const char * const color_space_names[] = {
    [AVCOL_SPC_RGB] = "gbr",
    [AVCOL_SPC_BT709] = "bt709",
    [AVCOL_SPC_UNSPECIFIED] = "unknown",
    [AVCOL_SPC_RESERVED] = "reserved",
    [AVCOL_SPC_FCC] = "fcc",
    [AVCOL_SPC_BT470BG] = "bt470bg",
    [AVCOL_SPC_SMPTE170M] = "smpte170m",
    [AVCOL_SPC_SMPTE240M] = "smpte240m",
    [AVCOL_SPC_YCGCO] = "ycgco",
    [AVCOL_SPC_BT2020_NCL] = "bt2020nc",
    [AVCOL_SPC_BT2020_CL] = "bt2020c",
    [AVCOL_SPC_SMPTE2085] = "smpte2085",
    [AVCOL_SPC_CHROMA_DERIVED_NCL] = "chroma-derived-nc",
    [AVCOL_SPC_CHROMA_DERIVED_CL] = "chroma-derived-c",
    [AVCOL_SPC_ICTCP] = "ictcp",
};
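
To make the equivalence concrete, here is a small Python lookup built from the table above. It only mirrors the names listed there; the helper name is made up, and this is not how ffmpeg parses the option internally.

```python
COLOR_SPACE_NAMES = {
    0: "gbr", 1: "bt709", 2: "unknown", 3: "reserved", 4: "fcc",
    5: "bt470bg", 6: "smpte170m", 7: "smpte240m", 8: "ycgco",
    9: "bt2020nc", 10: "bt2020c", 11: "smpte2085",
    12: "chroma-derived-nc", 13: "chroma-derived-c", 14: "ictcp",
}


def colorspace_name(value: str) -> str:
    """Resolve a -colorspace argument given either as a number or as a name."""
    if value.isdigit():
        return COLOR_SPACE_NAMES[int(value)]
    if value in COLOR_SPACE_NAMES.values():
        return value
    raise ValueError(f"unknown colorspace: {value!r}")


print(colorspace_name("1") == colorspace_name("bt709"))
```

So -colorspace 1 and -colorspace bt709 end up as the same tag, and 5 or 6 would be the BT.601 variants.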

I can simply leave it out, and the color range selection still works.

OK, for the 19.07 release version, I was able to fix detection of color range from signaling independent of pix_fmt (e.g. container metadata). Then, in the Convert & Reverse transcode jobs, I removed the 10-bit output for full range DNxHR and changed FFV1 to Ut Video.

Thank you, I have always wondered what the “1” meant. That was the only part of my ffmpeg command that was still a black box to me.

I wanted to thank everyone for the very educational discussion here. There’s nothing like converting between video formats to make me feel smart one day and then like a total idiot the next day. :) I’m eager to see how the new Edit Friendly formats shake out, and very glad that Shotcut is determining color range based on more than yuv[J] now.
