mlt_image_format=yuv444p10 ?

I have been given a bunch of HDR10+ phone recordings and needed to include them together with a couple of SDR gameplay videos. This has forced me to learn many things over the past week, starting with the differences between MLT and FFmpeg, but the more answers I get, the more questions arise.

First: I noticed Kdenlive defaulted to mlt_image_format=rgba for all projects. Shotcut apparently does that only when a 10-bit export option is selected. Various searches (most of them pointing to this forum) indicated that Shotcut can change its internal processing image format (which is MLT's, I believe) according to the presence or absence of certain effects (please feel free to correct me). Then on GitHub, in one of the mltframework changelogs, I noticed two new options (a sketch of how I set them follows the list):

  • mlt_image_yuv420p10
  • mlt_image_yuv444p10
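For reference, this is roughly how I set the property on the avformat consumer from the command line. It is a simplified sketch: the file names are placeholders and the encoder options only stand in for my actual DNxHR export settings.

    # sketch: placeholder file names; DNxHR encoder options simplified
    melt input.mp4 -consumer avformat:out.mov \
      mlt_image_format=yuv444p10 vcodec=dnxhd pix_fmt=yuv422p10le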

In my experiments, a couple of things got my attention:

  1. If input = output, no scaling, no FX, just H.264 to DNxHR: things are fine.
  2. The same as above with mlt_image_format=rgb/rgba: things are fine.
  3. The same as above with mlt_image_format=<either of the 10-bit formats above>: things are broken (the video is just green stripes, like a terrible RoboCop vision).
  4. Input + Lanczos upscale to 4K with mlt_image_format=<either of the 10-bit formats above>: things are broken.
  5. Input + a neutral FX (Levels with no change) with mlt_image_format=<either of the 10-bit formats above>: things are fine again.
  6. All of these results produce files of different sizes than the ones created with an FFmpeg command (sketched below), tuned for:
  • fixed 60 fps with the fps_mode and -r parameters
  • scaled with the same filter, Lanczos
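For comparison, the FFmpeg command I am matching against looks roughly like this. It is a sketch, not the exact command: the resolution, the DNxHR profile, and the file names are placeholders for my real settings.

    # sketch: placeholder names and profile; same fps and Lanczos settings as above
    ffmpeg -i input.mp4 \
      -r 60 -fps_mode cfr \
      -vf "scale=3840:2160:flags=lanczos" \
      -c:v dnxhd -profile:v dnxhr_hqx -pix_fmt yuv422p10le \
      output.mov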

I would love to learn why these videos are broken with 10-bit yuv444 internal processing, why a neutral effect fixes them, and why they differ from the FFmpeg-generated ones.

Additional info: PSNR and SSIM scores are highest with the FFmpeg-generated file and with yuv444p10.
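(I measured these with FFmpeg's psnr and ssim filters, roughly as below; the file names are placeholders, and both inputs need the same resolution and frame rate for the comparison to be meaningful.)

    # sketch: placeholder file names
    ffmpeg -i encoded.mov -i reference.mp4 -lavfi psnr -f null -
    ffmpeg -i encoded.mov -i reference.mp4 -lavfi ssim -f null -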

These are only intended to be used with GPU Effects, which is the only true 10-bit pipeline in MLT. But some people want 10-bit output with the 8-bit CPU pipeline, and rgb(a) is generally better for that since all of the 8-bit YUV options are chroma-subsampled and might also be limited color range (effectively < 8-bit). There is no AVFrame passthrough option. Not everything comes from and is done through FFmpeg.
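As a rough illustration of that approach (a sketch, not an exact preset; the file names and DNxHR options are placeholders), the CPU pipeline can stay in rgba while the encoder still writes 10-bit:

    # sketch: rgba internal processing, 10-bit pixel format at the encoder
    melt project.mlt -consumer avformat:out.mov \
      mlt_image_format=rgba vcodec=dnxhd pix_fmt=yuv422p10le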