TL;DR Don’t use MP3. Use our MJPEG preset, which outputs uncompressed audio.
It is complicated. You wrote “converting.” Does that mean you used Properties > Convert, View > Resources > Convert Selected, or Export > Export File? It makes a difference because Export runs through the edit engine, which is stricter and less flexible than the sequential transcode that Convert performs. The edit engine (MLT) works in units of video frames, whereas the transcoder (ffmpeg) has finer precision.
2-3 ms longer (in Shotcut)
Shotcut has frame precision, and the last field of its timecode displays is in frame units: hours:minutes:seconds:frames. I think you are seeing 2 or 3 frames, which is many more milliseconds. The next version of Shotcut has a configurable time display with an option to show milliseconds, but that does not change the fundamental frame-level precision.
In any case, let’s look at an example. I have a test.mp4 with an NTSC frame rate of 30000/1001 ~= 29.970030 fps. It has a duration of 00:02:27;27 in Shotcut’s Properties. The frame counter is 0-based; so, there are 28 frames after the last full second. 28 f * 1000.0 ms/s / (30000/1001 f/s) ~= 934.27 ms.
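As a quick check of the frame-to-millisecond arithmetic, here is a minimal sketch using exact rational numbers. The function name frames_to_ms is only illustrative; it is not part of Shotcut or MLT.

```python
from fractions import Fraction

# NTSC frame rate from the example file: 30000/1001 frames per second.
FPS = Fraction(30000, 1001)

def frames_to_ms(frames: int, fps: Fraction = FPS) -> float:
    """Return how long `frames` whole frames last, in milliseconds."""
    # frames / fps gives seconds; multiply by 1000 for milliseconds.
    return float(frames * 1000 / fps)

print(round(frames_to_ms(28), 2))  # 934.27
print(round(frames_to_ms(27), 2))  # 900.9
print(round(frames_to_ms(1), 2))   # 33.37, the length of a single frame
```

So a difference of 2 or 3 frames at this rate would be roughly 67 to 100 ms, not 2-3 ms.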
Now, if I run Properties > menu > More Information I can see that each stream/track in the multiplexed file has a unique duration:
video duration=0:02:27.914433
audio duration=0:02:27.925000
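To put the gap between those two stream durations in perspective, here is a small sketch that parses the H:MM:SS.ffffff strings and compares the difference against one frame at 30000/1001 fps. The helper to_seconds is a hypothetical name for illustration only.

```python
from fractions import Fraction

def to_seconds(stamp: str) -> Fraction:
    """Convert an 'H:MM:SS.ffffff' duration string into exact seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + Fraction(seconds)

video = to_seconds("0:02:27.914433")
audio = to_seconds("0:02:27.925000")

gap_ms = float((audio - video) * 1000)
frame_ms = float(Fraction(1001, 30000) * 1000)  # one frame at 30000/1001 fps

print(round(gap_ms, 3))   # 10.567
print(gap_ms < frame_ms)  # True: the streams differ by less than one frame
```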
Why are they different? I suspect it may have something to do with the fact that the AAC audio codec has 1024 samples in each of its packets of data. To make matters more interesting, this is not variable frame rate, and at this frame rate, 27 frames after the last full second is ~= 900.9 ms. Yet, both of the above durations fall somewhere between these two frame boundaries (900.9 ms and 934.27 ms). I do not know why that is for the video stream (start_time=0); that is deep within the FFmpeg libraries and depends on the codec.
Which duration will Shotcut use? Neither. Actually, if you scroll down further in More Information, there is a [format] section that represents the multiplex container:
duration=0:02:27.925000
MLT (and thereby Shotcut) uses the format duration, and converts it to a rounded frame count at the Video Mode frame rate.
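Here is a rough sketch of that conversion for the example file. It is not MLT's actual code: the round-to-nearest step and the standard 29.97 fps drop-frame formula used to render the timecode are my assumptions, and both function names are made up for illustration.

```python
from fractions import Fraction

FPS = Fraction(30000, 1001)  # Video Mode frame rate in this example

def to_frames(seconds: str, fps: Fraction = FPS) -> int:
    """Round a duration in seconds to a whole number of frames (assumed rounding)."""
    return round(Fraction(seconds) * fps)

def drop_frame_timecode(frame: int) -> str:
    """Standard 29.97 fps drop-frame label: two frame numbers are skipped
    at the start of every minute except each tenth minute."""
    tens, rest = divmod(frame, 17982)  # 17982 frames per 10 minutes
    skipped = 18 * tens                # 9 skipping minutes * 2 per block
    if rest > 2:
        skipped += 2 * ((rest - 2) // 1798)  # 1798 frames per skipping minute
    n = frame + skipped
    return f"{n // 108000:02d}:{n // 1800 % 60:02d}:{n // 30 % 60:02d};{n % 30:02d}"

frames = to_frames("147.925")       # format duration 0:02:27.925000
print(frames)                       # 4433
print(drop_frame_timecode(frames))  # 00:02:27;27
```

Under these assumptions the result agrees with the 00:02:27;27 shown in Properties.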
Now, I export using the MJPEG preset (PCM audio), open the result, and Properties gives me a duration of 00:02:27;27 (same).
Next, I change Export > Audio > Codec to libmp3lame and export. This shows a duration of 00:02:27;29 in Properties! More Information shows the audio stream duration=0:02:27.984. So, this is due to using the MP3 audio codec. MP3 uses 1152 samples per packet. But also, the encoder has something called codec delay, which means that when you initially send some number of samples in, you do not immediately get the same number of compressed samples out. The output is delayed by a certain number of samples, on the order of a few packets. Thus, at the end of the export, MLT must drain the encoder to get all of the data, and this might be a factor.
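To illustrate the packet arithmetic, here is a hedged sketch. The 48000 Hz sample rate is an assumption (I did not state the sample rate of the test file above), and the helper names are hypothetical.

```python
from fractions import Fraction

FPS = Fraction(30000, 1001)
SAMPLE_RATE = 48000  # assumption: the actual rate of the test file is not stated

def packet_ms(samples_per_packet: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration of one audio packet in milliseconds."""
    return float(Fraction(samples_per_packet * 1000, sample_rate))

def to_frames(seconds: str) -> int:
    """Round a duration in seconds to whole video frames (assumed rounding)."""
    return round(Fraction(seconds) * FPS)

# How many 1152-sample MP3 packets fit in the reported 0:02:27.984?
mp3_packets = Fraction("147.984") / Fraction(1152, SAMPLE_RATE)

print(packet_ms(1152))  # 24.0 ms per MP3 packet at 48 kHz
print(mp3_packets)      # 6166, a whole number of packets
print(to_frames("147.984") - to_frames("147.925"))  # 2 frames longer
```

If the sample rate really is 48 kHz, the MP3 duration is an exact multiple of the packet length, and the 59 ms of extra audio rounds to the 2 extra frames seen in Properties (;27 versus ;29).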