Audio Alignment Implementation

andrecaldas · November 1, 2021, 2:16pm

Hey, @DRM… thank you for the comment about the drift. My program can successfully detect the drift and the misalignment between my video and audio files. I think it is usable, and I hope it can be integrated into Shotcut soon. Thank you @brian and @shotcut for the support. Sorry for asking so many questions! I do not know anything about audio and video file and stream formats.

Problem I was having

It seems that my audio files were being processed differently by different “people”. I do think there should be a way to make all those consumers/producers deliver consistent results. Well… I do not know anything about audio or video file formats, and this is not what I want to discuss. So, I used ffmpeg to convert the audio and avoid the issue.

Audio format I was using

My audio is recorded by a cheap device I got on the internet.

First, the `file` utility reports this for my audio file:

$ file test.wav
test.wav: RIFF (little-endian) data, WAVE audio

The “properties” of test.wav shown by my file browser (nautilus, in Gnome) are (I am translating freely from Portuguese):

Container: WAV
Codec: DVI ADPCM
[…]
Sampling frequency: 48000 Hz
Bit rate: 385 kbps

Using `ffmpeg` to convert

After using ffmpeg…

$ ffmpeg -y -stats -i test.wav -v error -vn -ar 48000 test_48000.wav

The `file` command reports:

$ file test_48000.wav
test_48000.wav: RIFF (little-endian) data, WAVE audio, (censored M$) PCM, 16 bit, stereo 48000 Hz

And nautilus gives:

Container: WAV
Codec: WAV
[…]
Sampling frequency: 48000 Hz
Bit rate: 1536 kbps
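
The two bit rates can be checked with a little arithmetic. This is only a rough sketch: it assumes the original file is also stereo (only the converted file is confirmed as stereo by `file`) and that DVI ADPCM stores 4 bits per sample. The variable names are just for illustration.

```shell
# Uncompressed bit rate = sample rate x channels x bits per sample.
rate=48000
channels=2   # assumption for the original file; confirmed only for the converted one

pcm_kbps=$(( rate * channels * 16 / 1000 ))    # 16-bit PCM
adpcm_kbps=$(( rate * channels * 4 / 1000 ))   # 4-bit ADPCM payload, block headers ignored

echo "PCM:   ${pcm_kbps} kbps"
echo "ADPCM: ${adpcm_kbps} kbps"
```

The PCM figure matches the 1536 kbps shown for the converted file. The ADPCM figure comes out as 384 kbps. The extra 1 kbps in the 385 kbps reported for the original is presumably per-block header overhead, but that is a guess.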

Generating the audio and video combined `mlt` file

Now, when I use my undrifter and aligner program:

$ ./undrifted_and_aligned_xml.sh test.mp4 test_48000.wav > test.mlt

I get a perfectly aligned and drift-free test.mlt that I can insert as a clip in my projects.

In Shotcut, on the timeline, I do lose the nice graphical representation of the audio envelope, and also the video thumbnails.

If you want to test it

If you don’t have any “drifted” audio files, but you want to test, you can generate one using the atempo filter in ffmpeg:

$ ffmpeg -y -stats -i nice_file.wav -v error -vn -ar 48000 -filter:a "atempo=1.001" file_with_drift.wav

To fix this, the detected drift should be 1/1.001 (about 0.999). Probably, there is a way to cut the first second, so that you also have a misaligned file. If anyone knows how to do it, please tell me; I do not really need it, so I will not spend time looking for it.
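
To make the numbers concrete, here is a small sketch of the arithmetic for `atempo=1.001`. It also includes one possible way to drop the first second, using ffmpeg's `-ss` option before `-i`. The helper name `cut_first_second` and the output file name are hypothetical, and the helper has not been tried against the files discussed here.

```shell
# Factor that undoes "atempo=1.001": the inverse of the tempo.
tempo=1.001
correction=$(awk -v t="$tempo" 'BEGIN { printf "%.6f", 1 / t }')

# Seconds by which a one-hour recording ends up shorter after the speed-up.
drift_per_hour=$(awk -v t="$tempo" 'BEGIN { printf "%.3f", 3600 - 3600 / t }')

echo "correction factor: $correction"
echo "drift after one hour: $drift_per_hour s"

# Hypothetical helper (not part of the scripts above): skip the first
# second of the input so the result is also misaligned.
# It only needs ffmpeg when it is actually called.
cut_first_second() {
    ffmpeg -y -v error -ss 1 -i "$1" "$2"
}
# Example: cut_first_second file_with_drift.wav file_with_drift_and_offset.wav
```

So a tempo change of only 0.1% already gives about 3.6 seconds of drift per hour, which is easy to notice on long recordings.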