Seems like this question is recurring, but the last thread is from 2016.
Syncing video clips by aligning their audio tracks seems like a fairly simple automated procedure, but it is not among Shotcut's features. I looked at the roadmap and could not see that it is even planned. Would this not be low-hanging fruit? It would save a lot of time for dual-camera editing.
My first task on every new video is synchronizing the audio of two to five video tracks.
I find this a bit ironic, knowing that finding the synchronization is a relatively straightforward and simple (and, in some circles, well-known) algorithm. The algorithm is called cross-covariance; on my very first paid computer job more than fifty years ago (sub-contracting to NASA before NASA had put a man on the moon), I was running punchcard decks of satellite data through an IBM 7094 to synchronize data between satellites using a cross-covariance app.
However, if the data stream is long, it is a massive number-crunching operation: all very simple calculations, but gazillions of them. It is the kind of thing the parallel number-crunchers of a GPU do so efficiently.
Approached from any viewpoint other than cross-covariance, the synchronization problem looks like an insurmountable nightmare.
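To make the idea concrete, here is a minimal sketch of cross-covariance alignment. It assumes both clips' audio has already been decoded to mono NumPy arrays at the same sample rate; the function name is mine, not anything from Shotcut. The FFT trick turns the "gazillions of multiplications" into an O(n log n) job:

```python
import numpy as np

def find_offset(ref, other):
    """Return the lag (in samples) at which `other` best lines up
    inside `ref`; positive means `other` starts later in `ref`."""
    # Remove the means so this is a cross-covariance, not a raw correlation.
    ref = ref - ref.mean()
    other = other - other.mean()
    # Full linear correlation has len(ref) + len(other) - 1 lags;
    # pad to a power of two so the FFTs stay fast.
    n = len(ref) + len(other) - 1
    size = 1 << (n - 1).bit_length()
    # Correlation via FFT: O(n log n) instead of O(n^2) multiplications.
    spec = np.fft.rfft(ref, size) * np.conj(np.fft.rfft(other, size))
    corr = np.fft.irfft(spec, size)
    # Reorder the circular result into lags -(len(other)-1) .. len(ref)-1.
    corr = np.concatenate((corr[size - (len(other) - 1):], corr[:len(ref)]))
    return int(np.argmax(corr)) - (len(other) - 1)

# Toy check: a noise burst delayed by 500 samples is found at lag 500.
rng = np.random.default_rng(0)
sig = rng.standard_normal(8000)
delayed = np.concatenate((np.zeros(500), sig))
print(find_offset(delayed, sig))  # 500
```

At 48 kHz, two ten-minute tracks are tens of millions of samples each, which is why a GPU (or at least the FFT shortcut above) matters in practice.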
Wow, what you say about your work is exciting.
I don’t understand covariance or programming, and this request may be incompatible with the resources @Shotcut (Dan and developers) have at their disposal.
In the past, I saw how some users with programming knowledge contributed filters or frames for filters (I remember something about 360º filters).
Maybe you can help to implement this functionality in Shotcut.
I have already seen alignment of different videos by their audio, some time ago, in another free video editor, so in theory it is feasible for Shotcut as well.
Yes, if the developers are interested, I would be willing to help.
This morning I already developed a user interface design that I believe would be compatible with the UI engine being used. Documenting that interface design, however, is an hours-long endeavor which I will only undertake if interest is shown.
I cannot speak for the developers.
I hope they read this thread and evaluate this possibility.
There is a roadmap with the direction in which they want to go, but the world turns, and everything evolves.
Is it an interesting feature? Of course.
Does it interfere with other, perhaps more critical or important, features? Only the developers know.
I like the possibility, especially the precision it could produce.
However, in designing the user interface, my enthusiasm dampened somewhat.
In any case, this would need another “View” (those choices in the upper right that I never touch).
If there were an option for a one-click solution (well, almost one click), it would also be an “OK, I got that started; it will be a while. Do we want to go to Pizza Hut or Applebee’s while we wait?” type of operation. The computation needed to compare entire tracks of any length for the best match is immense.
The other option - cutting a small clip from the slave track, getting “in the ballpark”, setting up the scan, then rejoining after the right offset is found - might be more work than the visual-match method.