After many sessions of editing audio, it has been made clear to me that simply splitting audio will add pops/clicks. This is not limited to the preview. The exported file will also have these pops/clicks.
I have prepared a demo that I uploaded to mega. You can click on the link and simply preview the demo. You don’t have to download it. Demo.
The mp3 I used is called “Gaiety in the Golden Age” by Aaron Kenny and you can find it in the youtube audio library.
In my demo, I play a portion of an mp3 several times so that you can hear that section play as it should sound. Then I simply split the clip and bring the playhead back once again to play it several more times. You should hear the pop/click that is now generated every time it passes through that split part.
I should also point out that in my demo you can hear a weird stutter sound every time I press the mouse button to bring the playhead back to an earlier spot before playing it again. I don’t know if that’s normal and something that can be fixed but I thought I should also point that out.
If you want to hear the mp3 exported from Shotcut of the very portion I play in the demo above, where you can hear that the pop/click is present in the exported file too, here is that link.
That is true independent of the software you are using. Splitting audio causes pops/clicks unless you make the cut at a zero crossing or you apply a short fade in. Not a bug. Just a fact.
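For anyone who wants to see what "cutting at a zero crossing" means in practice, here is a minimal sketch in Python. The function name `nearest_zero_crossing` and the plain-list sample representation are assumptions made for illustration; this is not how Shotcut or any other editor implements it.

```python
def nearest_zero_crossing(samples, index):
    """Return the sample index closest to `index` where the signal crosses zero.

    Position i counts as a crossing when samples[i] is exactly zero or when
    samples[i - 1] and samples[i] have opposite signs. Returns None if the
    signal never crosses zero.
    """
    def is_crossing(i):
        if i <= 0 or i >= len(samples):
            return False
        return samples[i] == 0 or (samples[i - 1] < 0) != (samples[i] < 0)

    # Search outward from the requested cut point, one sample at a time.
    for offset in range(len(samples)):
        for candidate in (index - offset, index + offset):
            if is_crossing(candidate):
                return candidate
    return None


# Hypothetical sample values: the waveform changes sign at indices 3 and 7.
samples = [3, 2, 1, -1, -2, -3, -2, 1, 2]
print(nearest_zero_crossing(samples, 5))  # 3
print(nearest_zero_crossing(samples, 6))  # 7
```

Moving the cut to the returned index means the audio on both sides of the cut is near zero amplitude, so there is little or no step in the waveform.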
A split creates a new seeking point in the underlying framework. Some formats can seek with higher sample accuracy than others. Would you be willing to redo your experiment with a .wav file instead of .mp3? It would be interesting to know if you get the same results.
That is true when splitting from one clip to an unrelated clip. But in this case, he is making a split without changing the source clips. So it would be reasonable to expect that sample continuity would be maintained across the split in this case.
No, it is not. I’ve edited many times in many other software before. I know what I am talking about. Please don’t post about something you don’t really know about because it derails the thread.
Converting it to wav seems to avoid the issue and that’s a problem because I’ve tried this very experiment in Kdenlive with the very same mp3 clip in the very same part and this doesn’t happen there.
Many people have reported pops and clicks in their audio when using Shotcut. I believe this is what’s going on. I’m hearing pops and clicks where there shouldn’t be any and it’s all because of simply splitting audio.
Thanks. That was a helpful experiment.
Some more questions to help narrow down the problem:
- What is the sampling frequency of the original .mp3 file?
- What is the sampling frequency of the .wav file?
- What is the sampling frequency of the export file?
Sample rate conversion can be another factor. Sample rate conversion causes a delay of a few samples. When a split occurs, a new sample rate converter has to be created and started from scratch at the split point which would cause a sample discontinuity.
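To illustrate the idea, here is a toy sketch in Python. It does not use a real sample rate converter. The `MovingAverage` class is a made-up stand-in for any filter that carries history between samples, which is the property that matters here. Processing the whole signal with one filter is compared against processing the two halves with two fresh filters, as would happen at a split.

```python
import math


class MovingAverage:
    """Toy stand-in for a resampler's internal filter: it remembers past input."""

    def __init__(self, taps=8):
        self.history = [0.0] * taps  # a freshly created filter starts from silence

    def process(self, samples):
        out = []
        for s in samples:
            self.history = self.history[1:] + [s]
            out.append(sum(self.history) / len(self.history))
        return out


def largest_jump(samples, around, width=4):
    """Largest sample-to-sample step within `width` samples of `around`."""
    return max(abs(samples[i] - samples[i - 1])
               for i in range(around - width, around + width))


rate = 48000
signal = [math.sin(2 * math.pi * 440 * n / rate) for n in range(2000)]
split = 1000  # arbitrary split point where the waveform is not near zero

# One filter for the whole clip: history flows across the split point.
continuous = MovingAverage().process(signal)

# Two filters: the second one starts from scratch at the split.
restarted = (MovingAverage().process(signal[:split])
             + MovingAverage().process(signal[split:]))

jump_continuous = largest_jump(continuous, split)
jump_restarted = largest_jump(restarted, split)
print(jump_continuous, jump_restarted)
```

The output before the split is identical in both cases. At the split, the restarted version has a much larger step between neighbouring samples, and a step like that is what is heard as a click.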
The sample rate of the Shotcut preview player is always 48kHz. But the export sample rate depends on the export format and user settings. So there could be a situation where sample rate conversion occurs in preview but not in export or vice versa.
I’ve run into this a lot myself to the point that I’ve had to add a fade in/out whenever I reconnect clips. I have yet to figure out how to remedy this…
I went a step further and converted the mp3 file to a wav file using Shotcut’s wav export preset, with the sample rate changed to 44100. When I did the split on that clip, a slight pop/click could be heard in playback, but it was not as bad as what is heard in the demo above with the original mp3.
I then exported that edit as two wav files and two mp3 files using Shotcut’s export presets. One set of wav and mp3 files was left at the default 48000 sample rate and the other I changed to the 44100 sample rate. The pop/click is heard in the exported 48000 wav and mp3 files. It is not heard in the exported 44100 wav and mp3 files.
So it seems your suspicion of the sample rate as the cause of the problem is on the money.
This is a basic audio problem. It happened back when “clips” meant clipping the magnetic tape with scissors, and it is still with us in digital audio.
The only solution is timing the split at microsecond precision with a zero-crossing detector. It was this technology that was the breakthrough at Harrison Systems (now GLW) where I used to work designing professional studio audio gear.
Adding a “split at nearest zero crossing” feature would be a daunting software challenge.
I get around it by using Shotcut’s Keyframes on the Gain filter, dropping to -65 dB at every split.
Thank you for that detailed testing. So this specific artifact problem is exclusive to sample rate conversions. Unfortunately, there is nothing I can think of to easily fix this in the underlying framework - it is inherent in the architecture.
There could possibly be more we could do to make the user aware. For example, we could suggest the user convert to edit friendly when performing sample rate conversion. But it is not always obvious when that will happen since the user can change the sample rate in the export panel. Savvy users can be aware of their target sample frequency and could normalize all their files to match to avoid the conversion. I am open to suggestions on this.
So the only way to fix this is to do some sort of code refactoring?
Yes. I would call it an architectural change.
I recommend the work-arounds:
- Do not make unnecessary cuts
- Match audio sampling frequencies to the final intended sampling frequency
- Quickly fade in and out on cuts
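As a rough picture of what the last work-around does to the samples, here is a small Python sketch. The function name `apply_cut_fades`, the linear fade shape, and the fade length are all assumptions for illustration. In Shotcut itself you would use the fade filters or Gain keyframes rather than code.

```python
def apply_cut_fades(samples, cut_index, fade_len):
    """Fade out linearly into a cut and fade back in linearly after it.

    `cut_index` is the first sample of the clip after the cut. The samples on
    both sides of the cut are brought to zero, so no step remains at the join.
    """
    out = list(samples)
    for k in range(fade_len):
        gain = k / fade_len  # 0.0 right at the cut, rising toward 1.0
        after = cut_index + k
        before = cut_index - 1 - k
        if after < len(out):
            out[after] *= gain
        if before >= 0:
            out[before] *= gain
    return out


# A constant signal makes the gain ramp easy to read.
print(apply_cut_fades([1.0] * 10, cut_index=5, fade_len=4))
# [1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0]
```

With real audio the fade would be much longer than 4 samples. For example, 48 samples is 1 ms at 48 kHz, which is short enough that the dip is hard to notice.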
Could you consider adding more options for audio transitions? I’ve done a lot of editing with audio on different projects and simple fade in and outs work fine on some situations but not on others when the edit needs to be more precise. I found that sometimes the cross-fade can also work but other times it doesn’t work either and I find myself having to spend more time trying to find the right solution with the fades when in other software it isn’t necessary. I think it would really be useful to have more of the audio transitions out there made available instead of just one. For more complicated editing one of them could be the quick solution to getting the kind of edit that is aimed for while still avoiding pops/clicks.
If you could pick one audio transition to add, what would it be?
I suppose Constant Power, since one of the problems that can happen with the cross-fade is the dip in audio, which may not be desirable for that specific part. However, there isn’t always a single alternative to the cross-fade that works. That’s why other software offers more than one audio transition: different situations call for different ones. Is it a question of workload to add more than one?
I am ignorant on the different types of audio transitions. If there are some transitions already implemented in the underlying MLT framework, then it might not be too difficult to expose them. If not, I would need to learn about each one and implement it from scratch.
I see. Just as a reference, Premiere Pro and Resolve both have 3 kinds of audio transitions. Premiere Pro has Constant Gain, Constant Power and Exponential Fade. Resolve has a +3dB Cross Fade, a -3 dB Cross Fade and a 0dB Cross Fade.
I think it would be reasonable for Shotcut to have at least three. The default “cross-fade” that is already there, Constant Power and well I don’t know what is available in mlt.
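For reference, here is a short Python sketch of the usual textbook gain curves behind the names "constant gain" and "constant power". This is my own illustration and not taken from Premiere Pro, Resolve, Shotcut or MLT. The function names are made up, and the combined level is computed under the assumption that the two clips are uncorrelated, so that their powers add.

```python
import math


def constant_gain(t):
    """Linear crossfade: the two gains always sum to 1. t runs from 0 to 1."""
    return 1.0 - t, t


def constant_power(t):
    """Crossfade whose squared gains always sum to 1. t runs from 0 to 1."""
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)


def combined_level_db(gains):
    """Level of two uncorrelated, equally loud clips mixed with these gains."""
    g_out, g_in = gains
    return 10 * math.log10(g_out ** 2 + g_in ** 2)


# Compare the two curves halfway through the transition.
print(round(combined_level_db(constant_gain(0.5)), 2))  # -3.01
print(combined_level_db(constant_power(0.5)))           # approximately 0 dB
```

The linear curve drops by about 3 dB in the middle of the transition, which is the dip mentioned above. The constant power curve keeps the combined level steady.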
I don’t know the audio internals, so this idea may not work. But… if Shotcut internal audio processing is 48k, and media clips were always converted to 48k, and splits were made at 48k, but then the user changed export sample rate to 44.1k at the last minute… wouldn’t that be fine? The internal processing at 48k would have no pops. Downsampling a pop-less master to 44.1k should be fine.
What I don’t know is if Shotcut internal processing is always 48k, or if it changes with the export setting.
That is helpful for me to know. I do not use Premiere Pro or Resolve and I don’t aspire to learn them. But I will keep an eye out to see if I find easy ways to add more audio transitions over time.
This is a good line of thinking. Currently, Shotcut’s internal processing uses whatever sample rate is requested. So, when using the preview player, 48k is always requested. When exporting, the export settings dictate what is requested.
I think that a future enhancement will be to add audio configuration to the Video mode. So, when a user sets up a project, they would configure audio sampling frequency and number of channels at the same time they set up the resolution and frame rate. This will probably be the definitive solution in the future.