Splitting Audio Adds Pops/Clicks

Converting it to wav seems to avoid the issue. That is still a problem, though, because I’ve tried this very experiment in Kdenlive, with the very same mp3 clip split at the very same point, and the pops don’t happen there.

Many people have reported pops and clicks in their audio when using Shotcut. I believe this is what’s going on. I’m hearing pops and clicks where there shouldn’t be any, and it’s all because of simply splitting audio.

Thanks. That was a helpful experiment.

Some more questions to help narrow down the problem:

  • What is the sampling frequency of the original .mp3 file?
  • What is the sampling frequency of the .wav file?
  • What is the sampling frequency of the export file?

Sample rate conversion can be another factor. Sample rate conversion causes a delay of a few samples. When a split occurs, a new sample rate converter has to be created and started from scratch at the split point which would cause a sample discontinuity.
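To make the "started from scratch" point concrete, here is a minimal sketch. It is not Shotcut or MLT code. It assumes the converter behaves like a small FIR filter that keeps a few past samples as state; the 8-tap averaging filter, the 440 Hz tone, and the split position are all made up for illustration.

```python
import math

def fir_filter(samples, taps, history=None):
    """Apply an FIR filter; `history` is the last len(taps)-1 inputs seen before `samples`."""
    n = len(taps) - 1
    # A filter started from scratch has no history, so it assumes silence before the clip.
    state = list(history) if history is not None else [0.0] * n
    padded = state + list(samples)
    return [
        sum(tap * padded[i + n - k] for k, tap in enumerate(taps))
        for i in range(len(samples))
    ]

def max_step(samples, lo, hi):
    """Largest sample-to-sample jump in samples[lo:hi]."""
    return max(abs(samples[i] - samples[i - 1]) for i in range(lo, hi))

RATE = 48000
TAPS = [1 / 8] * 8   # hypothetical 8-tap averaging filter standing in for the converter
SPLIT = 1000         # hypothetical split point, in samples

tone = [math.sin(2 * math.pi * 440 * i / RATE) for i in range(2000)]

# One filter runs across the whole clip.
continuous = fir_filter(tone, TAPS)
# The clip is split and each half gets a brand-new filter with empty state.
restarted = fir_filter(tone[:SPLIT], TAPS) + fir_filter(tone[SPLIT:], TAPS)

step_continuous = max_step(continuous, SPLIT - 8, SPLIT + 8)
step_restarted = max_step(restarted, SPLIT - 8, SPLIT + 8)
print(step_continuous, step_restarted)
```

In this toy setup the restarted version shows a much larger jump at the split, and the two outputs agree again once the new filter has filled up with real samples. Handing the second filter the last seven samples of the first half as `history` removes the jump, which is roughly what an un-split clip gets for free.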

The sample rate of the Shotcut preview player is always 48kHz. But the export sample rate depends on the export format and user settings. So there could be a situation where sample rate conversion occurs in preview but not in export, or vice versa.

I’ve run into this a lot myself to the point that I’ve had to add a fade in/out whenever I reconnect clips. I have yet to figure out how to remedy this…

I went a step further and made a wav file with Shotcut’s wav export preset of the mp3 file and changed the sample rate to 44100. When I did the split on that clip a slight pop/click can be heard in playback but not as bad as what is heard in the demo above with the original mp3.

I then exported that edit as two wav files and two mp3 files using Shotcut’s export presets. One set of wav and mp3 files was left at the default 48000 sample rate, and for the other set I changed the sample rate to 44100. The pop/click is heard in the exported 48000 wav and mp3 files. It is not heard in the exported 44100 wav and mp3 files.

So it seems your suspicion of the sample rate as the cause of the problem is on the money. 🙂

This is a basic audio problem. It happened back when “clips” meant clipping the magnetic tape with scissors, and it is still with us in digital audio.

The only solution is timing the split at microsecond precision with a zero-crossing detector. It was this technology that was the breakthrough at Harrison Systems (now GLW) where I used to work designing professional studio audio gear.

Adding a “split at nearest zero crossing” would be a daunting software challenge.
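For anyone curious what "split at nearest zero crossing" means at the sample level, here is a rough sketch. It only searches a list of samples for the closest sign change, so it works at sample granularity rather than the microsecond precision mentioned above, and it ignores the fact that a video editor normally cuts on frame boundaries. The function name and the test tone are hypothetical.

```python
import math

def nearest_zero_crossing(samples, index):
    """Return the sample index closest to `index` where the signal changes sign or touches zero."""
    def crosses(i):
        if not 0 < i < len(samples):
            return False
        before, after = samples[i - 1], samples[i]
        return before <= 0.0 <= after or before >= 0.0 >= after

    # Search outward from the requested position, one sample at a time.
    for offset in range(len(samples)):
        for candidate in (index - offset, index + offset):
            if crosses(candidate):
                return candidate
    return index  # no crossing found, keep the requested position

RATE = 48000
tone = [math.sin(2 * math.pi * 440 * i / RATE) for i in range(2000)]

requested = 1000  # hypothetical split position chosen by the user
cut = nearest_zero_crossing(tone, requested)
print(requested, tone[requested], cut, tone[cut])
```

With this tone the requested position lands on a fairly loud sample, while the adjusted position sits right next to zero, so a cut there starts and ends close to silence.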

I get around it by using Shotcut’s Keyframes on the Gain filter, dropping to -65 dB at every split.

Thank you for that detailed testing. So this specific artifact problem is exclusive to sample rate conversions. Unfortunately, there is nothing I can think of to easily fix this in the underlying framework - it is inherent in the architecture.

There could possibly be more we could do to make the user aware. For example, we could suggest the user convert to edit friendly when performing sample rate conversion. But it is not always obvious when that will happen since the user can change the sample rate in the export panel. Savvy users can be aware of their target sample frequency and could normalize all their files to match to avoid the conversion. I am open to suggestions on this.

So the only way to fix this is to do some sort of code refactoring?

Yes. I would call it an architectural change.

I recommend the work-arounds:

  1. Do not make unnecessary cuts
  2. Match audio sampling frequencies to the final intended sampling frequency
  3. Quickly fade in and out on cuts
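Work-around 3 can be sketched in a few lines. This is only an illustration of the idea, not how Shotcut's fade filters are implemented; the function name, the 1 ms fade length, and the two constant-valued "clips" are assumptions chosen to make the effect easy to see.

```python
def fade_around_cut(samples, cut, fade_len):
    """Ramp the gain linearly down to zero just before `cut` and back up just after it."""
    out = list(samples)
    for k in range(fade_len):
        gain = k / fade_len  # 0.0 right at the cut, approaching 1.0 further away
        if cut - 1 - k >= 0:
            out[cut - 1 - k] *= gain  # fade out at the end of the first clip
        if cut + k < len(out):
            out[cut + k] *= gain      # fade in at the start of the second clip
    return out

def max_step(samples):
    """Largest sample-to-sample jump in the signal."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

# Two hypothetical clips whose levels do not line up at the join.
joined = [0.8] * 1000 + [-0.5] * 1000
faded = fade_around_cut(joined, cut=1000, fade_len=48)  # 48 samples is 1 ms at 48 kHz

print(max_step(joined), max_step(faded))
```

The hard join jumps by 1.3 in a single sample, while the faded version never moves more than a small fraction of that from one sample to the next.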

Could you consider adding more options for audio transitions? I’ve done a lot of editing with audio on different projects and simple fade in and outs work fine on some situations but not on others when the edit needs to be more precise. I found that sometimes the cross-fade can also work but other times it doesn’t work either and I find myself having to spend more time trying to find the right solution with the fades when in other software it isn’t necessary. I think it would really be useful to have more of the audio transitions out there made available instead of just one. For more complicated editing one of them could be the quick solution to getting the kind of edit that is aimed for while still avoiding pops/clicks.

If you could pick one audio transition to add, what would it be?

I suppose Constant Power, since one of the problems that can happen with the cross-fade is the dip in audio, which may not be desirable for that specific part. However, the solution other than cross-fade doesn’t always end up being just one other way. That’s why other software offers more than one audio transition: different situations call for different ones. Is it a question of workload to add more than one?
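To show where the dip comes from, here is a small sketch comparing a linear cross-fade with a sine/cosine cross-fade, which is the usual textbook meaning of constant power. I am not claiming this is how Shotcut, MLT, Premiere Pro or Resolve implement their transitions. The function names are made up, and the level calculation assumes the two clips are equally loud and uncorrelated.

```python
import math

def constant_gain(t):
    """Linear cross-fade gains (outgoing, incoming) at position t in [0, 1]."""
    return 1.0 - t, t

def constant_power(t):
    """Sine/cosine cross-fade gains (outgoing, incoming) at position t in [0, 1]."""
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)

def combined_level_db(gain_out, gain_in):
    """Combined power of two equally loud, uncorrelated clips, in dB relative to one clip."""
    return 10 * math.log10(gain_out ** 2 + gain_in ** 2)

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    linear_db = combined_level_db(*constant_gain(t))
    power_db = combined_level_db(*constant_power(t))
    print(t, round(linear_db, 2), round(power_db, 2))
```

Under those assumptions the linear fade sags by about 3 dB at the midpoint, while the sine/cosine pair keeps the combined power level throughout. For two clips that are nearly identical at the join, the linear fade is the one that stays level, which is one reason editors offer more than one curve.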

I am ignorant on the different types of audio transitions. If there are some transitions already implemented in the underlying MLT framework, then it might not be too difficult to expose them. If not, I would need to learn about each one and implement it from scratch.

I see. Just as a reference, Premiere Pro and Resolve both have 3 kinds of audio transitions. Premiere Pro has Constant Gain, Constant Power and Exponential Fade. Resolve has a +3dB Cross Fade, a -3 dB Cross Fade and a 0dB Cross Fade.

I think it would be reasonable for Shotcut to have at least three: the default “cross-fade” that is already there, Constant Power, and a third one, though I don’t know what is available in MLT.

I don’t know the audio internals, so this idea may not work. But… if Shotcut internal audio processing is 48k, and media clips were always converted to 48k, and splits were made at 48k, but then the user changed export sample rate to 44.1k at the last minute… wouldn’t that be fine? The internal processing at 48k would have no pops. Downsampling a pop-less master to 44.1k should be fine.

What I don’t know is if Shotcut internal processing is always 48k, or if it changes with the export setting.

That is helpful for me to know. I do not use Premiere Pro or Resolve and I don’t aspire to learn them. But I will keep an eye out to see if I find easy ways to add more audio transitions over time.

This is a good line of thinking. Currently, Shotcut’s internal processing uses whatever sample rate is requested. So, when using the preview player, 48k is always requested. When exporting, the export settings dictate what is requested.

I think that a future enhancement will be to add audio configuration to the Video mode. So, when a user sets up a project, they would configure audio sampling frequency and number of channels at the same time they set up the resolution and frame rate. This will probably be the definitive solution in the future.


Oops. You’re right. For some reason I was thinking about trimming the beginning of a clip instead of simply making a cut and then doing nothing with the cut.

I like this. Perhaps, if somebody defines a 5.1 surround project but wants to export a stereo version too, then downmix intelligence can get it there. (Maybe not as good as a human mastering engineer doing a dedicated mix, but still pretty good.) But today, if the export is switched to stereo and processing happens in stereo but all the pan and gain settings were designed for surround, then the export will be hopelessly messed up.

This is a time-honored technique.

When I worked in the music industry in Nashville in the days before digital audio, the final stage before sending a tape to the mastering lab was “ducking out the cuts”: copying the finished work to a new tape and, with a deft motion of the hand on the fader, dropping the gain momentarily at every “click” that came from a tape splice.
