Basically, your audio is suffering from almost every conversion error possible. The good news is that your ears were sensitive enough to detect it.
The output settings in Reaper need to change because it's currently exporting in a lossy compressed format. That audio will be re-encoded by Shotcut for the video export, then re-encoded again by YouTube or whatever delivery method is being used. Quality loss is cumulative across each stage. Because of this transcoding chain, the exact audio exported from Reaper cannot be preserved all the way to the final viewer, and you will have zero control over what YouTube does to your audio when it re-encodes.
So it's best to over-prepare. Sounding "good" out of Reaper is not good enough to survive two more generations of encoding. It has to sound absolutely perfect out of Reaper to get the highest-quality audio at the end of the chain.
The workflow generally looks like this:
- In Reaper, the export sampling rate should match the sampling rate of the source material. If the audio was recorded at 44.1kHz, then the export should be 44.1kHz. Converting from 44.1kHz to 48kHz or vice versa during export will introduce resampling artefacts.
- In Reaper, the export format should be lossless, like 24-bit PCM or 32-bit FP (floating point). The previous setting of AAC at 128kbps is far too low to survive two more generations of transcoding. AAC already discards audio information to begin with, and going down to 128kbps makes a bad situation worse.
- In Shotcut, go to the Export panel > Advanced button > Audio tab and set the sampling rate to the same value as the Reaper export. Same situation: if Reaper exported 44.1kHz but Shotcut is encoding at 48kHz, resampling artefacts will be created.
- In Shotcut, still on the Audio tab, avoid using AAC if music is critical to you. Use AC-3 at 448kbps if exporting as MP4. If you're okay with exporting as Matroska (MKV), then the audio can be exported as FLAC, which is lossless. It depends on how demanding your quality requirements are.
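If you want to double-check what Reaper actually wrote before setting the Shotcut sampling rate, something like the following could help. It is only a rough sketch, not part of either program: it assumes the Reaper export is a plain RIFF WAV file, and the function names `read_wav_format` and `check_rate` are made up for illustration. It reads the WAV header directly, so it should also cope with 32-bit FP files.

```python
import struct
from fractions import Fraction


def read_wav_format(path):
    """Return (format_tag, channels, sample_rate, bits_per_sample) from a WAV header."""
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a plain RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                fmt = f.read(chunk_size)
                if len(fmt) < 16:
                    raise ValueError("'fmt ' chunk is too short")
                # The first 16 bytes of 'fmt ' hold the basic format fields.
                tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                    "<HHIIHH", fmt[:16]
                )
                return tag, channels, rate, bits
            # Skip any other chunk; chunks are padded to an even byte count.
            f.seek(chunk_size + (chunk_size & 1), 1)


def check_rate(path, target_rate):
    """Compare the file's sampling rate with the rate planned for the video export."""
    _tag, _channels, rate, bits = read_wav_format(path)
    if rate == target_rate:
        return f"OK: {rate} Hz, {bits}-bit, no resampling needed"
    ratio = Fraction(target_rate, rate)  # e.g. 48000/44100 reduces to 160/147
    return (
        f"MISMATCH: file is {rate} Hz but target is {target_rate} Hz "
        f"(resampling ratio {ratio})"
    )
```

For example, `check_rate("mix.wav", 48000)` (where `mix.wav` is a placeholder name) on a 44.1kHz file reports a mismatch with a ratio of 160/147. That awkward non-integer ratio is why the conversion has to interpolate new samples rather than simply copy the existing ones.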
The reasons, charts, graphs, etc. behind these recommendations are detailed in this similar thread:
Crackling/Distorted Audio in Exported Video
There is no reason to go to 96kHz(*). Any good-quality analog-to-digital converter would have oversampled during recording and produced a clean 48kHz or 44.1kHz audio file. Once in the digital domain, additional anti-aliasing (rolloff) filters are not applied so long as the sampling rate doesn't change. There is no need for more rolloff because, when the sampling rates are equal, a digital source can't contain a frequency higher than the destination can capture. Since the main benefit of 96kHz is shifting the rolloff filters into the inaudible range, the absence of those filters means the benefits of 96kHz are greatly reduced.
(*): A studio setting with a pristine setup… maybe 96kHz has a place during processing. But the difference between 48kHz and 96kHz will never be noticed in the average consumer’s home.
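To put rough numbers on the 96kHz point: the highest frequency a sampling rate can capture is half that rate (the Nyquist frequency). The little sketch below just does that arithmetic. The 20kHz hearing limit is the commonly quoted figure and is an assumption here, not a measurement of anyone's ears.

```python
# Commonly cited upper limit of human hearing (an assumption for illustration).
HEARING_LIMIT_HZ = 20_000


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


for rate in (44_100, 48_000, 96_000):
    nyquist = nyquist_hz(rate)
    headroom = nyquist - HEARING_LIMIT_HZ
    print(f"{rate} Hz: Nyquist {nyquist:.0f} Hz, {headroom:.0f} Hz above the hearing limit")
```

Both 44.1kHz and 48kHz already reach past the assumed hearing limit. What 96kHz adds is extra room above it, which only matters when a rolloff filter needs that space to work in.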