If you want to stay completely inside Shotcut, you can use the High Pass filter on a speech track and raise the cutoff frequency just to the point where it starts to affect the speech. If there are other background noises at specific frequencies (look for bars on the Spectrum Analyzer graph that are taller than they should be, outside the speech range), you can add a Notch filter at each of those frequencies with a bandwidth of 150 Hz or less and a rolloff of around 6. A rough sketch of what those two filters do is below.
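If you're curious what that filter chain actually does to the audio, here is a minimal Python sketch of the same idea using scipy. This is not Shotcut's implementation; the file name and the 80 Hz / 60 Hz frequencies are placeholder assumptions you would tune by ear and with the Spectrum Analyzer:

```python
# Sketch of a High Pass + Notch chain (illustrative only, not Shotcut's code).
# Assumes a mono 16-bit WAV file.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, iirnotch, sosfiltfilt, filtfilt

rate, audio = wavfile.read("speech.wav")   # placeholder file name
audio = audio.astype(np.float64)

# High pass: raise the cutoff until just below where it starts to
# affect the speech. 80 Hz is a common starting point.
sos = butter(2, 80, btype="highpass", fs=rate, output="sos")
audio = sosfiltfilt(sos, audio)

# Notch out a specific offending frequency, e.g. 60 Hz mains hum.
# iirnotch takes Q instead of bandwidth: Q = center / bandwidth,
# so "150 Hz bandwidth or less" at a 1 kHz center means Q of about 7 or more.
b, a = iirnotch(w0=60, Q=30, fs=rate)
audio = filtfilt(b, a, audio)

wavfile.write("cleaned.wav", rate,
              np.clip(audio, -32768, 32767).astype(np.int16))
```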
If you’re willing to jump outside of Shotcut and use Audacity for noise reduction, be sure to read the notes in the Audacity manual about how and when to apply the effect. The final sound can be very bad if noise reduction is done in the wrong order in your processing chain, because any artefacts it generates get amplified by the effects that come after it.
Audacity noise reduction filter manual: https://manual.audacityteam.org/man/noise_reduction.html
If you are willing to jump outside of Shotcut, want even better results, and are willing to pay money for them, then take a look at iZotope RX Elements for $129. This is the same denoiser that major studios use, and I personally feel this is one area of the processing chain where open source software has miles to go before it catches up with commercial offerings. I paid for iZotope because I felt it was that much better. I am not affiliated with them in any way.
https://www.sweetwater.com/store/detail/RX7El--izotope-rx-elements
The iZotope Elements Suite is only $199 if you also want Ozone for mastering. The transient shaper tool is especially useful for customizing snare drum sounds, if you’re into recording music performances.
https://www.sweetwater.com/store/detail/ElementsSte2--izotope-elements-suite-v4
The difference between Audacity and iZotope is that Audacity’s noise reduction is essentially a glorified graphic EQ filter. It uses spectral noise gating: it divides the frequency spectrum into 1025 narrow bands, like an old-school graphic equalizer on steroids, and is trained by analyzing a patch of noise-only audio to measure how much energy the noise carries in each band. When the filter is then applied to your real audio (speech), any band whose energy is not significantly higher than the noise fingerprint gets reduced, because the filter assumes no signal (speech) is currently happening in that band. This inevitably produces artefacts, because the per-band gains switch abruptly between neighboring bands and from one moment to the next, and the human ear is sensitive enough to pick up those sudden changes (the effect is often called “musical noise”). The technique also breaks down if the noise profile pulses or changes over time, like vehicle traffic or crowd noise, because the fingerprint no longer matches what is actually on the recording.
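To make that description concrete, here is a toy spectral noise gate in Python. It is a bare-bones sketch of the general technique, not Audacity’s actual code (Audacity smooths the gain changes across time and frequency precisely to tame the artefacts described above). A 2048-point FFT is what yields the 1025 bands; the threshold and floor values are made-up knobs:

```python
# Toy spectral noise gate (general technique only, NOT Audacity's code).
# Mono audio assumed.
import numpy as np
from scipy.signal import stft, istft

FFT_SIZE = 2048   # 2048-point FFT -> 2048/2 + 1 = 1025 frequency bands

def noise_fingerprint(noise, rate):
    """Learn the average magnitude per band from a noise-only clip."""
    _, _, spec = stft(noise, fs=rate, nperseg=FFT_SIZE)
    return np.abs(spec).mean(axis=1)                 # shape: (1025,)

def spectral_gate(audio, rate, fingerprint, threshold=2.0, floor=0.1):
    """Attenuate every band whose energy isn't well above the fingerprint."""
    _, _, spec = stft(audio, fs=rate, nperseg=FFT_SIZE)
    mag = np.abs(spec)                               # (1025 bands, frames)
    # A band stays open only when it carries significantly more energy
    # than the learned noise did; otherwise it is pulled down to the floor.
    gate = np.where(mag > threshold * fingerprint[:, None], 1.0, floor)
    _, out = istft(spec * gate, fs=rate, nperseg=FFT_SIZE)
    return out
```

Notice that the `gate` array flips between 1.0 and `floor` independently in every band and every frame; those abrupt flips are exactly where the musical-noise artefacts come from, and why a pulsing noise source defeats the static fingerprint.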
This isn’t to say Audacity’s NR is bad. Once you know how it works, you know how to get the best results out of it: it works best in a controlled recording environment where any background noise is extremely consistent, and in that situation it can produce great results.
If a controlled environment isn’t possible, then iZotope is your only real option. It uses a learning algorithm rather than static EQ binning, and that extra sophistication produces much better results.