There is another way if you don’t mind a bit of manual work.
Using Audacity or any other audio editor/DAW, put the soundtrack you want on a stereo track (mono is also fine).
Now create another stereo (or mono) track.
Into this track, insert audio “blips” at the points where you want cuts.
These “blips” can be any audio tone or even white/pink noise.
I found that it is easier later in SC if each “blip” lasts 80 ms (2 frames at a video rate of 25 fps), but that is up to you.
Once all done, mute the “blips” track and export only the soundtrack.
Now mute the soundtrack and export only the “blips” track.
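If you already know the cut times, the “blips” track could also be generated by a script instead of being drawn by hand. Below is a rough sketch using only the Python standard library. The function names, the 1 kHz tone, the 48 kHz sample rate and the half-scale level are my own assumptions, not anything Audacity or SC requires.

```python
import math
import struct
import wave


def blip_duration_s(fps, frames=2):
    """Length in seconds of a blip that spans `frames` video frames."""
    return frames / fps


def write_blip_track(path, cut_times_s, total_s, fps=25,
                     sample_rate=48000, tone_hz=1000.0):
    """Write a mono 16-bit WAV that is silent except for a tone burst
    starting at each time in cut_times_s. Returns the blip length in samples."""
    n_total = int(round(total_s * sample_rate))
    n_blip = int(round(blip_duration_s(fps) * sample_rate))
    samples = [0] * n_total
    for t in cut_times_s:
        start = int(round(t * sample_rate))
        for i in range(n_blip):
            if start + i < n_total:
                # Half of full scale, to leave some headroom (assumed level).
                value = 0.5 * 32767 * math.sin(2 * math.pi * tone_hz * i / sample_rate)
                samples[start + i] = int(value)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % n_total, *samples))
    return n_blip
```

For example, `write_blip_track("blips.wav", [12.0, 31.48], total_s=60.0)` would write a one-minute track with two 80 ms blips. The file name and times here are made up for illustration.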
Open SC and place the wanted video tracks.
Create an audio track and add the exported soundtrack.
Add another audio track and add the “blips” track to it.
As shown below, the video track/s are on top (the blue), the soundtrack on A1 and “blips” on A2:
Now, whenever you see a “blip”, cut the video at the same point.
Once done, mute “blips” track A2 and export.
It is certainly not automatic, but it does make cutting the video at the correct places quicker, as the “blips” are already inserted at the right points.
Depending on your needs, it may do the trick.
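If you would rather have a list of cut points than spot each “blip” by eye, the exported “blips” track can be scanned by a script. This is only a sketch under my own assumptions: the track is a mono 16-bit WAV, anything above 10% of full scale counts as a blip, blips are at least 0.2 s apart, and the frame rate is a whole number. The function names are hypothetical.

```python
import struct
import wave


def find_blips(path, threshold=0.1, min_gap_s=0.2):
    """Return the start time in seconds of each blip in a mono 16-bit WAV."""
    with wave.open(path, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("expected a mono 16-bit WAV")
        rate = w.getframerate()
        data = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(data) // 2), data)
    limit = threshold * 32767
    onsets = []
    for i, s in enumerate(samples):
        if abs(s) >= limit:
            t = i / rate
            # Loud samples close to the last onset belong to the same blip.
            if not onsets or t - onsets[-1] >= min_gap_s:
                onsets.append(t)
    return onsets


def to_timecode(t, fps=25):
    """Format seconds as hh:mm:ss:ff, rounded to the nearest video frame."""
    total_frames = int(round(t * fps))
    ff = total_frames % fps
    total_s = total_frames // fps
    return "%02d:%02d:%02d:%02d" % (total_s // 3600, total_s // 60 % 60,
                                    total_s % 60, ff)
```

Printing `to_timecode(t)` for each value from `find_blips("blips.wav")` gives a list of positions to jump to on the timeline. The cuts themselves are still made by hand.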
There is a third way where you insert the “blips” at record time (on the camera).
Will post that in a day or two when I have some spare time.
Essentially it works like this:
If you are recording the footage yourself (as shown in the YouTube video you linked to), use a camera with a stereo external audio input.
On the left channel, you feed the wanted audio from a mic.
On the right channel, you feed a tone burst controlled by a switch.
You press this switch (to create the tone burst) at each point where you want a cut.
From there it’s easy, as the tone bursts then act as markers for your cuts.
If you are keen, I can draw up a schematic diagram showing how to build the tone-burst switch.
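With the camera method, the wanted audio and the tone bursts end up in one stereo recording, so the two channels need separating before editing. Most audio editors can split a stereo track, but here is a rough script version. It assumes the clip's audio has already been extracted to a 16-bit stereo WAV. The function name and file names are hypothetical.

```python
import wave


def split_stereo(src, left_path, right_path):
    """Split a 16-bit stereo WAV into two mono WAVs.
    Left = wanted audio, right = tone-burst markers.
    Returns the number of frames written to each file."""
    with wave.open(src, "rb") as w:
        if w.getnchannels() != 2 or w.getsampwidth() != 2:
            raise ValueError("expected a stereo 16-bit WAV")
        rate = w.getframerate()
        data = w.readframes(w.getnframes())
    # Stereo frames are interleaved: L0 R0 L1 R1 ..., 2 bytes per sample.
    left = bytearray()
    right = bytearray()
    for i in range(0, len(data) - 3, 4):
        left += data[i:i + 2]
        right += data[i + 2:i + 4]
    for path, mono in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(2)
            out.setframerate(rate)
            out.writeframes(bytes(mono))
    return len(left) // 2
```

Something like `split_stereo("camera.wav", "soundtrack.wav", "markers.wav")` would then give a soundtrack for A1 and a marker track for A2.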