Too Long; Didn't Read
Mitsubishi and Indiana University have published a new model, along with a new dataset, for the task of separating a soundtrack into its component sounds. The problem is to isolate each independent sound source from a complex acoustic scene, such as a movie scene or a YouTube video, where some sounds are not well balanced. If you can isolate the different categories of sound in a soundtrack, you can also turn any one of them up or down on its own, for example lowering the music a bit so the actors' voices are easier to hear.
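To make the "turn one category up or down" idea concrete, here is a minimal sketch of the remixing step only, assuming the separation has already been done. It does not use the published model: the stems are synthetic sine waves, and the stem names ("speech", "music", "effects"), the `remix` helper, and the 16 kHz sample rate are hypothetical choices made for illustration.

```python
import numpy as np

def remix(stems, gains_db):
    """Sum separated stems after applying a per-stem gain in decibels.

    stems: dict mapping a (hypothetical) stem name to a 1-D float array.
    gains_db: dict mapping stem name to a gain in dB; missing stems get 0 dB.
    """
    first = next(iter(stems.values()))
    mix = np.zeros_like(first, dtype=np.float64)
    for name, audio in stems.items():
        # Convert decibels to a linear amplitude factor.
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)
        mix += gain * audio
    return mix

# Synthetic stand-ins for separated stems (1 second at an assumed 16 kHz).
sr = 16000
t = np.arange(sr) / sr
stems = {
    "speech": 0.3 * np.sin(2 * np.pi * 220 * t),
    "music": 0.3 * np.sin(2 * np.pi * 440 * t),
    "effects": 0.1 * np.sin(2 * np.pi * 880 * t),
}

# Lower only the music by 6 dB; speech and effects stay untouched.
remixed = remix(stems, {"music": -6.0})
```

With real audio, the arrays would come from whatever separation model you use. The remix itself is just a weighted sum, which is why good separation makes this kind of volume control straightforward.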