Stop and think about it for just a moment. You can't filter out just the music (or even reduce it) without also degrading the dialog, sound effects, etc., since all of it occupies the same frequency spectrum. And since the mix changes from moment to moment, you can't simply apply a fixed filter. You can try, but I doubt the results will be usable.
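To make the point concrete, here is a toy sketch in plain Python. Everything in it is an assumption for illustration, not anything from your actual audio: the 8 kHz sample rate, the two 1 kHz test tones standing in for "dialog" and "music", and the 5-tap moving average standing in for "a fixed filter". Real audio is broadband and changes constantly, which only makes things worse. The sketch just shows that a fixed linear filter applies the same gain to everything at a given frequency, no matter which source it came from.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for illustration
TAPS = 5            # length of the hypothetical fixed filter
N_SAMPLES = 8000    # one second of audio


def tone(freq_hz, amplitude, phase, n_samples=N_SAMPLES):
    """Generate a sine tone as a list of samples."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    return [amplitude * math.sin(step * i + phase) for i in range(n_samples)]


def moving_average(signal, taps=TAPS):
    """Fixed FIR low-pass filter: each output is the mean of `taps` inputs."""
    return [
        sum(signal[i - taps + 1 : i + 1]) / taps
        for i in range(taps - 1, len(signal))
    ]


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def gain(signal, taps=TAPS):
    """Ratio of output level to input level for the fixed filter."""
    filtered = moving_average(signal, taps)
    return rms(filtered) / rms(signal[taps - 1 :])


# Two stand-in sources that share the same frequency (1 kHz) but differ
# in loudness and phase, as overlapping real-world sources would.
dialog = tone(1000, amplitude=1.0, phase=0.0)
music = tone(1000, amplitude=0.6, phase=1.1)
mix = [d + m for d, m in zip(dialog, music)]

# The filter is linear, so filtering the mix is the same as filtering
# each source separately and adding the results.
filtered_mix = moving_average(mix)
filtered_sum = [
    d + m for d, m in zip(moving_average(dialog), moving_average(music))
]

print("dialog gain:", gain(dialog))
print("music gain: ", gain(music))
print("max mix/sum difference:",
      max(abs(a - b) for a, b in zip(filtered_mix, filtered_sum)))
```

Both gains come out the same, so the music-to-dialog ratio after filtering is exactly what it was before. Whatever you take out of the music at that frequency, you take out of the dialog too.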
Your problem is very similar to some of the situations I encountered as a SONAR Tech. We had millions of dollars' worth of VERY specialized equipment available to us and still had to live with far-less-than-perfect results.