Ooh, good luck with that. I have quite a bit of experience with speech recognition on the plain spoken word, and I can tell you that even ordinary speech doesn't give you a great deal of accuracy. So even if you manage to extract the vocal component from the audio, you will probably find that the speech recognition step does not give you accurate results, because accents have a huge effect on how lyrics are understood. Consider the song "Israelites" by Desmond Dekker: he famously sang "oh oh, me Israelite". What many people heard, however, was "oh oh, me ears are alight".