How to convert the audio of the currently playing song to text

Question

My project is to read the sound wave from the currently playing video and convert it to text: that is, to convert the audio of the song to text and display it on the screen at run time. I need some help with this.

Posted 29-Jan-14 2:02am

ch.haya

2 solutions

Solution 1

See this earlier question: "how to convert a wave file of english song to text? with timer?"

Posted 29-Jan-14 2:14am

Kornfeld Eliyahu Peter

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Pete O'Hanlon · Accepted Answer · 2014-01-29T02:24:00

Solution 2

Ooh, good luck with that. I have quite a bit of experience with speech recognition on the spoken word, and I can tell you that even plain speech doesn't give you a great deal of accuracy. So, even if you manage to extract the vocal component from the audio, you will probably find that the speech recognition part does not give you accurate results, because accents have a huge effect on how lyrics are understood. Consider the song "Israelites" by Desmond Dekker: he famously sang "oh oh, me Israelite". What many people heard, however, was "oh oh, me ears are alight".

Posted 29-Jan-14 2:24am

Pete O'Hanlon

Comments

ch.haya 29-Jan-14 12:09pm

OK, thanks, but kindly guide me on the code. I have code that converts a spoken English sentence to text with the SAPI API, but it does not work for a song. Kindly update the following code:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using EllisMIS.Audio.Transcription.Microsoft;

namespace MicrosoftSpeechToTextExample
{
    public partial class Form1 : Form
    {
        Dictation _transcriber;

        public Form1()
        {
            InitializeComponent();
        }

        private void btnWavFile_Click(object sender, EventArgs e)
        {
            // Not sure if a .Dispose is needed at all, but threw it in there.
            if (_transcriber != null)
            {
                _transcriber.Dispose();
            }
            _transcriber = new Dictation();
            SetEvents();
            _transcriber.Start("example.wav");
        }

        void _transcriber_SpeechHypothesizingEvent(object sender, System.Speech.Recognition.SpeechHypothesizedEventArgs e)
        {
            Console.WriteLine("Speech Recognizing: " + e.Result.Text);
        }

        void transcriber_SpeechRecognizedEvent(object sender, System.Speech.Recognition.SpeechRecognizedEventArgs e)
        {
            Console.WriteLine("Speech Recognized: " + e.Result.Text);
        }

        public void SetEvents()
        {
            // Unsubscribe first so the handlers are never attached twice.
            _transcriber.SpeechRecognizedEvent -= new Dictation.SpeechRecognizedEventHandler(transcriber_SpeechRecognizedEvent);
            _transcriber.SpeechHypothesizingEvent -= new Dictation.SpeechHypothesizingEventHandler(_transcriber_SpeechHypothesizingEvent);
            _transcriber.SpeechRecognizedEvent += new Dictation.SpeechRecognizedEventHandler(transcriber_SpeechRecognizedEvent);
            _transcriber.SpeechHypothesizingEvent += new Dictation.SpeechHypothesizingEventHandler(_transcriber_SpeechHypothesizingEvent);
        }
    }
}

using System;
using System.IO;
using System.Speech.Recognition;

namespace EllisMIS.Audio.Transcription.Microsoft
{
    public class Dictation : IDisposable
    {
        #region Local Variables
        private SpeechRecognitionEngine _speechRecognitionEngine = null;
        private DictationGrammar _dictationGrammar = null;
        private bool _disposed = false;
        #endregion

        #region Constructors
        public Dictation()
        {
            ConstructorSetup();
        }

        public Dictation(DictationGrammar targetGrammar)
        {
            _dictationGrammar = targetGrammar;
            ConstructorSetup();
        }
        #endregion

        /// <summary>
        /// Start the transcriber using your default microphone.
        /// </summary>
        public void Start()
        {
            _speechRecognitionEngine.SetInputToDefaultAudioDevice();
            StartSetup();
        }

        /// <summary>
        /// Transcribe a .wav file
        /// </summary>
        /// <param name="targetWavFile"></param>
        public void Start(string targetWavFile)
        {
            if (!File.Exists(targetWavFile))
            {
                // Pass the actual path, not the parameter name, as the file name.
                throw new FileNotFoundException("Specified WAV file does not exist.", targetWavFile);
            }
            _speechRecognitionEngine.SetInputToWaveFile(targetWavFile);
            StartSetup();
        }

        private void StartSetup()
        {
            if (_dictationGrammar == null)
            {
                _dictationGrammar = new DictationGrammar();
            }
            _speechRecognitionEngine.LoadGrammar(_dictationGrammar);
            _speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
            _speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
            _speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
            _speechRecognitionEngine.Sp

S Houghtelin 31-Jan-14 7:40am

I don't think this person gets that spoken words and words with modulated pace and frequency are two different things. They keep reposting this request. If they do succeed I'll be the first to doff my cap to them.
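
One way to see the accuracy problem described in Solution 2 for yourself is to print the recognizer's confidence score next to each phrase it returns. The following is only a minimal sketch, not a fix for the code posted above: it assumes a console project that references the System.Speech assembly on Windows, and the class name WavDictationSketch, the default file name example.wav and the 0.5 cut-off are illustrative choices, not recommended values.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

class WavDictationSketch
{
    // Hypothetical cut-off for flagging weak results; tune it for your own audio.
    const float MinConfidence = 0.5f;

    static void Main(string[] args)
    {
        // Assumed input: a WAV file path on the command line, else "example.wav".
        string wavPath = args.Length > 0 ? args[0] : "example.wav";

        using (var done = new ManualResetEvent(false))
        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            // Print every recognized phrase together with its confidence (0.0 to 1.0).
            engine.SpeechRecognized += (sender, e) =>
            {
                string flag = e.Result.Confidence >= MinConfidence ? "ok " : "low";
                Console.WriteLine("[{0} {1:F2}] {2}", flag, e.Result.Confidence, e.Result.Text);
            };

            // Raised when recognition stops, for example at the end of the file.
            engine.RecognizeCompleted += (sender, e) => done.Set();

            engine.RecognizeAsync(RecognizeMode.Multiple);
            done.WaitOne(); // Block until the whole file has been processed.
        }
    }
}
```

Running this first on a clean spoken recording and then on a song should make the difference visible: with music and sung vocals, expect few phrases, low confidence values, or text that does not match the lyrics.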