Hello, I want to convert an audio file to a text file. The problem is that when I print the value of (stream) it is displayed, but the recognizer and result.getHypothesis() display nothing, and I cannot understand where the problem is. Can anyone help me solve this?
Here is my code:

    package com.dsquare;

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.PrintStream;
    import java.io.StringWriter;
    import java.net.URL;
    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.api.LiveSpeechRecognizer;
    import edu.cmu.sphinx.api.SpeechResult;
    import edu.cmu.sphinx.api.StreamSpeechRecognizer;
    import edu.cmu.sphinx.result.WordResult;
    import java.text.ParseException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.json.JSONObject;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.BufferedReader;
    import com.amazonaws.services.datapipeline.model.Field;
    import com.dsquare.sbs.AuthenticationModule;
    import com.dsquare.sbs.DSQ_Constants;
    import com.dsquare.sbs.DSQ_SBSSystem;
    import com.dsquare.sbs.data.connector.DSQ_ConnectionFactory;
    import com.dsquare.sbs.data.connector.DSQ_GenericConfigConnector;
    import com.dsquare.sbs.socialMedia.webContent.DSQ_HTMLParser;
    import com.dsquare.sbs.socialMedia.youtube.DSQ_YouTubeAPI;
    //import com.dsquare.sbs.socialMedia.webContent.DSQ_GoogleAnalytics;
    import com.dsquare.sbs.utils.DSQUtils;
    import com.google.api.client.util.IOUtils;
    import com.restfb.*;

    public class AppRunner
    {
        public static void main(String[] args) throws ParseException, IllegalArgumentException, IllegalAccessException
        {
            DSQ_ConnectionFactory.initialize();
            DSQUtils.createSchemaMapInMemory();
            DSQ_GenericConfigConnector BSC = DSQ_ConnectionFactory.getConfigConnector();
            Configuration configuration = new Configuration();
            configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
            configuration.setDictionaryPath("file:E:/anvita_work/cmudict-en-us.dict");
            configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");
            StreamSpeechRecognizer recognizer;
            try {
                recognizer = new StreamSpeechRecognizer(configuration);
                InputStream stream = AppRunner.class.getResourceAsStream("/com/dsquare/Arabtec Construction (INDIA) Private Limited.wav");
                String resultone = getStringFromInputStream(stream);
                /* the value of stream is displayed */
                System.out.println("*******************");
                System.out.println(resultone);
                System.out.println("*******************");
                stream.skip(44);
                // Simple recognition with generic model
                recognizer.startRecognition(stream);
                SpeechResult result;
                while ((result = recognizer.getResult()) != null) {
                    /* the value of result.getHypothesis() does not display anything */
                    System.out.println("List of recognized words and their times:" + result.getHypothesis());
                    for (WordResult r : result.getWords()) {
                        System.out.println(r);
                    }
                }
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }

        private static String getStringFromInputStream(InputStream is) {
            BufferedReader br = null;
            StringBuilder sb = new StringBuilder();
            String line;
            try {
                br = new BufferedReader(new InputStreamReader(is));
                while ((line = br.readLine()) != null) {
                    sb.append(line);
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                if (br != null) {
                    try {
                        br.close();
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
            return sb.toString();
        }
    }
please reply
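One thing that may be worth checking in the posted code, independent of the audio format: getStringFromInputStream(stream) reads the stream to its end (and closes it) before the same stream is handed to recognizer.startRecognition(stream), so the recognizer may be given a stream with nothing left to read. The sketch below shows only the general Java behaviour with plain java.io classes. StreamReuseDemo, drain and the 1000-byte placeholder array are made up for illustration and stand in for the WAV resource; no Sphinx calls are involved.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class StreamReuseDemo {

    // Reads everything that is left in the stream and returns the byte count.
    static int drain(InputStream in) {
        byte[] buffer = new byte[4096];
        int total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for the WAV file (assumption for the demo).
        byte[] fakeWav = new byte[1000];

        InputStream stream = new ByteArrayInputStream(fakeWav);
        System.out.println("first consumer read:  " + drain(stream) + " bytes");
        // The same stream object is now at end-of-data for any later consumer.
        System.out.println("second consumer read: " + drain(stream) + " bytes");

        // Opening a fresh stream gives the next consumer the full data again.
        InputStream fresh = new ByteArrayInputStream(fakeWav);
        System.out.println("fresh stream read:    " + drain(fresh) + " bytes");
    }
}
```

If this is what happens here, one option would be to drop the debug print, or to call getResourceAsStream a second time just before startRecognition. That is a suggestion to try, not a confirmed diagnosis.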
Most likely your input file has the wrong format. It must be a 16 kHz, 16-bit, mono WAV file.
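The format named above can be verified before running the recognizer. The following is a minimal sketch using the standard javax.sound.sampled classes. WavFormatCheck and isSphinxCompatible are names chosen here for illustration and are not part of Sphinx, and the check for signed PCM encoding is an assumption about what "16-bit WAV" means in practice.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

class WavFormatCheck {

    // True when the format is 16 kHz, 16-bit, mono, signed PCM
    // (the signed-PCM requirement is an assumption of this sketch).
    static boolean isSphinxCompatible(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleRate() == 16000f
                && format.getSampleSizeInBits() == 16
                && format.getChannels() == 1;
    }

    public static void main(String[] args) throws IOException, UnsupportedAudioFileException {
        if (args.length != 1) {
            System.err.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        // Reads only the file header to find out how the audio is stored.
        AudioFormat format = AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        System.out.println("detected format: " + format);
        System.out.println(isSphinxCompatible(format)
                ? "matches 16 kHz / 16-bit / mono"
                : "does not match; the file needs to be converted first");
    }
}
```

Note that a file resampled from an 8 kHz recording will pass this header check while still carrying only 8 kHz of audio bandwidth, so a passing result does not by itself guarantee good recognition.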
Thanks for the reply, @Nickolay V.Shmyrev.
Even after converting the video to the given format it is still not working: it prints something else, and most of the recognized words are not in the audio file. At the same time, it works fine for the video provided by the CMU Sphinx tutorial. Where is the problem? Can you please let me know?
Last edit: anvita tiwari 2015-08-21
You can share your audio file to get help.
Hi Nickolay V.Shmyrev,
sorry for the late reply.
Please find the attachment and let me know the solution.
Your audio has 8 kHz bandwidth and also contains Indian-accented English, which we do not support. You need to train your own acoustic model to decode it:
http://cmusphinx.sourceforge.net/wiki/tutorialam
Dear Nickolay sir, may I ask one question? The audio file is 9 MB in size. Is Indian English the only reason it is not working, or is it the size of the file, or both (Indian English and the size of the file)?
Thanks for the reply.
I went through the link you provided, but since I am new to CMU Sphinx I could not understand it clearly. I gathered that we have to build a database and also divide our video into 20 segments, which are listed in a transcription file and a file-ID list known as arctic20.transcription and arctic20.fileids.
Can you please let me know how to do that, or at least how to start? I do not need the full procedure, just a hint, since I am new to this and confused.
please reply
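As a starting hint for the two control files mentioned above: as far as I recall from the CMU Sphinx tutorials, the file-ID list holds one utterance name per line (without the .wav extension), and the transcription file holds one line per utterance of the form `<s> words </s> (utterance_name)`. The sketch below builds both lists in memory. TrainingListsSketch, the utt_0001 style names and the sample sentences are hypothetical, and the exact format should be checked against the tutorial before training.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TrainingListsSketch {

    // One utterance name per line, in recording order.
    static List<String> fileIds(Map<String, String> utterances) {
        return new ArrayList<>(utterances.keySet());
    }

    // One "<s> text </s> (name)" line per utterance, text lower-cased.
    static List<String> transcriptions(Map<String, String> utterances) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : utterances.entrySet()) {
            String text = e.getValue().trim().toLowerCase(Locale.ROOT);
            lines.add("<s> " + text + " </s> (" + e.getKey() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical segments: each key is the name of a short WAV clip,
        // each value is what is actually spoken in that clip.
        Map<String, String> utterances = new LinkedHashMap<>();
        utterances.put("utt_0001", "Good morning everyone");
        utterances.put("utt_0002", "Welcome to the meeting");

        System.out.println("--- fileids ---");
        fileIds(utterances).forEach(System.out::println);
        System.out.println("--- transcription ---");
        transcriptions(utterances).forEach(System.out::println);
    }
}
```

Each segment would be a separate short WAV file cut from the long recording, and every word used in the transcriptions also has to be present in the pronunciation dictionary.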
Did you get an answer to your query?