audio to text file conversion

2015-08-21
2017-10-20
  • anvita tiwari

    anvita tiwari - 2015-08-21

    Hello, I want to convert an audio file to a text file. When I print the contents of the stream, it is displayed, but the recognizer output (result.getHypothesis()) prints nothing, and I cannot work out where the problem is. Can anyone help me solve this?

    here is my code:
    package com.dsquare;

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.PrintStream;
    import java.io.StringWriter;
    import java.net.URL;
    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.api.LiveSpeechRecognizer;
    import edu.cmu.sphinx.api.SpeechResult;
    import edu.cmu.sphinx.api.StreamSpeechRecognizer;
    import edu.cmu.sphinx.result.WordResult;

    import java.text.ParseException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.json.JSONObject;

    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.BufferedReader;

    import com.amazonaws.services.datapipeline.model.Field;
    import com.dsquare.sbs.AuthenticationModule;
    import com.dsquare.sbs.DSQ_Constants;
    import com.dsquare.sbs.DSQ_SBSSystem;
    import com.dsquare.sbs.data.connector.DSQ_ConnectionFactory;
    import com.dsquare.sbs.data.connector.DSQ_GenericConfigConnector;
    import com.dsquare.sbs.socialMedia.webContent.DSQ_HTMLParser;
    import com.dsquare.sbs.socialMedia.youtube.DSQ_YouTubeAPI;
    //import com.dsquare.sbs.socialMedia.webContent.DSQ_GoogleAnalytics;
    import com.dsquare.sbs.utils.DSQUtils;
    import com.google.api.client.util.IOUtils;
    import com.restfb.*;

    public class AppRunner
    {
        public static void main(String[] args) throws ParseException, IllegalArgumentException, IllegalAccessException
        {
            DSQ_ConnectionFactory.initialize();
            DSQUtils.createSchemaMapInMemory();
            DSQ_GenericConfigConnector BSC = DSQ_ConnectionFactory.getConfigConnector();
            Configuration configuration = new Configuration();
            configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
            configuration.setDictionaryPath("file:E:/anvita_work/cmudict-en-us.dict");
            configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

            StreamSpeechRecognizer recognizer;
            try
            {
                recognizer = new StreamSpeechRecognizer(configuration);
                InputStream stream = AppRunner.class
                        .getResourceAsStream("/com/dsquare/Arabtec  Construction (INDIA) Private Limited.wav");
                String resultone = getStringFromInputStream(stream); /* value of stream is getting display*/
                System.out.println("*******************");
                System.out.println(resultone);
                System.out.println("*******************");
                stream.skip(44);
                // Simple recognition with generic model
                recognizer.startRecognition(stream);
                SpeechResult result;
                while ((result = recognizer.getResult()) != null)
                {
                    /* value of result.getHypothesis() not displaying anthing*/
                    System.out.println("List of recognized words and their times:" + result.getHypothesis());
                    for (WordResult r : result.getWords())
                    {
                        System.out.println(r);
                    }
                }
            }
            catch (IOException e)
            {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }

        private static String getStringFromInputStream(InputStream is)
        {
            BufferedReader br = null;
            StringBuilder sb = new StringBuilder();
            String line;
            try
            {
                br = new BufferedReader(new InputStreamReader(is));
                while ((line = br.readLine()) != null)
                {
                    sb.append(line);
                }
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
            finally
            {
                if (br != null)
                {
                    try
                    {
                        br.close();
                    }
                    catch (IOException e)
                    {
                        e.printStackTrace();
                    }
                }
            }
            return sb.toString();
        }
    }

    please reply
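
One thing worth checking in the program above, independent of the audio format: getStringFromInputStream reads the stream to its end and then closes it through br.close(), and the same stream object is afterwards handed to recognizer.startRecognition(stream). The sketch below uses only java.io (no Sphinx) to show that a stream which has already been read to the end yields nothing on a second pass. The class name StreamReuseDemo, the helper drain, and the 1000-byte placeholder array are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamReuseDemo {

    // Reads every remaining byte and returns how many bytes were read.
    static int drain(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder for the WAV bytes; real audio would come from a file.
        byte[] fakeAudio = new byte[1000];

        InputStream stream = new ByteArrayInputStream(fakeAudio);
        System.out.println("first pass:   " + drain(stream)); // reads all bytes
        System.out.println("second pass:  " + drain(stream)); // nothing left

        // A freshly opened stream gives the next consumer the full data again.
        InputStream fresh = new ByteArrayInputStream(fakeAudio);
        System.out.println("fresh stream: " + drain(fresh));
    }
}
```

If this is what happens in the posted program, one possible remedy would be to call AppRunner.class.getResourceAsStream(...) a second time just before startRecognition, or to drop the debug printing of the raw stream altogether.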

     
    • Nickolay V. Shmyrev

      Most likely your input file has the wrong format. It must be a 16 kHz, 16-bit, mono WAV file.
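
A quick way to check a file against this requirement is to read its header with the standard javax.sound.sampled API. The following is a minimal sketch, not part of Sphinx: the class name WavFormatCheck and the helper isSphinxFriendly are hypothetical, and the extra conditions (signed PCM, little-endian) are an assumption about what a typical WAV file for recognition looks like.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

class WavFormatCheck {

    // True if the declared format is 16 kHz, 16-bit, mono.
    // Signed little-endian PCM is an added assumption (the usual WAV layout).
    static boolean isSphinxFriendly(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000.0f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException {
        if (args.length != 1) {
            System.out.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        // Reads only the header; the audio data itself is not decoded.
        AudioFormat format = AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        System.out.println(format);
        System.out.println(isSphinxFriendly(format)
                ? "declared format looks fine"
                : "declared format needs conversion");
    }
}
```

A header check like this only inspects the declared format. It cannot tell whether the audio was upsampled from a lower-bandwidth recording, so a file can pass it and still decode poorly.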

       
  • anvita tiwari

    anvita tiwari - 2015-08-21

    Thanks for the reply, @Nickolay V. Shmyrev.
    Even after converting the video to the given format, it still does not work. It prints something else, and most of the recognized words are not in the audio file. At the same time, it works fine for the sample provided by the CMU Sphinx tutorial. Where is the problem? Could you please let me know?

     

    Last edit: anvita tiwari 2015-08-21
    • Nickolay V. Shmyrev

      You can share your audio file to get help.

       
      • anvita tiwari

        anvita tiwari - 2015-08-24

        Hi Nickolay V. Shmyrev,
        sorry for the late reply.
        Please find the attachment and let me know the solution.

         
        • Nickolay V. Shmyrev

          Your audio has 8 kHz bandwidth and also contains Indian-accented English, which we do not support. You need to train your own acoustic model to decode it:

          http://cmusphinx.sourceforge.net/wiki/tutorialam

           
          • sreenish aj

            sreenish aj - 2017-10-20

            Dear Nickolay sir, may I ask one question? The audio file is 9 MB in size. Is it failing only because of the Indian English, because of the file size, or for both reasons?

             
  • anvita tiwari

    anvita tiwari - 2015-08-24

    Thanks for the reply.
    I went through the link you provided, but since I am new to CMU Sphinx I could not understand it clearly. I gathered that we have to build a database and divide our video into 20 segments, which are then listed in a transcription file, arctic20.transcription, and in arctic20.fileids.
    Could you please tell me how to get started? Not the full procedure, just a hint; since I am new to this, I am confused.
    Please reply.
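
As a rough starting point for the control files: the CMU Sphinx tutorials use a file-ID list with one recording name per line (no extension) and a transcription file with one line per recording in the form `<s> text </s> (file_id)`. The sketch below only builds those two texts from an in-memory map. The class name TrainingLists, the IDs clip_0001 and clip_0002, and the sentences are hypothetical, and the tutorial itself should be checked for details such as letter case matching the dictionary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TrainingLists {

    // One file ID per line, in insertion order.
    static String fileIds(Map<String, String> utterances) {
        StringBuilder sb = new StringBuilder();
        for (String id : utterances.keySet()) {
            sb.append(id).append('\n');
        }
        return sb.toString();
    }

    // One "<s> text </s> (file_id)" line per utterance, in the same order.
    static String transcription(Map<String, String> utterances) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : utterances.entrySet()) {
            sb.append("<s> ").append(e.getValue()).append(" </s> (")
              .append(e.getKey()).append(")\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical segments: file ID mapped to what is said in that clip.
        Map<String, String> utterances = new LinkedHashMap<>();
        utterances.put("clip_0001", "hello world");
        utterances.put("clip_0002", "good morning");

        System.out.print(fileIds(utterances));
        System.out.print(transcription(utterances));
    }
}
```

Each ID would correspond to one short audio segment cut from the longer recording, with the text typed out by hand for that segment.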

     
  • sreenish aj

    sreenish aj - 2017-10-20

    Did you get a solution to your query?

     
