Transcribe .mp3 files with Transcriber.java

2011-10-31
2012-09-21
  • Gorka Perez Sánchez

    Hello, I have tested the Transcriber.java file and tried to transcribe
    an .mp3 file. When the code:

    URL configURL = Transcriber.class.getResource("config.xml");

    ConfigurationManager cm = new ConfigurationManager(configURL);

    Recognizer recognizer = (Recognizer) cm.lookup("recognizer");

    AudioFileDataSource dataSource = (AudioFileDataSource) cm.lookup("audioFileDataSource");

    dataSource.setAudioFile(audioURL, null);

    is executed, I get the following exception at the call
    dataSource.setAudioFile(audioURL, null):

    Audio file format not supported:
    javax.sound.sampled.UnsupportedAudioFileException: could not get audio input
    stream from input URL
    javax.sound.sampled.UnsupportedAudioFileException: could not get audio input
    stream from input URL
    at javax.sound.sampled.AudioSystem.getAudioInputStream(Unknown Source)
    at edu.cmu.sphinx.frontend.util.AudioFileDataSource.setAudioFile(AudioFileDataSource.java:150)
    at edu.cmu.sphinx.demo.transcriber.Transcriber.main(Transcriber.java:56)
    Exception in thread "main" java.lang.NullPointerException
    at edu.cmu.sphinx.frontend.util.AudioFileDataSource.setInputStream(AudioFileDataSource.java:178)
    at edu.cmu.sphinx.frontend.util.AudioFileDataSource.setAudioFile(AudioFileDataSource.java:162)
    at edu.cmu.sphinx.demo.transcriber.Transcriber.main(Transcriber.java:56)

    How can I configure it to transcribe .mp3 files (via the config.xml file or
    other parameters in the project)?
    Thanks
    Gorka

     
  • eliasmajic

    eliasmajic - 2011-10-31

    Mp3 is compressed data. You need to convert it to a .wav file first.

    You can use ffmpeg to do this.
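
A minimal sketch of that conversion driven from Java, assuming ffmpeg is installed and on the PATH. The class name `Mp3ToWav` is hypothetical, and the 16 kHz mono target is an assumption that matches many Sphinx acoustic models; adjust `-ar` and `-ac` to whatever your model expects.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Sketch: convert an MP3 to a PCM WAV file by invoking the ffmpeg command-line tool. */
final class Mp3ToWav {

    /** Builds the ffmpeg argument list; kept separate so it can be inspected without running ffmpeg. */
    static List<String> ffmpegCommand(String mp3Path, String wavPath) {
        return Arrays.asList(
                "ffmpeg",
                "-y",            // overwrite the output file if it already exists
                "-i", mp3Path,   // input file
                "-ar", "16000",  // assumed target sample rate: 16 kHz
                "-ac", "1",      // assumed target channel count: mono
                wavPath);        // container is inferred from the .wav extension
    }

    /** Runs ffmpeg and returns its exit code (0 on success). */
    static int convert(String mp3Path, String wavPath) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(ffmpegCommand(mp3Path, wavPath))
                .inheritIO()     // show ffmpeg's own log output on the console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: Mp3ToWav <input.mp3> <output.wav>");
            return;
        }
        System.exit(convert(args[0], args[1]));
    }
}
```

The resulting .wav file can then be passed to dataSource.setAudioFile() as in the original Transcriber.java.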

     
  • Nickolay V. Shmyrev

    Or you can use Tritonus to add MP3 support to Java Sound:

    http://www.tritonus.org/
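
One way to check whether an MP3 provider is actually being picked up is to ask Java Sound to open the file and see whether it throws the same UnsupportedAudioFileException as in the stack trace above. This is only a sketch; the class name `AudioSupportCheck` is hypothetical.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

/** Sketch: probe whether the installed Java Sound providers can read a given file. */
final class AudioSupportCheck {

    /** Returns true if some installed audio file reader recognises the file. */
    static boolean isReadable(File file) throws IOException {
        try {
            // Succeeds only if a registered reader understands the file's format.
            AudioSystem.getAudioInputStream(file).close();
            return true;
        } catch (UnsupportedAudioFileException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            String verdict = isReadable(new File(path)) ? "readable" : "not supported";
            System.out.println(path + ": " + verdict);
        }
    }
}
```

If an .mp3 file is reported as not supported, the provider jars are probably not on the classpath that the JVM is actually using.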

     
  • Anonymous

    Anonymous - 2011-12-09

    In my case, I added tritonus_mp3-0.3.6.jar and
    tritonus_share-0.3.6.jar from
    http://www.tritonus.org/ to my Java lib/ext
    folder. This should add MP3 decoding to Java Sound, and so to Sphinx. But the above
    (Transcriber.java) code will not work as is, because
    AudioFileDataSource.setAudioFile() does not handle MP3 input. To get around that,
    either modify AudioFileDataSource.setAudioFile() or do not call it for MP3 files.
    Here is how I changed Aligner.java to allow MP3:

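    // JDK imports used below; the Sphinx imports are the same as in the stock Aligner.java
    import java.io.IOException;
    import java.net.URL;
    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.UnsupportedAudioFileException;
    
    public class Aligner {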
    
        public static void main(String[] args) throws IOException, UnsupportedAudioFileException {
    
            if(args.length<2){
                System.out.println("Error: two parameters required <audio-file> <script-text>. Stopping further processing");
                return;
            }
    
            ConfigurationManager cm = new ConfigurationManager("config/aligner.xml");
            Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
    
            TextAlignerGrammar grammar = (TextAlignerGrammar) cm.lookup("textAlignGrammar");
            grammar.setText(args[1]);
    
            recognizer.addResultListener(grammar);
    
            /* allocate the resource necessary for the recognizer */
            recognizer.allocate();
    
            // configure the audio input for the recognizer
            URL audioFileURL = new URL("file:" + args[0]);
            AudioFileDataSource dataSource = (AudioFileDataSource) cm.lookup("audioFileDataSource");
    
            if(args[0].endsWith(".mp3") || args[0].endsWith(".MP3")){// MP3 input: decode to PCM ourselves
                // Needs an MP3 provider (e.g. the Tritonus jars) visible to Java Sound.
                // Errors propagate through main's throws clause, so a failed decode
                // can no longer hand a null stream to setInputStream().
                AudioInputStream audioStream = AudioSystem.getAudioInputStream(audioFileURL);
                AudioFormat baseFormat = audioStream.getFormat();
                // 16-bit signed little-endian PCM, same sample rate and channels as the source
                AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                        baseFormat.getSampleRate(),
                        16,
                        baseFormat.getChannels(),
                        baseFormat.getChannels() * 2, // frame size in bytes: 2 bytes per channel
                        baseFormat.getSampleRate(),
                        false);
                AudioInputStream decodedAudioStream = AudioSystem.getAudioInputStream(decodedFormat, audioStream);
                dataSource.setInputStream(decodedAudioStream, null);
            }else{
                dataSource.setAudioFile( audioFileURL, null);//it does everything it needs to for the supported audio formats
            }
    
            Result result;
            while ((result = recognizer.recognize()) != null) {
    
                String resultText = result.getTimedBestResult(false, true);
                System.out.println(resultText);
            }
        }
    }
    

    You may need to compare with the stock Aligner.java that comes with Sphinx to
    see my changes.

    I hope it helps

    Pannu

     
