
NullPointer Exception (at AbstractScorer)

Help
2009-05-02
2012-09-22
  • Kelly Anderson

    Kelly Anderson - 2009-05-02

    Hello, I'm working on my graduation project and an important part of it is to transcribe documentaries (avi). I'm using Sphinx4 beta2 and the HUB4 acoustic and language models.

    I took an AVI documentary to test things, extracted a 3-second clip that had speech (with background music), and converted it to 16 kHz mono WAV. Now when I try that file on Sphinx, here's what I get:

    Exception in thread "main" java.lang.NullPointerException
    at edu.cmu.sphinx.decoder.scorer.AbstractScorer.startRecognition(AbstractScorer.java:116)
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.startRecognition(WordPruningBreadthFirstSearchManager.java:234)
    at edu.cmu.sphinx.decoder.Decoder.decode(Decoder.java:44)
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:98)
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:114)
    at wavfile.WavFile.main(WavFile.java:55)

    Can someone explain to me the meaning behind this error and what I should do to fix it?

    Thanks.
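A quick way to confirm that a converted file really has the format described in the post above is to read its header with the standard `javax.sound.sampled` API. This is only a minimal sketch: the class name `WavFormatCheck` and the helper `isExpectedFormat` are made up for illustration, and the 16-bit signed PCM requirement is an assumption that the thread itself does not state (it only mentions 16 kHz mono).

```java
import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper, not part of Sphinx4.
class WavFormatCheck {

    // Returns true for 16 kHz, mono, 16-bit signed PCM.
    // The 16-bit signed PCM part is an assumption; the thread only says "16 kHz mono".
    public static boolean isExpectedFormat(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && Math.round(format.getSampleRate()) == 16000
                && format.getSampleSizeInBits() == 16
                && format.getChannels() == 1;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        // Reads only the file header; the audio data itself is not decoded.
        AudioFormat format = AudioSystem.getAudioFileFormat(new File(args[0])).getFormat();
        System.out.println(format);
        System.out.println(isExpectedFormat(format) ? "format looks fine" : "needs conversion");
    }
}
```

A matching header does not rule out a problem in the frontend configuration, but it removes the audio format as a suspect.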

     
    • Nickolay V. Shmyrev

      The issue is most likely that you are using something like BatchCMN without NonSpeechDataFilter. Please try to reproduce the frontend pipeline from the transcriber demo.

       
    • Nickolay V. Shmyrev

      Also, just post your files and I'll take a look.

       
    • Kelly Anderson

      Kelly Anderson - 2009-05-02

      Aha, it worked. I must've accidentally removed it or something.

      But now, I have a different problem...the transcription seems to stop and does not continue till the end of the wav file. I thought it had something to do with the values set by the SpeechMarker, but changing them achieved nothing since the person speaking didn't actually pause or anything.

      Here are my files:

      http://rapidshare.com/files/228340908/Files.rar.html

       
      • Nickolay V. Shmyrev

        You just need to invoke Recognizer.recognize in a loop until the result is null to get the transcription for all chunks. See the Transcriber.java demo for an example.

            // Loop until the last utterance in the audio file has been decoded, in which case the recognizer will return null.
            Result result;
            while ((result = recognizer.recognize()) != null) {
                String resultText = result.getBestResultNoFiller();
                System.out.println(resultText);
            }
        
         
    • Kelly Anderson

      Kelly Anderson - 2009-05-02

      It did recognize a few more words, but still most of the words did not get recognized at all, not even incorrectly. Is there something wrong with my wav files maybe?

       
      • Nickolay V. Shmyrev

        Well, files with music require special treatment. Can you please try clean recordings first?

        On which particular sample does it fail? Elephant, national, or something else? Could you please provide a ready-to-run example that clearly reproduces the problem, not just a collection of the files you are using?

         
    • Kelly Anderson

      Kelly Anderson - 2009-05-02

      I also got this:

      WARNING threadedScorer Not enough data in frontend to start recognition

      Though there clearly was data still left to be recognized.

       
      • Nickolay V. Shmyrev

        Don't worry about this warning; the bug was fixed in svn trunk.

         
    • Kelly Anderson

      Kelly Anderson - 2009-05-02

      Clean recordings work pretty well.

      All three samples fail. What can I do to improve recognition with minimal background music on?

      Sorry, I don't understand what you mean by a ready-to-run example. Do you need the WavFile.jar file? I'll try to get it, since I'm working in NetBeans and don't know where (or if) it builds the jar files.

       
      • Nickolay V. Shmyrev

        > Clean recordings work pretty well.

        Hm, then music is indeed a huge problem. To be honest, there is no ready-to-use recipe to handle that.

        Btw, are you converting the audio from stereo mp3 files? It should be easier to clean up the music in them, although it will require some coding.

         
        • Kelly Anderson

          Kelly Anderson - 2009-05-03

          No, I directly extract wav files from avi videos using ffmpeg. Do you have any suggestions for what I could do to improve those wav files and improve the accuracy?

           
          • Nickolay V. Shmyrev

            I just meant that if the files are stereo, it's possible to build noise cancellation that will effectively reduce the music. With mono files it's much more complicated.

            I probably need some time to search for the music cancellation code. Also, can you please try the wsj model instead of hub4? In theory it should be more resistant to noise.
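To give a rough idea of why stereo input helps, here is a minimal mid/side split over interleaved 16-bit stereo samples. This is not the music cancellation code mentioned above, just an illustration. It rests on an assumption that may not hold for a given documentary: that the narration is panned to the center (identical in both channels) while the music is spread across the stereo field. The class and method names are invented for the example.

```java
import java.util.Arrays;

// Hypothetical illustration, not part of Sphinx4.
class MidSideSplit {

    // "Mid" signal (L + R) / 2: content identical in both channels is kept at full level.
    // Input is interleaved stereo: L0, R0, L1, R1, ...
    public static short[] mid(short[] interleaved) {
        short[] out = new short[interleaved.length / 2];
        for (int i = 0; i < out.length; i++) {
            int left = interleaved[2 * i];
            int right = interleaved[2 * i + 1];
            out[i] = (short) ((left + right) / 2);
        }
        return out;
    }

    // "Side" signal (L - R) / 2: content identical in both channels cancels out.
    public static short[] side(short[] interleaved) {
        short[] out = new short[interleaved.length / 2];
        for (int i = 0; i < out.length; i++) {
            int left = interleaved[2 * i];
            int right = interleaved[2 * i + 1];
            out[i] = (short) ((left - right) / 2);
        }
        return out;
    }

    public static void main(String[] args) {
        // Three stereo frames: identical channels, opposite channels, unequal channels.
        short[] stereo = {100, 100, 200, -200, 50, 150};
        System.out.println(Arrays.toString(mid(stereo)));  // [100, 0, 100]
        System.out.println(Arrays.toString(side(stereo))); // [0, 200, -50]
    }
}
```

Under the stated assumption, the mid signal is the better mono candidate for the recognizer, and the side signal gives an estimate of what is not center-panned. Real source separation is considerably more involved than this.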

             
            • Kelly Anderson

              Kelly Anderson - 2009-05-03

              Yes, they are stereo files, but I converted them to mono. I also canceled out one of the channels, and the results seemed to be a bit better actually.

              I will try the wsj, and hopefully it'll work out.

               
              • Nickolay V. Shmyrev

                There are advanced techniques for source separation from stereo recordings; they can help too. Otherwise it will be quite hard to get acceptable performance.

                 
