Problem with trained acoustic model

Help
Rajani
2014-07-18
2014-07-22
  • Rajani

    Rajani - 2014-07-18

    Hi,

    I have trained a new acoustic model using sphinxtrain. The database used for training is very small: the vocabulary has only 17 words (including fillers), and the total duration of speech is only 16 seconds.
    For this I set the senone count arbitrarily (tried 100, 200, 300, and 500), as I have no clear idea of the senone concept.

    When I use the acoustic model trained with the above configuration, I get runtime exceptions. The error details are:

    java.lang.ArrayIndexOutOfBoundsException: 39
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.MixtureComponent.getScore(MixtureComponent.java:195)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.GaussianMixture.calculateScore(GaussianMixture.java:130)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.ScoreCachingSenone.getScore(ScoreCachingSenone.java:40)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.SenoneHMMState.getScore(SenoneHMMState.java:85)
    at edu.cmu.sphinx.linguist.flat.HMMStateState.getScore(HMMStateState.java:85)
    at edu.cmu.sphinx.decoder.search.Token.calculateScore(Token.java:177)
    at edu.cmu.sphinx.decoder.scorer.SimpleAcousticScorer.doScoring(SimpleAcousticScorer.java:164)
    at edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer.doScoring(ThreadedAcousticScorer.java:198)
    at edu.cmu.sphinx.decoder.scorer.SimpleAcousticScorer.calculateScores(SimpleAcousticScorer.java:87)
    at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.scoreTokens(SimpleBreadthFirstSearchManager.java:363)
    at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.recognize(SimpleBreadthFirstSearchManager.java:293)
    at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.recognize(SimpleBreadthFirstSearchManager.java:225)
    at edu.cmu.sphinx.decoder.Decoder.decode(Decoder.java:65)
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:110)
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:126)
    at voicecommand.ListenerThread.run(ListenerThread.java:64)
    at java.lang.Thread.run(Thread.java:722)

    *The same set of errors appears 17 times in a row.
    Please help me.

     
    • Nickolay V. Shmyrev

      You need to provide the acoustic model training folder in order to get help with this issue.

      You also need to say which sphinx4 version you are using and how exactly you used it.

      You can pack the files into a single archive and share them through Dropbox.

       
  • Rajani

    Rajani - 2014-07-18

    I am using Sphinx4-1.0beta6.

    I have shared the files; please find them here.

    https://drive.google.com/file/d/0B25RAqomLW2nOVlkUlhJUk95QTA/edit?usp=sharing

     
    • Nickolay V. Shmyrev

      Your files cannot be accessed; they require permission.

       
  • Rajani

    Rajani - 2014-07-21
     
    • Nickolay V. Shmyrev

      You need to provide the whole folder, including logs, trained models, and
      decoding results, not just the input files.

      You also need to explain how exactly you used sphinx4.

       
  • Rajani

    Rajani - 2014-07-21

    Yes I did that. Here is the link

    https://drive.google.com/file/d/0B25RAqomLW2nTGJDN0pEWGRnb2s/edit?usp=sharing

    I am using the sphinx4 jar together with an acoustic model jar in my application. Sphinx4 works for me when I use it with the default acoustic model provided by CMU Sphinx, i.e. WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.

     
    • Nickolay V. Shmyrev

      You need to explain how exactly you used sphinx4: what changes you made in the code, which demo you used, and how you modified it.

       
  • Rajani

    Rajani - 2014-07-21

    In my application I'm using a grammar file and a wav file as the audio source.

    It can be described as a mixture of 'HelloWorld' and 'LatticeDemo': I combined code from both apps in order to use a grammar file and a wav file in a single app.

     
    • Nickolay V. Shmyrev

      You need to share your modified code.

       
  • Rajani

    Rajani - 2014-07-21

    When I tested LatticeDemo and HelloWorld separately, both worked and I did not find any errors, so I think something is wrong with my own code.

    Thank you Nickolay.

    Now the problem is: when I run with LatticeDemo (i.e. using an audio file) it gives me the result text, but if I speak through the microphone (i.e. using the HelloWorld app) it does not give the text back. Why is that?

    What is the best senone count for training an acoustic model from my database?

     
    • Nickolay V. Shmyrev

      The problem with your current code is that you have incorrectly modified the config file, in particular the frontend component:

      ~~~~~~~
      <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
          <propertylist name="pipeline">
              <item>microphone</item>
              <item>dataBlocker</item>
              <item>speechClassifier</item>
              <item>speechMarker</item>
              <item>nonSpeechDataFilter</item>
              <item>preemphasizer</item>
              <item>windower</item>
              <item>fft</item>
              <item>melFilterBank</item>
              <item>dct</item>
              <item>liveCMN</item>
              <item>featureExtraction</item>
              <item>audioFileDataSource</item>
          </propertylist>
      </component>
      ~~~~~~~

      The order of the elements is important: you shouldn't put audioFileDataSource in the last position. Instead, you should replace the microphone component with audioFileDataSource.
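
      For clarity, the corrected frontend would look roughly like this (a sketch: only the first pipeline item changes, the rest of the list keeps its original order):

      ~~~~~~~
      <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
          <propertylist name="pipeline">
              <item>audioFileDataSource</item>
              <item>dataBlocker</item>
              <item>speechClassifier</item>
              <item>speechMarker</item>
              <item>nonSpeechDataFilter</item>
              <item>preemphasizer</item>
              <item>windower</item>
              <item>fft</item>
              <item>melFilterBank</item>
              <item>dct</item>
              <item>liveCMN</item>
              <item>featureExtraction</item>
          </propertylist>
      </component>
      ~~~~~~~

      The data source must come first because every later stage pulls its input from the element before it in the pipeline.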

      Anyway, if the configuration files are too complicated for you, you should use the latest sphinx4-5prealpha API, which doesn't require any config files and should work for you out of the box. Please see for details:

      http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4
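
      With that high-level API, grammar-based decoding of a wav file reduces to a few lines. A minimal sketch, assuming the bundled en-us model from the sphinx4-data jar; the grammar path, grammar name, and wav filename are placeholders:

      ~~~~~~~
      import java.io.FileInputStream;
      import java.io.InputStream;

      import edu.cmu.sphinx.api.Configuration;
      import edu.cmu.sphinx.api.SpeechResult;
      import edu.cmu.sphinx.api.StreamSpeechRecognizer;

      public class GrammarDecodeSketch {
          public static void main(String[] args) throws Exception {
              Configuration configuration = new Configuration();
              // Bundled US English model and dictionary from the sphinx4-data jar.
              configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
              configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
              // Use a JSGF grammar instead of a language model (placeholder path/name).
              configuration.setGrammarPath("resource:/grammars");
              configuration.setGrammarName("commands");
              configuration.setUseGrammar(true);

              StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
              try (InputStream stream = new FileInputStream("test.wav")) {
                  recognizer.startRecognition(stream);
                  SpeechResult result;
                  while ((result = recognizer.getResult()) != null) {
                      System.out.println(result.getHypothesis());
                  }
                  recognizer.stopRecognition();
              }
          }
      }
      ~~~~~~~

      For microphone input the same Configuration works with LiveSpeechRecognizer instead of StreamSpeechRecognizer.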

       
  • Rajani

    Rajani - 2014-07-22

    Thank you so much Nickolay.

    Now the problem is that I am not getting any result text (i.e. recognized text) with the acoustic model I built myself. Is it because of the small amount of data used for training the acoustic model, or because of a wrong configuration (like the number of tied states or densities) that I made during the training process?

     
    • Nickolay V. Shmyrev

      > Is it because of the small amount of data used for training the acoustic model?

      Yes

       
  • Rajani

    Rajani - 2014-07-22

    What would be the minimum amount of data (in hours) required to build a good acoustic model?

    Is there any problem with giving arbitrary numbers for the tied-state and density counts? If so, how can I choose proper numbers?

     
    • Nickolay V. Shmyrev

      You can find the answer to both questions in the acoustic model training tutorial:

      http://cmusphinx.sourceforge.net/wiki/tutorialam

       
