Tuning Speech Recognition getting negative accuracy

sumitraj
2016-05-31
2018-10-26
  • sumitraj

    sumitraj - 2016-05-31

    I have a language model created with the CMU toolkit (closed vocabulary, Good-Turing discounting), a dictionary of around 3000 words created with the LOGIOS tool, and a 7-minute test WAV file. When I tune accuracy I always get a negative accuracy. Any help would be great, as I'm a rookie in ASR. Thank you.

    pocketsphinx_batch -adcin yes -cepdir wav -cepext .wav -ctl test.fileids -lm Domain.lm -dict Domain.dict -hmm wsj -hyp test.hyp
    
    word_align.pl Domain.transcription test.hyp
    

    TOTAL Words: 367 Correct: 218 Errors: 3578
    TOTAL Percent correct = 59.40% Error = 974.93% Accuracy = -874.93%
    TOTAL Insertions: 3429 Deletions: 0 Substitutions: 149
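
The totals above are consistent with accuracy being computed as 100% minus the error rate, where insertions count as errors but the denominator is the number of reference words. A minimal Python sketch, assuming that formula (the function name `score` is illustrative and not part of word_align.pl), reproduces the reported figures:

```python
def score(total_words, insertions, deletions, substitutions):
    """Return (percent_correct, error_rate, accuracy), all in percent.

    Assumes error rate = (insertions + deletions + substitutions) / reference words.
    """
    correct = total_words - deletions - substitutions
    errors = insertions + deletions + substitutions
    percent_correct = 100.0 * correct / total_words
    error_rate = 100.0 * errors / total_words
    accuracy = 100.0 - error_rate  # goes negative once errors exceed reference words
    return percent_correct, error_rate, accuracy


pc, err, acc = score(367, insertions=3429, deletions=0, substitutions=149)
print(f"Percent correct = {pc:.2f}% Error = {err:.2f}% Accuracy = {acc:.2f}%")
# prints: Percent correct = 59.40% Error = 974.93% Accuracy = -874.93%
```

With 3429 insertions against 367 reference words, insertions alone push the error rate above 900%, so the negative accuracy suggests the hypothesis contains far more words than the reference it is aligned against.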

     

    Last edit: sumitraj 2016-06-01
  • Nickolay V. Shmyrev

    You can provide your test data and models to get help on this issue.

     
    • sumitraj

      sumitraj - 2016-06-01

      Yes, sure, I will provide them, but a few clarifications first: should the language model be given in binary or ARPA format as input? And is 5 minutes of .wav audio enough for checking accuracy? I'm using the cmusphinx-en-us-5.2 acoustic model.

       

      Last edit: sumitraj 2016-06-01
      • Nickolay V. Shmyrev

        Should the language model be given in binary or ARPA format as input?

        It does not matter; both formats are supported.

        And is 5 minutes of .wav audio enough for checking accuracy?

        Yes

         
  • sumitraj

    sumitraj - 2016-06-01

    This drive folder has the LM model, .dict, .wav, and transcription: https://drive.google.com/folderview?id=0BxkEp3G7x36dZ2pBUFZpejdpNFU&usp=sharing Please check it; if anything else is needed, I can provide it. Thanks.

     
    • Nickolay V. Shmyrev

      For accuracy testing you need to split the large file into individual utterances, as recommended by the tutorial:

      http://cmusphinx.sourceforge.net/wiki/tutorialtuning
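
One way to do the split is sketched below in Python with the standard `wave` module. It assumes the utterance boundaries are already known as (start, end) times in seconds; how to find them (by hand in an audio editor, or with a silence detector) is not covered here. The function name `split_wav` and the output naming scheme are hypothetical.

```python
import wave


def split_wav(src_path, segments, out_prefix):
    """Write one WAV file per (start_sec, end_sec) segment and return the file IDs.

    Each file ID is the output path without the ".wav" extension.
    """
    file_ids = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for index, (start, end) in enumerate(segments, start=1):
            file_id = f"{out_prefix}_{index:04d}"
            # Seek to the first frame of the utterance and read its frames.
            src.setpos(int(start * rate))
            frames = src.readframes(int((end - start) * rate))
            with wave.open(file_id + ".wav", "wb") as dst:
                dst.setparams(params)  # same channels, sample width, and rate
                dst.writeframes(frames)  # header frame count is fixed up on close
            file_ids.append(file_id)
    return file_ids
```

The returned IDs can then be written one per line to the control file, with one matching line per utterance in the transcription file.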

      The volume of your recording is very low; you need to configure the microphone to record at a good volume.
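
A quick way to check the recording level is to measure the peak sample value relative to full scale. The Python sketch below assumes 16-bit PCM WAV input; the function name `peak_dbfs` is illustrative, and what counts as "a good volume" is a judgment call that this thread does not quantify.

```python
import array
import math
import sys
import wave


def peak_dbfs(path):
    """Return the peak level of a 16-bit PCM WAV file in dB relative to full scale."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(peak / 32768.0)
```

A value far below 0 dBFS across the whole file means the signal uses only a small part of the available range, which matches the "too quiet" diagnosis above.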

       
  • sumitraj

    sumitraj - 2016-06-01

    What accuracy did you get with the LM I shared in the link? Is it still negative in your case too?

     
    • Nickolay V. Shmyrev

      I didn't get any accuracy yet because you didn't prepare the database properly. You need to split it into utterances first.

       
  • sumitraj

    sumitraj - 2016-06-06

    TOTAL Words: 413 Correct: 338 Errors: 103
    TOTAL Percent correct = 81.84% Error = 24.94% Accuracy = 75.06%
    TOTAL Insertions: 28 Deletions: 4 Substitutions: 71

    This is my accuracy for the LM generated using lmtool. How good is this accuracy? And if it is low, what measures should be taken to increase it?
    TIA

     
    • Nickolay V. Shmyrev

      You need to share the data and models you used to get this output.

       
  • sumitraj

    sumitraj - 2016-06-07

    Hi, I'm sharing a Google Drive link with the acoustic model (cmu_dict en-us 16 kHz model), the language model generated using lmtool, .wav files with one utterance each, and the transcriptions of the respective .wav files. Please let me know how I can improve accuracy for the LM. Thanks.

    https://drive.google.com/folderview?id=0BxkEp3G7x36dZ2pBUFZpejdpNFU&usp=sharing

     

    Last edit: sumitraj 2016-06-07
    • Nickolay V. Shmyrev

      You can improve accuracy by training a better language model. Your current language model does not fully cover all the commands possible in your system.

      You can also improve the accuracy by adapting the acoustic model per our adaptation tutorial.

      You can greatly improve the accuracy by increasing the recording level of the speech you recorded. Currently it is too quiet.

      You can also improve accuracy by specifying a bigger language weight, since our US acoustic model is bad for Indian English. You can add something like -lw 12 -fwdflatlw 12 -bestpathlw 12 to the pocketsphinx arguments to get better results.
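
As a sketch of how those flags could be added to the batch command from the first post, the Python helper below builds the argument list and appends the three language weight options. The helper name `batch_command` is hypothetical, the file names are the ones used earlier in this thread, and 12 is only the example value suggested above, not a tuned setting.

```python
def batch_command(lw=None):
    """Build the pocketsphinx_batch argument list, optionally with a language weight."""
    cmd = [
        "pocketsphinx_batch",
        "-adcin", "yes",
        "-cepdir", "wav",
        "-cepext", ".wav",
        "-ctl", "test.fileids",
        "-lm", "Domain.lm",
        "-dict", "Domain.dict",
        "-hmm", "wsj",
        "-hyp", "test.hyp",
    ]
    if lw is not None:
        # Apply the same weight to all three flags named in the reply above.
        for flag in ("-lw", "-fwdflatlw", "-bestpathlw"):
            cmd += [flag, str(lw)]
    return cmd


print(" ".join(batch_command(12)))
```

The list can be passed to `subprocess.run(batch_command(12), check=True)` on a machine where pocketsphinx is installed, and re-scored with word_align.pl for a few different weights to see which one works best on the test set.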

       
  • sumitraj

    sumitraj - 2016-06-10

    Where can I make these configuration changes ("-lw 12 -fwdflatlw 12 -bestpathlw 12") in sphinx4-5 prealpha (Java version)?
    And thanks, Nickolay, for all your quick responses.

     
    • sumitraj

      sumitraj - 2016-06-13

      Any help on the above?

       
      • Nickolay V. Shmyrev

        Sphinx4 has a languageWeight parameter in default.config.xml; it is not available in the API.

         
  • sumitraj

    sumitraj - 2016-06-14

    Yes, I did see it in the config file, but you mentioned using the updated version. So is there now no option to configure the language weight?

     
  • Anjul Sharma

    Anjul Sharma - 2018-10-25

    Hi,
    I have a language model with 20 hours of Hindi data. I trained it with pocketsphinx, but I got -2.3% accuracy. I also tested it on the training data, but the accuracy is the same. I can't figure out what the problem is.
    I have checked the configuration file as well.
    There were no errors in the whole process.

     

    Last edit: Anjul Sharma 2018-10-25
    • Nickolay V. Shmyrev

      Cool, congratulations, that's some result. Let us know if you need any help.

       
      • Anjul Sharma

        Anjul Sharma - 2018-10-26

        Thanks Nickolay,
        Actually, I have tried this on another machine as well, but the accuracy is different.
        On my machine it's -2.3% and on the other one it's 76%; I don't know why this happens. The data size is good enough, but I'm not getting good accuracy.
        Training time is very short compared to decoding time, and the accuracy is also not as good as I expected.

        Problem:
        I think something is going wrong in the training process.

        I did the decoding manually using pocketsphinx_batch and word_align.pl.
        In the sphinxtrain run, decoding always gets stuck at 0%.

           MODULE: DECODE Decoding using models previously trained
          Decoding 150 segments starting at 0 (part 1 of 1)
           0%
        
         

        Last edit: Anjul Sharma 2018-10-26
