Bad recognition for digits

Help
2016-09-22
2016-09-30
  • Miguel Monzon

    Miguel Monzon - 2016-09-22

    Hello everyone, I am having trouble recognizing digits in my pocketsphinx app. I am trying to recognize speech that represents a telephone number, spoken digit by digit until the user stops speaking. The length of the telephone number may vary. For instance, the telephone number 1 212 664 444 is spoken as "one two one two six six four four four four" (all at once).

    My grammar is this:

    #JSGF V1.0;

    grammar call_number_digits_numeric;

    <call_number_digits_numeric> =
    0 |
    1 |
    2 |
    3 |
    4 |
    5 |
    6 |
    7 |
    8 |
    9
    ;
    public <r_call_number_digits_numeric> = <call_number_digits_numeric>+ ;

    I am using cmusphinx-en-us-ptm-5.2.tar.gz on Linux (Ubuntu 14.04.3 LTS).

    The problem is that the final recognition result contains digits that were never spoken. For example, if I say the telephone number 1 212 664 444, pocketsphinx recognizes this:
    21212166144441 (first attempt)
    9911212166441414 (second attempt)
    191212166541414141 (third attempt)
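
One way to quantify how far each attempt is from the spoken number is a character-level edit alignment. The following is only a sketch: `count_errors` is a hypothetical helper, not part of pocketsphinx, and the reference string is simply the telephone number from the post written without spaces.

```python
def count_errors(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for one minimal edit alignment."""
    rows, cols = len(reference) + 1, len(hypothesis) + 1
    # cost[i][j] is the edit distance between reference[:i] and hypothesis[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        cost[i][0] = i
    for j in range(cols):
        cost[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            diff = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + diff,  # match or substitution
                cost[i - 1][j] + 1,         # deletion (digit missing from hypothesis)
                cost[i][j - 1] + 1,         # insertion (extra digit in hypothesis)
            )

    # Walk back from the bottom-right corner and count each kind of error.
    subs = dels = ins = 0
    i, j = len(reference), len(hypothesis)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diff = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + diff:
                subs += diff
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins


if __name__ == "__main__":
    spoken = "1212664444"  # the number from the post, without spaces
    for attempt in ("21212166144441", "9911212166441414", "191212166541414141"):
        print(attempt, count_errors(spoken, attempt))
```

When several minimal alignments exist, the split between substitutions, deletions and insertions can differ, but the total number of errors stays the same.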

    Could you please help me find out what I am doing wrong? Is there a special configuration I need for digits? I am setting setKeywordThreshold(1e-45f); is this correct?

    Thanks a lot in advance for your help

     
    • Nickolay V. Shmyrev

      You need to give us the means to reproduce your problem: provide the code you are using, provide the data you are using, and describe exactly what you are running.

      It looks like the sample rate of your data does not match the required 16 kHz.
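
A quick way to check the sample rate, assuming the audio has been saved as an uncompressed WAV file, is Python's standard `wave` module. This is a minimal sketch; the function names are made up for illustration, and the expected rate is the 16 kHz mentioned above.

```python
import wave

EXPECTED_RATE_HZ = 16000  # the rate named in the reply above


def read_wav_format(path):
    """Read the basic format fields from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
        }


def matches_expected_rate(path, expected_rate=EXPECTED_RATE_HZ):
    """Return True when the file's sample rate equals the expected rate."""
    return read_wav_format(path)["sample_rate"] == expected_rate
```

For example, `matches_expected_rate("recording.wav")` (a hypothetical file name) returns False for a recording made at 8000 Hz or 44100 Hz. If the app reads from the microphone instead of a file, the rate has to be checked where the recorder is configured.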

       
  • Miguel Monzon

    Miguel Monzon - 2016-09-26

    Dear Nickolay, thanks for the comments. I got some improvement, but I still have some doubts. I'll post more info about my code. Thanks.

     
  • Miguel Monzon

    Miguel Monzon - 2016-09-29

    Hello Nickolay, this is what I have done. I downloaded the Android Demo App and tested its digit recognition, and it performs far better than my current app. Because of this, I copied all the acoustic model files from the Android Demo App into my current app, and my digit recognition results have improved by around 60%. What I want to do now is adapt the acoustic model, because my app is going to be used surrounded by highway noise. For this adaptation I am using the acoustic model files from the Android Demo App (feat.params, mdef, means, noisedict, sendump, transition_matrices and variances). The problem is that the file mixture_weights does not come with the Android Demo App, and this file is required for the adaptation. Where can I get this file? Thanks for the help.
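
Before starting an adaptation run, it may help to check which model files are actually present. The sketch below uses only the file names from the post above, so it is not an authoritative list of what the adaptation tools need, and `missing_files` is a hypothetical helper.

```python
from pathlib import Path

# Files the post reports finding in the Android Demo App model.
FILES_IN_DEMO_MODEL = (
    "feat.params",
    "mdef",
    "means",
    "noisedict",
    "sendump",
    "transition_matrices",
    "variances",
)

# The post says adaptation additionally needs mixture_weights.
FILES_FOR_ADAPTATION = FILES_IN_DEMO_MODEL + ("mixture_weights",)


def missing_files(model_dir, required=FILES_FOR_ADAPTATION):
    """Return the required file names that are absent from model_dir."""
    model_path = Path(model_dir)
    return [name for name in required if not (model_path / name).is_file()]
```

Calling `missing_files("path/to/model")` (a placeholder path) on a copy of the demo model as described in the post would return `["mixture_weights"]`.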

     
