Improving accuracy on live decoding

2011-02-16
2012-09-22
  • Edwin Miguel Triana

    Hi People,
    I have been working on getting pocketsphinx to run on Nokia (Symbian)
    devices. Everything runs fine. I created my own acoustic model and language
    model (a JSGF grammar) for command words. During training I got a WER of 5%,
    and I got the same WER running the batch program on the device, which is very
    good. But when I use the models to recognize speech from the microphone, the
    WER increases to 80%. Do I need to configure something to adapt the decoder
    to the hardware? Is there some way to find out what is going wrong with the
    WER?

    Thanks a lot for your help!

     
  • Nickolay V. Shmyrev

    Most likely you are using 'current' cmn. In that case you need to select the
    cmninit value carefully so that it matches the typical mean. If there is a
    mismatch, accuracy issues can appear. Or you can try 'prior' cmn, which
    doesn't depend on the initial value.

    And please avoid using the words "improving accuracy"; you have nothing to
    improve yet. You just used strange parameters.

     
  • Pankaj

    Pankaj - 2011-02-17

    Hi,

    I think CMN PRIOR depends on the initial value, while CMN Current depends only
    on the mean of the current utterance.

    Pankaj
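
The distinction Pankaj describes can be illustrated with a simplified sketch. It is not pocketsphinx's actual implementation: the function names, the `alpha` update rate, and the per-frame update are assumptions made for illustration only. `cmn_current` subtracts the mean of the whole utterance, so it needs no initial value. `cmn_prior` starts from an initial mean (the role cmninit plays) and adapts it as frames arrive, so a badly chosen initial value distorts the first frames.

```python
from typing import List

Frames = List[List[float]]  # one list of cepstral coefficients per frame


def cmn_current(frames: Frames) -> Frames:
    """Batch-style CMN: subtract the mean of the whole utterance."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def cmn_prior(frames: Frames, init_mean: List[float], alpha: float = 0.05) -> Frames:
    """Live-style CMN: subtract a running mean seeded with init_mean.

    alpha is a hypothetical update rate, not a pocketsphinx parameter.
    """
    mean = list(init_mean)
    out = []
    for f in frames:
        # Normalize with the mean known *before* this frame was seen.
        out.append([x - m for x, m in zip(f, mean)])
        # Then move the running mean a small step toward the new frame.
        mean = [m + alpha * (x - m) for x, m in zip(f, mean)]
    return out
```

With one coefficient per frame, `cmn_current([[10.0], [12.0], [14.0]])` centres the values on zero without any starting point, while `cmn_prior` with an initial mean of 0.0 leaves the first frame at its raw value of 10.0.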

     
  • Edwin Miguel Triana

    Hi Nickolay and Pankaj,
    I followed your recommendations and tested several cmninit values with cmn
    prior, but the WER is still high. Since the accuracy is good when using
    pocketsphinx_batch, I captured the raw audio files during live decoding and
    then used them as input to pocketsphinx_batch. With that raw data I again got
    a high WER.

    I don't know what could be wrong; I'm using the same parameters as the
    decode/slave.pl script.

    Thanks a lot for your help!

     
  • Nickolay V. Shmyrev

    > With that raw data I again got a high WER. I don't know what could be wrong

    That usually means that the feature extraction parameters used during
    training and database test decoding don't match the feature extraction
    parameters used during decoding. For example, the sample rate was not set
    properly. I suggest you check the feature values and feature extraction with
    sphinx_cepview, sphinx_fe and the mfclogdir option. Also check the feature
    extraction parameters in the training and decoding logs.
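
One way to carry out the parameter check suggested above is to compare the two sets of settings mechanically. The sketch below is only an illustration: it assumes the training parameters are available as text with one `-name value` pair per line (the layout of the feat.params file that SphinxTrain writes), and the helper names and the example values are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_feat_params(text: str) -> Dict[str, str]:
    """Parse lines such as '-lowerf 130' into {'lowerf': '130'}."""
    params = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("-"):
            params[parts[0].lstrip("-")] = parts[1]
    return params


def diff_params(
    training: Dict[str, str], decoding: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (training_value, decoding_value)} for every mismatch.

    A value of None means the parameter was not given on that side,
    so the tool's default would apply there.
    """
    mismatches = {}
    for key in sorted(set(training) | set(decoding)):
        t, d = training.get(key), decoding.get(key)
        if t != d:
            mismatches[key] = (t, d)
    return mismatches


# Hypothetical example values, not taken from the thread.
training_text = "-lowerf 130\n-upperf 3700\n-nfilt 20\n-samprate 8000\n"
decoding_cfg = {"lowerf": "130", "upperf": "3700", "nfilt": "20", "samprate": "16000"}
mismatches = diff_params(parse_feat_params(training_text), decoding_cfg)
```

Values are compared as strings, so `16000` and `16000.0` would be reported as different; normalizing numeric values first may be worthwhile.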

     
  • Edwin Miguel Triana

    Hi Nickolay,
    The problem was that most of the training/testing files had a silence
    interval at the beginning and at the end, but the utterances captured with
    pocketsphinx don't have such silence. I removed the silence from all the
    files and rebuilt the models. Now I have 6% WER with pocketsphinx_batch and
    10% on the mobile phone. I think that is really good.

    Another thing: I have a couple of modifications to the build file for
    Symbian. Can I send you the diff file so it can be added to the trunk?

    Thanks a lot for your help!
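
The thread does not say how the leading and trailing silence was removed from the training files. One simple possibility is an energy threshold on fixed-size frames, sketched below. The function name, the threshold of 500 and the 160-sample frame length are assumptions chosen for illustration and would need tuning for real recordings.

```python
from typing import List


def trim_silence(samples: List[int], threshold: int = 500, frame_len: int = 160) -> List[int]:
    """Drop leading and trailing frames whose peak amplitude is below threshold.

    samples: signed 16-bit PCM values. Silence inside the utterance is kept.
    """
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    # Indices of frames that contain at least one sample at or above the threshold.
    loud = [i for i, f in enumerate(frames) if max(abs(s) for s in f) >= threshold]
    if not loud:
        return samples[:0]  # nothing but silence
    start = loud[0] * frame_len
    end = min(len(samples), (loud[-1] + 1) * frame_len)
    return samples[start:end]
```

Because trimming works on whole frames, up to one frame of silence can remain on each side of the speech.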

     
  • Nickolay V. Shmyrev

    > I removed the silence from all the files and rebuilt the models.

    Ok, good. This is a problem which would be nice to solve one day.

    > Another thing: I have a couple of modifications to the build file for
    > Symbian. Can I send you the diff file so it can be added to the trunk?

    Sure

     
