CMU Sphinx / Forums / Help: Poor performance of Pocketsphinx on Android for phoneme recognition - French language

Leo - 2016-04-08

Hello everyone,

I am working on a project in which I have to integrate the speech functionality of Pocketsphinx into an Android application. Specifically, I need the phoneme recognition provided by Pocketsphinx, and it should recognize phonemes in French: syllables (like "de", "re", "se"), consonants (like "m", "f", "g"), double consonants (like "kl", "ks", "gr") and vowels (like "a", "o", "e").

I have integrated Pocketsphinx to recognize the phonemes mentioned above, but the results are really bad. For example, when I pronounce "o", the recognized result is sometimes "SIL ff ei au" (even though I did not pronounce "f" or "e" at all), or some other phoneme that I did not pronounce appears at the beginning. The phonemes that appear at the beginning are not always the same (sometimes I get "ll", "uu", etc.); they change with the environment in which I run the test. Occasionally the phoneme I pronounced does come first (e.g. for "a" I get "SIL aa SIL"), but this happens very rarely.

Could you please help me find out what the problem could be, and suggest how to solve it?

Thank you very much!

Leutrim

- Nickolay V. Shmyrev - 2016-04-08
  
  You need to collect a test set to investigate decoding accuracy as described in our tutorial:
  
  http://cmusphinx.sourceforge.net/wiki/tutorialtuning
  
  - Leo - 2016-04-08
    
    Hello Nickolay,
    
    Thank you very much for your fast response.
    
    I will try this tutorial and see what happens.
    
    Best,
    Leutrim
    
    - Leo - 2016-04-08
      
      Hello Nickolay,
      
      I just noticed that my project had no "assets.xml" at all, and that the following lines were missing from the build.gradle file:
      
      ant.importBuild 'assets.xml'
      preBuild.dependsOn(list, checksum)
      clean.dependsOn(clean_assets)
      
      Could this be a problem, since this is how the files needed for recognition are accessed? But then I wonder how it is possible to get a recognized result at all.
      
      Thank you!
      Leutrim
      
      - Nickolay V. Shmyrev - 2016-04-08
        
        > Could this be a problem, since this is how the files needed for recognition are accessed?
        
        If some file is missing, the demo simply will not start.
        
        Leo - 2016-04-08
        
        Hello Nickolay,
        
        I would like to ask something about the "test database setup", just to make sure I understand it correctly.
        
        So I have to create an audio file for each sound that is supposed to be recognized (e.g. one audio file for "la", another for "de", and so on)? Then I have to create the "test.fileids" file. Afterwards, I have to create the "test.transcription" file, which should look like this: 1st row: la (arctic_01), 2nd row: de (arctic_02), and so on.
        
        Then I should put the audio files in a folder named "wav", and to run this on Android I just need to change the parameters of the decoder?
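
The layout described above can be sketched as follows. This is only a minimal illustration, assuming the "reference text (file_id)" line format mentioned in the question; the helper name `build_test_set` and the ids `test_la` and `test_de` are hypothetical, and the reference text must list everything that is actually spoken in each recording.

```python
def build_test_set(utterances):
    """Return (fileids_text, transcription_text) for a list of
    (file_id, reference_text) pairs, one line per utterance."""
    # test.fileids: one file id per line, without the .wav extension.
    fileids = [file_id for file_id, _ in utterances]
    # test.transcription: reference text followed by the file id in parentheses.
    transcription = [f"{text} ({file_id})" for file_id, text in utterances]
    return "\n".join(fileids) + "\n", "\n".join(transcription) + "\n"


# Hypothetical recordings stored as wav/test_la.wav, wav/test_de.wav, wav/test_ke.wav.
# If a sound is spoken twice in one file, the reference lists it twice.
utterances = [("test_la", "la"), ("test_de", "de"), ("test_ke", "ke ke")]
fileids_text, transcription_text = build_test_set(utterances)

print(fileids_text, end="")
print(transcription_text, end="")
```

Writing the two strings to "test.fileids" and "test.transcription" next to the "wav" folder would give the directory layout described in the question.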
        
        Could you please let me know if all of this is correct?
        
        Thank you very much in advance!
        Leutrim
        
        Nickolay V. Shmyrev - 2016-04-08
        
        You run the test on a desktop machine, not on Android.
        
        Leo - 2016-04-11
        
        Hello Nickolay,
        
        I set up the test database, and the results are very poor. As I told you, the speech recognizer should be able to recognize syllables (like "de", "re", "se"), consonants (like "m", "f", "g"), double consonants (like "kl", "ks", "gr") and vowels (like "a", "o", "e"). However, it does not recognize them as expected: I still get phonemes in the result that were not pronounced at all.
        
        Could you please let me know what possible solutions there are for increasing the accuracy? Also, is Pocketsphinx able to recognize a single vowel or a single syllable at all?
        
        Thank you very much in advance!
        
        Leutrim
        
        Nickolay V. Shmyrev - 2016-04-11
        
        > Could you please let me know what possible solutions there are for increasing the accuracy?
        
        Sure, as soon as you provide the required data files.
        
        Leo - 2016-04-11
        
        Hello Nickolay,
        
        Thank you very much for your reply.
        
        Attached are the test.fileids, the test.transcription and the .wav files.
        
        Thanks!
        Leutrim
        
        test.fileids
        
        test.transcription
        
        wav.zip
        
        Nickolay V. Shmyrev - 2016-04-11
        
        Ok, and what model do you use?
        
        Leo - 2016-04-11
        
        I am using the French acoustic and language models provided online at the following link:
        https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/French%20Language%20Model/
        
        You can find them attached here as well. So, I am using the language model that is designed for recognizing phonemes in French.
        
        fr-phone.lm.dmp
        
        french.zip
        
        Nickolay V. Shmyrev - 2016-04-11
        
        Your reference transcription does not match the audio content. For example, in test_ke.wav you say "ke" twice, but the reference transcription lists it only once.
        
        What exactly are the arguments of the decoding command you run, and what error rate do you see?
        
        Leo - 2016-04-11
        
        Yes, you are right, my reference transcription does not match the audio content. Without changing the reference transcription, I get the following results:
        TOTAL Words: 47 Correct: 9 Errors: 96
        TOTAL Percent correct = 19.15% Error = 204.26% Accuracy = -104.26%
        TOTAL Insertions: 58 Deletions: 0 Substitution: 38
        
        The arguments that I am using for the decoder are the ones suggested on the website of CMUSphinx [http://cmusphinx.sourceforge.net/wiki/tutorialtuning]:
        pocketsphinx_batch \
        -adcin yes \
        -cepdir wav \
        -cepext .wav \
        -ctl test.fileids \
        -lm <your.lm, for example en-us.lm.dmp from pocketsphinx> \
        -dict <your.dic, for example cmudict-en-us.dict from pocketsphinx> \
        -hmm <your_hmm, for example en-us> \
        -hyp test.hyp
        
        I have also run every single .wav file on its own to see the recognized result, and I never get the right output. For a single .wav file I used the following command (also suggested on the CMUSphinx website):
        pocketsphinx_continuous -infile test/data/wav/test_ke.wav \
        -hmm model/french/french \
        -allphone model/french/fr-phone.lm.dmp -backtrace yes \
        -beam 1e-20 -pbeam 1e-20 -lw 2.0
        
        Nickolay V. Shmyrev - 2016-04-11
        
        What exact pocketsphinx_batch command do you run?
        
        Provide an updated reference file.
        
        Leo - 2016-04-11
        
        This is the command I am using, with the paths to each required file specified:
        pocketsphinx_batch.exe -adcin yes -cepdir wav -cepext .wav -ctl /path to/test.fileids -lm /path to/fr-phone.lm.dmp -dict /path to/fr-dict.dict -hmm /path to/french/french -hyp test.hyp
        
        Nickolay V. Shmyrev - 2016-04-11
        
        In pocketsphinx_batch you should have used -allphone instead of -lm, as in your pocketsphinx_continuous command, together with all the other recommended arguments.
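
One way to read this advice is sketched below: the batch input and output flags from the earlier batch command are combined with the phoneme decoding flags from the single-file command, and -lm is dropped. This is only an assumption about the intended command, not a confirmed one; the helper name `batch_allphone_command` is hypothetical, the paths are the ones quoted earlier in the thread, and -dict is left out because the single-file command did not use it.

```python
import shlex


def batch_allphone_command(hmm_dir, phone_lm, ctl="test.fileids", hyp="test.hyp"):
    """Assemble a pocketsphinx_batch argument list for phoneme decoding."""
    return [
        "pocketsphinx_batch",
        "-adcin", "yes",
        "-cepdir", "wav",
        "-cepext", ".wav",
        "-ctl", ctl,
        "-hmm", hmm_dir,
        "-allphone", phone_lm,  # phonetic language model, used instead of -lm
        "-backtrace", "yes",
        "-beam", "1e-20",
        "-pbeam", "1e-20",
        "-lw", "2.0",
        "-hyp", hyp,
    ]


cmd = batch_allphone_command("model/french/french", "model/french/fr-phone.lm.dmp")
# Print the command as a single shell-quoted line; nothing is executed here.
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where pocketsphinx_batch is installed.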
        
        I'm still waiting for the updated reference file.
        
        Leo - 2016-04-11
        
        Yes, I did not realize that (since I need phoneme recognition, I need the -allphone argument).
        
        You can find attached the updated reference file.
        
        Thank you!
        
        test.transcription
        
        Nickolay V. Shmyrev - 2016-04-11
        
        Ok, so what are your results with allphone?
        
        Leo - 2016-04-11
        
        I just finished the test. I also added all the arguments used with pocketsphinx_continuous for phoneme recognition. The results are the following:
        TOTAL Words: 112 Correct: 52 Errors: 370
        TOTAL Percent correct = 46% Error = 330.36% Accuracy = -230.36%
        TOTAL Insertions: 310 Deletions: 0 Substitutions: 60
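
For reading these totals, the sketch below recomputes the percentages from the raw counts, assuming the conventional definition in which errors are insertions plus deletions plus substitutions, divided by the number of reference words. The function name `score` is hypothetical. Under that assumption the error can exceed 100% and the accuracy can go negative whenever the decoder inserts more units than the reference contains, which is the case in both runs reported in this thread.

```python
def score(total_words, correct, insertions, deletions, substitutions):
    """Recompute the summary percentages from the raw alignment counts."""
    errors = insertions + deletions + substitutions
    error_rate = 100.0 * errors / total_words
    return {
        "errors": errors,
        "percent_correct": round(100.0 * correct / total_words, 2),
        "error": round(error_rate, 2),
        "accuracy": round(100.0 - error_rate, 2),
        # Share of all errors that are insertions, in percent.
        "insertion_share": round(100.0 * insertions / errors, 1),
    }


# Counts reported in the thread: the first run with -lm, the second with -allphone.
first = score(total_words=47, correct=9, insertions=58, deletions=0, substitutions=38)
second = score(total_words=112, correct=52, insertions=310, deletions=0, substitutions=60)

print(first)
print(second)
```

In both runs insertions account for most of the errors, so the decoder is mainly producing extra phonemes rather than confusing the ones that were spoken.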
        
        I have also attached a screenshot taken while running pocketsphinx_batch.exe with the changed arguments.
        
        capturing.PNG
        
        Leo - 2016-04-12
        
        Hello Nickolay,
        
        Could you please let me know what I should do? As you can see, I have really bad results.
        
        Thank you very much!
        
        Leutrim
        
        Nickolay V. Shmyrev - 2016-04-12
        
        > Could you please let me know what I should do? As you can see, I have really bad results.
        
        Be patient; it will take me some time to look into your issues.
        
        Leo - 2016-04-12
        
        Thank you very much for your help!
        
        Looking forward to hearing from you.
        
        Leutrim
        
        Leo - 2016-04-18
        
        Dear Nickolay,
        
        I hope you are doing fine.
        I am writing to you again because I have a deadline for submitting my solution. Could you please let me know whether you have found a way to solve my problem?
        
        Thank you very much in advance!
        
        Best,
        Leutrim
        
        Leo - 2016-04-11
        
        Hello Nickolay,
        
        Please let me know if you need any further information.
        
        Thank you!
        
        Leutrim
        