Low accuracy non-native speaker
2019-03-23
2019-03-24
  • Łukasz Strzelecki

    Hello again,

    I got pocketsphinx working in Java, and keyword search works quite well so far. I also played around with free-speech recognition, though, and I get pretty bad results. I experimented with adaptation using different language models and dictionaries (the default ones downloaded with pocketsphinx, and my own created from a smaller list of ~14k words), and I managed to see the results improve.
    My approach:
    I tested 100 recordings of 8-word sentences, first on the unadapted model, then on the adapted one. I used the same transcription file for testing as for adaptation, to see whether there was any difference.
    With the default model I got about -39% accuracy at first, then around 1% after adaptation. The logs of the bw command contain a lot of warnings about missing words, which could be the reason. With my language model I got 31% at first, and close to 100% after adaptation.
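A negative figure like the one above is possible if accuracy is measured as word accuracy, (N - S - D - I) / N, where N is the number of reference words and S, D, I are substitutions, deletions and insertions: enough inserted words push the value below zero. The following is a minimal Java sketch of that calculation, assuming this definition is the one behind the reported numbers. The class and method names are hypothetical and not part of pocketsphinx.

```java
public class WordAccuracy {

    /** Splits a sentence into lowercase words; an empty sentence gives an empty array. */
    public static String[] tokenize(String sentence) {
        String trimmed = sentence.trim().toLowerCase();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Word-level Levenshtein distance: substitutions, deletions and insertions each cost 1. */
    public static int editDistance(String[] ref, String[] hyp) {
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) {
            d[i][0] = i; // delete all remaining reference words
        }
        for (int j = 0; j <= hyp.length; j++) {
            d[0][j] = j; // insert all remaining hypothesis words
        }
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int substitution = d[i - 1][j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int deletion = d[i - 1][j] + 1;
                int insertion = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return d[ref.length][hyp.length];
    }

    /** Word accuracy in percent: 100 * (N - errors) / N, which can be negative. */
    public static double accuracyPercent(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        int errors = editDistance(ref, hyp);
        return 100.0 * (ref.length - errors) / ref.length;
    }

    public static void main(String[] args) {
        // Identical sentences: no errors.
        System.out.println(accuracyPercent("a b c d", "a b c d"));       // 100.0
        // Two reference words, three inserted words: (2 - 3) / 2.
        System.out.println(accuracyPercent("one two", "one x y z two")); // -50.0
    }
}
```

With this definition, a recognizer that outputs many extra words on short sentences can score below 0% even when some words are correct.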

    It seems the adaptation went fine: there were no errors or warnings in the second case, and the result improved.
    I then took this model and tested it on 20 new sentences. With the unadapted model I got ~33%, but with the adapted one, strangely, only ~25%.

    I am Polish and I do not have a good English accent. My question is: should I continue adapting the model, or should I train an acoustic model from scratch using just my voice? Is it possible for me to get over 90% accuracy with free speech in either of these cases? Or would it perhaps be better to create a phonetic dictionary matching my pronunciation? If so, is there an existing tool that could create a phonetic dictionary, or at least show the phonetic transcription of a single word from a recording? I have no idea how to transcribe my pronunciation.

    In case they are needed, here are the attachments:
    1. https://drive.google.com/file/d/1qJIsMyt9Mdn2Xc1dH9THwhJmBh6AL-TH/view?usp=sharing
    2. https://drive.google.com/file/d/1YsmbUUUA3xR6oHhG5G2TTqfa4UOnQYZf/view?usp=sharing
    3. https://drive.google.com/file/d/1RE81IEF4xmpSJS9TBUwrsr-4wV2YemrP/view?usp=sharing
    1. ...-my model is where the adaptation of my model was done (the .wav files are left out to save space; they are the same as in the default model)
    2. my-model-test is where the test of my model was done (the one with the regression)
    3. ...-default model is where the adaptation of the default model was done

     

    Last edit: Łukasz Strzelecki 2019-03-23
    • Nickolay V. Shmyrev

      Try Kaldi.

       
