
Word vs Phrase recognition and Speaker dependency

Help
2015-09-08
2015-09-09
  • Certain Bro

    Certain Bro - 2015-09-08

    Hi, I am able to get the app up and running and have tested it using the guides in the various forum posts, articles, and tutorials. However, I have a few questions:

    1) For recognising different accents, should I get different speakers to record the same sentences and then work with this data while adapting the acoustic model? Or should each speaker record unique sentences?

    2) I am getting very accurate results with phrase recognition. However, I am getting only about 50% accuracy for single words or two-word phrases. Any idea how I could improve single-word recognition? Note that my adaptation corpus contains only phrases. Will adapting on single words along with the phrases improve recognition?

    3) If I adapt only on phrases, what about recognition of new phrases made entirely of words from the adapted phrases?

    4) How can I get multiple matches from the speech recognizer, like the Google Speech API v2 does?

     
    • Nickolay V. Shmyrev

      1) For recognising different accents, should I get different speakers to record the same sentences and then work with this data while adapting the acoustic model? Or should each speaker record unique sentences?

      It is better to record unique sentences.

      2) I am getting very accurate results with phrase recognition. However, I am getting only about 50% accuracy for single words or two-word phrases. Any idea how I could improve single-word recognition? Note that my adaptation corpus contains only phrases. Will adapting on single words along with the phrases improve recognition?

      If you want to recognize single words, you need to build your training database from single words, not from phrases. If you want to recognize both single words and phrases, both must be in the database. A more detailed diagnosis is not possible because there is not enough data. If you need help with accuracy, it is best to provide the database for analysis.
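
      A minimal sketch of that advice, assuming the adaptation prompts are kept as plain strings: it adds every distinct word from the phrases as its own single-word prompt, then formats the result as `<s> text </s> (file_id)` lines, the layout used for transcription files in the CMUSphinx adaptation tutorial. The function names, the `adapt` file-id prefix, and the sample phrases are hypothetical, and each prompt still has to be recorded separately.

```python
# Sketch: build an adaptation prompt list that mixes phrases and single words.
# Function names, the "adapt" prefix and the sample phrases are illustrative
# assumptions, not part of any CMUSphinx tool.

def build_prompts(phrases):
    """Return the normalized phrases followed by each new word as its own prompt."""
    prompts = []
    seen = set()
    for text in phrases:
        norm = " ".join(text.lower().split())  # lowercase, collapse whitespace
        if norm and norm not in seen:
            seen.add(norm)
            prompts.append(norm)
    # Iterate over a copy so appending single-word prompts is safe.
    for text in list(prompts):
        for word in text.split():
            if word not in seen:
                seen.add(word)
                prompts.append(word)
    return prompts


def to_transcription(prompts, prefix="adapt"):
    """Return (file ids, transcription lines) for the given prompts."""
    fileids = [f"{prefix}_{i:04d}" for i in range(1, len(prompts) + 1)]
    lines = [f"<s> {p} </s> ({fid})" for p, fid in zip(prompts, fileids)]
    return fileids, lines


phrases = ["turn on the light", "stop"]
prompts = build_prompts(phrases)
fileids, lines = to_transcription(prompts)
for line in lines:
    print(line)
```

      The file ids would then go into the matching `.fileids` list, one per recorded WAV file.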

      3) If I adapt only on phrases, what about recognition of new phrases made entirely of words from the adapted phrases?

      If you built the training database in a vocabulary-independent way, it should be OK.

      4) How can I get multiple matches from the speech recognizer, like the Google Speech API v2 does?

      Both the sphinx4 and pocketsphinx decoders support n-best list results in their APIs; you can use that.
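
      A rough sketch of reading an n-best list, assuming the pocketsphinx Python bindings, where `Decoder.nbest()` iterates over hypotheses that expose `hypstr` and `score`; check this against the version you have installed. The helper `top_hypotheses` and the stand-in decoder below are hypothetical and exist only so the snippet runs without audio or a model.

```python
from collections import namedtuple


def top_hypotheses(decoder, limit=5):
    """Collect up to `limit` distinct (text, score) pairs from decoder.nbest().

    `decoder` is assumed to behave like a pocketsphinx Decoder after an
    utterance has been decoded: nbest() yields objects with .hypstr and .score.
    """
    results = []
    seen = set()
    for hyp in decoder.nbest():
        text = hyp.hypstr
        # Skip empty entries and repeated word strings (a precaution).
        if text is None or text in seen:
            continue
        seen.add(text)
        results.append((text, hyp.score))
        if len(results) >= limit:
            break
    return results


# Stand-in for a real decoder so the example is self-contained (hypothetical).
Hyp = namedtuple("Hyp", ["hypstr", "score"])


class FakeDecoder:
    def nbest(self):
        return iter([
            Hyp("go forward", -3200),
            Hyp("go forward", -3350),
            Hyp("go for word", -3410),
            Hyp(None, -4000),
        ])


alternatives = top_hypotheses(FakeDecoder(), limit=2)
for text, score in alternatives:
    print(score, text)
```

      With a real decoder you would pass the `Decoder` instance after the utterance has been processed, in place of `FakeDecoder()`.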

       
  • Certain Bro

    Certain Bro - 2015-09-09

    Thank you!

     
