CMU Sphinx / Forums / Help: Word vs Phrase recognition and Speaker dependency

Certain Bro - 2015-09-08

Hi, I am able to get the app up and running and have tested it using the guides in the various forum posts, articles, and tutorials. However, I have a few questions:

1) For recognising different accents, should I get different speakers to record the same sentences and then work with this data while adapting the acoustic model? Or should each speaker record unique sentences?

2) I am getting very accurate results with phrase recognition. However, I am getting only about 50% accuracy for single words or two-word phrases. Any idea how I can improve single-word recognition? Note that my adaptation corpus contains only phrases. Will adapting on single words along with the phrases improve recognition?

3) If I adapt only on phrases, what about recognition of new phrases made entirely of words from the adapted phrases?

4) How can I get multiple matches from the speech recognizer, like the Google Speech API v2?

- Nickolay V. Shmyrev - 2015-09-08
  
  1) For recognising different accents, should I get different speakers to record the same sentences and then work with this data while adapting the acoustic model? Or should each speaker record unique sentences?
  
  It is better to record unique sentences.
  
  2) I am getting very accurate results with phrase recognition. However, I am getting only about 50% accuracy for single words or two-word phrases. Any idea how I can improve single-word recognition? Note that my adaptation corpus contains only phrases. Will adapting on single words along with the phrases improve recognition?
  
  If you want to recognize single words, you need to build your training database from single words, not from phrases. If you want to recognize both single words and phrases, both must be in the database. A more detailed diagnosis is not possible because there is not enough data. If you need help with accuracy, it is best to provide the database for analysis.
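
As an illustration of mixing single words and phrases in one adaptation set, here is a minimal sketch that formats the two list files used by the CMUSphinx adaptation tools: a file-ID list and a transcription file with lines of the form `<s> text </s> (file_id)`. The function name, the file IDs, and the prompts are hypothetical and do not come from this thread. The casing of the words is assumed to match whatever pronunciation dictionary is in use.

```python
def format_adaptation_lists(prompts):
    """Build the contents of a .fileids and a .transcription file.

    prompts: list of (file_id, text) pairs, one per recording.
    Returns (fileids_text, transcription_text).
    """
    fileids_lines = []
    transcription_lines = []
    for file_id, text in prompts:
        # Collapse stray whitespace so each transcript is single-spaced.
        words = " ".join(text.split())
        fileids_lines.append(file_id)
        transcription_lines.append(f"<s> {words} </s> ({file_id})")
    return "\n".join(fileids_lines) + "\n", "\n".join(transcription_lines) + "\n"


# Hypothetical prompts: isolated words and short commands sit next to
# full phrases, so both utterance types are represented in the data.
prompts = [
    ("spk1_word_001", "lights"),
    ("spk1_word_002", "volume up"),
    ("spk1_phrase_001", "turn the lights on in the kitchen"),
]

fileids_text, transcription_text = format_adaptation_lists(prompts)
print(transcription_text, end="")
```

The two returned strings can then be written to the `.fileids` and `.transcription` files that the adaptation steps read.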
  
  3) If I adapt only on phrases, what about recognition of new phrases made entirely of words from the adapted phrases?
  
  If you built the training database in a vocabulary-independent way, it should be OK.
  
  4) How can I get multiple matches from the speech recognizer, like the Google Speech API v2?
  
  Both the sphinx4 and pocketsphinx decoders support n-best list results in their APIs, so you can use that.
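
For the n-best case, a rough sketch of collecting the top few alternatives might look like the following. It assumes the pocketsphinx Python bindings, where a decoder that has already processed an utterance exposes `nbest()` and each item carries `hypstr` and `score`, as in the bindings' example scripts. The helper name `top_hypotheses`, the `StubDecoder` class, and the scores are made up here so the snippet runs without a model installed; with a real decoder you would pass that object instead.

```python
from itertools import islice
from types import SimpleNamespace


def top_hypotheses(decoder, limit=5):
    """Return up to `limit` (text, score) pairs from a decoder's n-best list."""
    results = []
    # islice stops early, so a long n-best list is never fully walked.
    for hyp in islice(decoder.nbest(), limit):
        results.append((hyp.hypstr, hyp.score))
    return results


class StubDecoder:
    """Stand-in for a real decoder (hypothetical data, illustration only)."""

    def nbest(self):
        return iter([
            SimpleNamespace(hypstr="turn lights on", score=-3100),
            SimpleNamespace(hypstr="turn light on", score=-3400),
            SimpleNamespace(hypstr="turn lights off", score=-3900),
        ])


matches = top_hypotheses(StubDecoder(), limit=2)
for text, score in matches:
    print(score, text)
```

Capping the list with `limit` keeps the output comparable to the handful of alternatives a web speech API typically returns.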
  

Certain Bro - 2015-09-09

Thank you!
