Quick lm, vocab, adaptation questions

Speech Recognition Toolkit

Forum: Help

Creator: Richard Kappler

Created: 2012-08-07

Updated: 2012-09-22

Richard Kappler - 2012-08-07

I get that life gets a little easier with a closed vocabulary, and I seem to recall reading somewhere that with a small, closed vocab I can get accuracy pretty close to 100%. What if I want to add to the vocabulary later? Would I start again from scratch, appending the new words to the existing vocab and then redoing the adaptation on the original hub4_wsj model, discarding the adapted model?

If I do build a "closed" vocab, as I understand it I would put the vocab in a vocab list and run it through lmtool, then just do the acoustic adaptation. Is that correct?

When building such a "closed" vocabulary, as I understand it, using the words in reasonable sentences is important because the recogniser is trained to look at the beginning and end of the word and the surrounding words, not just the word itself. Is that right? Would this include single-word commands?

Once built and adapted, how much should I expect WER to degrade when other people interact with the program (different voices)?

regards, Richard

Nickolay V. Shmyrev - 2012-08-07

I get that life gets a little easier with a closed vocabulary, and I
seem to recall reading somewhere that with a small, closed vocab I can get
accuracy pretty close to 100%. What if I want to add to the vocabulary later?
Would I start again from scratch, appending the new words to the existing vocab
and then redoing the adaptation on the original hub4_wsj model, discarding the
adapted model?

If the new words use significantly different senones, a new adaptation might be
required.

If I do build a "closed" vocab, as I understand it I would put the vocab
in a vocab list and run it through lmtool, then just do the acoustic
adaptation. Is that correct?

I'm not sure what you mean by "closed". For a small-vocabulary recognizer
without any specific word order, it's better to use JSGF grammars.
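
As a rough illustration of the JSGF suggestion, here is a minimal Python sketch that turns a list of commands into JSGF grammar text. The function name `build_jsgf`, the grammar name `commands`, the rule name `command`, and the example phrases are all made up for illustration; only the `#JSGF V1.0;` header and the `public <rule> = a | b;` alternative syntax come from the JSGF format itself. Every word used must also be in the recognizer's dictionary.

```python
def build_jsgf(grammar_name, rule_name, commands):
    """Return JSGF grammar text with one public rule matching any single command.

    All names passed in are hypothetical examples; adjust them to your application.
    """
    if not commands:
        raise ValueError("at least one command is required")
    # JSGF separates alternatives with "|".
    alternatives = " | ".join(commands)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <{rule_name}> = {alternatives};\n"
    )


if __name__ == "__main__":
    # Example phrases are placeholders, not taken from the thread.
    print(build_jsgf("commands", "command", ["go forward", "go back", "stop"]))
```

The printed text can be saved to a grammar file and passed to the decoder in place of a statistical language model.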

When building such a "closed" vocabulary, as I understand it, using the
words in reasonable sentences is important because the recogniser is trained
to look at the beginning and end of the word and the surrounding words, not just
the word itself. Is that right? Would this include single-word commands?

For single-word commands, you can adapt on recordings of the single-word commands themselves.
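
To make that concrete, adaptation needs a list of recording ids and a matching transcription file. The sketch below generates both for single-word recordings. It assumes the layout used in the CMU Sphinx adaptation tutorial as I remember it (one file id per line, and `<s> text </s> (file_id)` per transcription line), so please check it against the current documentation. The function name `adaptation_lists` and the ids `cmd_0001` etc. are hypothetical.

```python
def adaptation_lists(recordings):
    """Build the text of the file-id list and the transcription file.

    recordings: list of (file_id, text) pairs, one per recorded utterance.
    The line formats are an assumption based on the adaptation tutorial.
    """
    # One recording id per line, without the audio file extension.
    fileids = "".join(f"{file_id}\n" for file_id, _ in recordings)
    # Each transcript is wrapped in sentence markers and tagged with its id.
    transcription = "".join(
        f"<s> {text} </s> ({file_id})\n" for file_id, text in recordings
    )
    return fileids, transcription


if __name__ == "__main__":
    # Hypothetical single-word command recordings.
    ids, trans = adaptation_lists([("cmd_0001", "stop"), ("cmd_0002", "forward")])
    print(ids)
    print(trans)
```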

Once built and adapted, how much should I expect WER to degrade when
other people interact with the program (different voices)?

It depends on many external factors.
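
Since the degradation cannot be predicted in advance, one practical option is to measure it: record a few test utterances from each new speaker and compare the WER before and after. Below is a minimal sketch of the usual definition, WER = (substitutions + deletions + insertions) / number of reference words, computed with a word-level edit distance. The function name `word_error_rate` is made up for illustration, and the sketch does no text normalization (case, punctuation), which a real evaluation would need.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical reference transcript and recognizer output.
    print(word_error_rate("go forward now", "go back now"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words.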