I have a dictation system built on pocketsphinx with a vocabulary of 65k words
in an n-gram model trained with the CMU LM Toolkit. I need command words for
correction and text formatting to be included in the main language model, but
they are missing.
What is the simplest way to include these words in the existing model and
ensure that they will be recognized in any context of continuous dictation?
Hello binac
There is little sense in adding correction words to the language model. You
need to spot them in the continuous stream, either with a parallel keyword
spotter or with a spotter for isolated commands embedded into the n-gram
decoder. For example, you can specifically hardcode a higher probability for
the correction word in isolated utterances. You can do that by modifying the
sphinxbase ngram code.
Usually a correction command is an isolated word in a separate utterance, so
you need to account for that case.
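As a sketch of the parallel-spotter approach: pocketsphinx has a keyword-spotting search mode driven by a keyword list file, one keyphrase per line with a per-phrase detection threshold. The phrases and thresholds below are placeholders you would tune on real recordings:

```
correct /1e-20/
select /1e-20/
new paragraph /1e-30/
```

A list like this can be passed to the decoder with the `-kws` option, so the spotter can run alongside the n-gram dictation search.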
Yes, a dictation system is not as easy to build as it might seem. Maybe you
will be interested in some of Keith's publications:
http://keithv.com/pub/
For example
http://keithv.com/pub/speechduring/speech_rec_dictation_corrections.pdf
Thank you Nickolay
Keith's publications are very useful for me, but I'm stuck on a simple
problem.
My 3-gram LM for dictation is good for the targeted vocabulary and I don't
want to change it. However, for some reason, it does not contain words like
"correct" and "select", which I want to use in parallel with dictation, as in
"correct some text".
Is there a way to manually add these words to the ARPA text-format LM and
maybe specify higher probabilities for them there? Or maybe to build some
smaller model with these words, if there is some tool for merging models, or
some feature in pocketsphinx for using both models at the same time, that I
don't know about?
I just want to avoid rebuilding the whole LM.
You can build a unigram language model from the list of your words and mix it
with your large language model using mitlm.
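A minimal sketch of the first step, assuming a uniform distribution over the command words is an acceptable starting point (the word list, file name, and uniform weights here are illustrative only):

```python
import math

def write_unigram_arpa(words, path):
    """Write a uniform unigram LM in ARPA text format for the given words."""
    vocab = ["<s>", "</s>"] + sorted(words)
    logp = math.log10(1.0 / len(vocab))  # uniform log10 probability
    lines = ["\\data\\", "ngram 1=%d" % len(vocab), "", "\\1-grams:"]
    for w in vocab:
        lines.append("%.4f\t%s" % (logp, w))
    lines += ["", "\\end\\", ""]
    with open(path, "w") as f:
        f.write("\n".join(lines))

write_unigram_arpa(["correct", "select", "undo"], "commands.lm")
```

The resulting small model can then be mixed with the large dictation LM using mitlm's interpolate-ngram tool, which takes the input models and writes the interpolated result with -write-lm; check the mitlm documentation for the exact invocation on your version.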