Hello, I'm new to speech recognition and trying to understand some of the many variables involved.
So here is my question:
Does the language model (LM) restrict recognition to its contents (the words used to build the language model),
or can words outside of the LM (but included in the dictionary) still be recognized?
If the LM does not restrict recognition and my goal is to
"recognize natural language but with medical terms",
would it be better to build my own LM (myLM) and use it with the en-us dictionary,
or to build a dictionary (myDict) from the same corpus and use the combination of myLM and myDict?
Thanks in advance
does the language model restrict the recognition to its contents (words used to build the language model)
Yes
or can words outside of the LM (but included in the dictionary) still be recognized?
No
would it be better to build my own LM (myLM) and to use it with the en-us dictionary,
or build a dictionary (myDict) from the same corpus and use the combination myLM and myDict ?
You have to update both the LM and the dictionary. A word can only be recognized if it is present in both.
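Since both files have to cover the same vocabulary, a quick consistency check can help before decoding. The following is only a minimal sketch, not part of any Sphinx tool: the function names are hypothetical, the tokenizer is deliberately naive, and it assumes a CMU-style dictionary where each line is a word followed by its phones and alternate pronunciations are marked like `the(2)`. It lists the corpus words that have no dictionary entry yet.

```python
import re


def corpus_vocabulary(corpus_text):
    """Return the set of lowercase word tokens found in the LM training text."""
    # Naive tokenizer (assumption): letters and apostrophes only.
    return set(re.findall(r"[a-z']+", corpus_text.lower()))


def dictionary_words(dict_text):
    """Return the set of head words in a CMU-style pronunciation dictionary."""
    words = set()
    for line in dict_text.splitlines():
        line = line.strip()
        # Skip blank lines and ";;;" comment lines.
        if not line or line.startswith(";;;"):
            continue
        head = line.split()[0].lower()
        # Strip the alternate-pronunciation marker, e.g. "the(2)" -> "the".
        head = re.sub(r"\(\d+\)$", "", head)
        words.add(head)
    return words


def missing_from_dictionary(corpus_text, dict_text):
    """Return the sorted corpus words that have no dictionary entry."""
    return sorted(corpus_vocabulary(corpus_text) - dictionary_words(dict_text))


# Made-up sample data, for illustration only.
sample_corpus = "the patient has tachycardia and mild dyspnea"
sample_dict = """\
the DH AH
the(2) DH IY
patient P EY SH AH N T
has HH AE Z
and AH N D
mild M AY L D
"""

missing = missing_from_dictionary(sample_corpus, sample_dict)
print(missing)  # ['dyspnea', 'tachycardia']
```

Each word reported as missing would need a pronunciation added to the dictionary before the decoder can recognize it.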
OK, thank you for clearing that up.
I worked a bit with cmuclmtk and the web toolkit and understood in practice what you said ;)
Edit: the post was uploaded twice, sorry.
Last edit: Thanasis 2020-09-08