Dear Sir,
I am working on morph-based language models for speech recognition. I am using
the VariKN toolkit to build the language models:
http://forge.pascal-network.org/frs/download.php/45/varikn-1.0.2.tar.gz
Its documentation says:
The sentence start "<s>" and end "</s>" should be marked in the training data
by the user.
For sub-word models, the tag "<w>" is reserved to signify a word break.
For sub-word models with sentence breaks, the data is assumed to be processed
into the following format:
<s> <w> w1-1 w1-2 w1-3 <w> w2-1 <w> w3-1 w3-2 <w> </s>
where wA-B is the B:th part of the A:th word.
I have two questions. What should the dictionary format be for the sphinx4
recognizer? I am confused about this. Also, what should I do to make word
boundaries appear in the sphinx4 output?
Is there any way to solve this? Please let me know.
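For reference, the training-data format quoted in the post can be produced with a few lines of Python. This is only a minimal sketch: it assumes the sentence tags are "<s>" and "</s>", that "<w>" marks every word break, and that each word has already been split into morphs by some other tool. The function name and the sample morphs are made up for illustration.

```python
def to_varikn_line(segmented_words):
    """Format one sentence for sub-word LM training.

    segmented_words: list of words, each word given as a list of morphs.
    Assumed layout: <s> <w> morphs-of-word-1 <w> morphs-of-word-2 <w> ... </s>
    """
    tokens = ["<s>", "<w>"]
    for morphs in segmented_words:
        tokens.extend(morphs)
        tokens.append("<w>")  # word break after every word
    tokens.append("</s>")
    return " ".join(tokens)


if __name__ == "__main__":
    sentence = [["w1-1", "w1-2", "w1-3"], ["w2-1"], ["w3-1", "w3-2"]]
    print(to_varikn_line(sentence))
```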
The subword dictionary should still contain the mapping from subwords to phones.
Word boundaries are not readily supported by sphinx4. You will have to modify the
search algorithm to incorporate them. Basically, efficient recognition using
subword models needs some work.
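To make the reply concrete, here is a rough Python sketch of both ideas. It assumes the usual Sphinx-style text dictionary with one entry per line, the entry followed by its phones separated by whitespace. The morphs, the phone symbols, and both function names are hypothetical. The second function only shows how morphs could be glued back into words if the "<w>" token did reach the hypothesis, which, as noted above, sphinx4 does not do without changes to the search.

```python
def dictionary_lines(morph_phones):
    """Build dictionary lines of the assumed form: '<morph> <phone> <phone> ...'."""
    return [
        f"{morph} {' '.join(phones)}"
        for morph, phones in sorted(morph_phones.items())
    ]


def join_morphs(tokens, boundary="<w>"):
    """Glue morphs into words, splitting at the boundary token.

    Sentence tags are skipped. This assumes the boundary token is present
    in the recognizer output, which is not the case out of the box.
    """
    words, current = [], []
    for tok in tokens:
        if tok in ("<s>", "</s>"):
            continue
        if tok == boundary:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(tok)
    if current:  # trailing morphs with no closing boundary
        words.append("".join(current))
    return words


if __name__ == "__main__":
    # Hypothetical morphs and phone symbols, for illustration only.
    lexicon = {"talo": ["T", "A", "L", "O"], "ssa": ["S", "S", "A"]}
    print("\n".join(dictionary_lines(lexicon)))
    print(join_morphs("<s> <w> talo ssa <w> on <w> </s>".split()))
```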