I built a set of lexical and language modelling files with the web tool at "http://www.speech.cs.cmu.edu/tools/lmtool.html".
The tool gives me .dic, .lm and .sent files, and I can use them in SphinxII.
I also downloaded the statistical language modeling toolkit from "http://svr-www.eng.cam.ac.uk/~prc14/toolkit.html" and installed it on a FreeBSD system.
That toolkit gives me a .binlm file.
So what is the difference between them?
Are they both meant to be used as SphinxII language models?
Is there any documentation about the language model?
If you open up the sphinx.arg file for something like cont.exe, you will see a comment saying that LM files can be loaded as either text or binary. I believe you have made a binary LM. The only docs available are in CVS and they are very sparse, but you should see some reference to this there as well. For your info, the arg file is just a convenient way of passing command-line arguments to the fbs_init() function.
-MAX
Anonymous
-
2001-07-19
Thanks a lot.
But I still do not understand this language model very well. I was able to convert the binary model (.binlm) to an ARPA-format LM using binlm2arpa, which is also included in "The CMU-Cambridge Statistical Language Modeling Toolkit v2". The resulting ARPA-format LM looks very similar to the .lm file generated by the web-based lmtool. However, I still cannot use this ARPA-format LM in SphinxII.
Am I missing a step?
Or are these two language models fundamentally different?
Thank you very much!
I am a beginner, but I am eager to use SphinxII.
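For anyone comparing the two files: an ARPA-format LM is plain text that starts with a \data\ header listing how many n-grams of each order it contains. The sketch below is only a rough way to check whether the binlm2arpa output and the lmtool .lm file have the same shape. It is not part of SphinxII or the toolkit; the function name arpa_ngram_counts and the tiny sample model are made up for illustration, and it does not explain why SphinxII rejects a given file.

```python
import re

def arpa_ngram_counts(text):
    """Return {order: count} read from the \\data\\ header of an ARPA-format LM."""
    counts = {}
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if line == "\\data\\":
            in_data = True
            continue
        if in_data:
            m = re.match(r"ngram\s+(\d+)\s*=\s*(\d+)$", line)
            if m:
                counts[int(m.group(1))] = int(m.group(2))
            elif line.startswith("\\"):
                break  # reached the first n-gram section, e.g. \1-grams:
    return counts

# Hypothetical toy model, not output from either tool.
sample = """\\data\\
ngram 1=3
ngram 2=2

\\1-grams:
-0.4771 <s> -0.3010
-0.4771 hello -0.3010
-0.4771 </s>

\\2-grams:
-0.3010 <s> hello
-0.3010 hello </s>

\\end\\
"""

print(arpa_ngram_counts(sample))
```

Running the same function on the contents of both files and comparing the results shows whether they contain the same n-gram orders (for example, whether one stops at bigrams and the other includes trigrams) and whether the vocabulary sizes match.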