Hello. I am trying to create a language model, but I have run into a problem. The steps I followed are described below.
I downloaded cmuclmtk-0.7-win32.zip, extracted it, and created a .lm model using the following commands:
text2wfreq < file_name.txt | wfreq2vocab > file_name.vocab
text2idngram -vocab file_name.vocab -idngram file_name.idngram < file_name.txt
idngram2lm -vocab_type 0 -idngram file_name.idngram -vocab file_name.vocab -arpa file_name.lm
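For anyone who wants to repeat these steps from a script, here is a minimal sketch in Python. It only wraps the same commands shown above; the function names `build_steps` and `run_pipeline`, the `tool_dir` parameter, and the intermediate `.wfreq` file (used in place of the shell pipe) are my own assumptions, not part of CMUCLMTK.

```python
import subprocess
from contextlib import ExitStack
from pathlib import Path


def build_steps(stem):
    """Return the CMUCLMTK steps as (argv, stdin_file, stdout_file) tuples.

    The shell pipe `text2wfreq | wfreq2vocab` is split into two steps that
    pass data through a hypothetical intermediate file `<stem>.wfreq`.
    """
    txt, wfreq = f"{stem}.txt", f"{stem}.wfreq"
    vocab, idngram, lm = f"{stem}.vocab", f"{stem}.idngram", f"{stem}.lm"
    return [
        (["text2wfreq"], txt, wfreq),
        (["wfreq2vocab"], wfreq, vocab),
        (["text2idngram", "-vocab", vocab, "-idngram", idngram], txt, None),
        (["idngram2lm", "-vocab_type", "0", "-idngram", idngram,
          "-vocab", vocab, "-arpa", lm], None, None),
    ]


def run_pipeline(stem, tool_dir="."):
    """Run every step in order; stop with an exception if a tool fails."""
    for argv, stdin_path, stdout_path in build_steps(stem):
        exe = str(Path(tool_dir) / argv[0])  # tool_dir: folder with the .exe files
        with ExitStack() as stack:
            stdin = stack.enter_context(open(stdin_path, "rb")) if stdin_path else None
            stdout = stack.enter_context(open(stdout_path, "wb")) if stdout_path else None
            subprocess.run([exe, *argv[1:]], stdin=stdin, stdout=stdout, check=True)
```

Calling `run_pipeline("file_name", tool_dir=...)` with the folder that holds the extracted CMUCLMTK executables should leave `file_name.lm` next to the input text, assuming the tools behave as they do on the command line.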
The .lm file is created without any problems.
After that, I downloaded sphinxbase-5prealpha-win32.zip, extracted it, and went to the sphinxbase-5prealpha-win32\bin\Release folder. I copied my .lm model into that folder and tried to convert it to binary format using this command:
sphinx_lm_convert -i file_name.lm -o file_name.lm.bin
But I got the following error:
C:\...\sphinxbase-5prealpha-win32\bin\Release>sphinx_lm_convert -i file_name.lm -o file_name.lm.bin
INFO: cmd_ln.c(697): Parsing command line:
sphinx_lm_convert \
-i file_name.lm \
-o file_name.lm.bin
Current configuration:
[NAME] [DEFLT] [VALUE]
-case
-debug 0
-help no no
-i file_name.lm
-ifmt
-logbase 1.0001 1.000100e+000
-mmap no no
-o file_name.lm.bin
-ofmt
INFO: ngram_model_arpa.c(477): ngrams 1=4354, 2=8703, 3=13053
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(516): 4354 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(534): 8703 = #bigrams created
INFO: ngram_model_arpa.c(535): 4 = #prob2 entries
INFO: ngram_model_arpa.c(543): 4 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(556): 13053 = #trigrams created
INFO: ngram_model_arpa.c(557): 3 = #prob3 entries
ERROR: "ngram_model.c", line 181: language model file type not supported
ERROR: "sphinx_lm_convert.c", line 192: Failed to write language model in format (null) to file_name.lm.bin
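Note that the log shows the ARPA file being read completely (unigrams, bigrams and trigrams) before the failure, so the error comes from the writing step. To double-check the .lm file independently of sphinx_lm_convert, the n-gram counts in its `\data\` header can be compared with the "ngrams 1=..., 2=..., 3=..." line in the log. The sketch below is only an illustration: `read_arpa_counts` is a hypothetical helper, and the sample header is made up from the counts in the log above, not copied from the real file.

```python
import re


def read_arpa_counts(lines):
    """Return {order: count} parsed from the \\data\\ header of an ARPA file."""
    counts = {}
    in_data = False
    for raw in lines:
        line = raw.strip()
        if line == "\\data\\":
            in_data = True          # header starts here; skip anything before it
            continue
        if not in_data:
            continue
        match = re.fullmatch(r"ngram\s+(\d+)\s*=\s*(\d+)", line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
        elif line.startswith("\\"):
            break                   # next section (e.g. \1-grams:) ends the header
    return counts


# Illustrative header built from the counts printed in the log.
sample = """\\data\\
ngram 1=4354
ngram 2=8703
ngram 3=13053

\\1-grams:
"""
counts = read_arpa_counts(sample.splitlines())
```

For a real file, the same function can be given an open file object, for example `read_arpa_counts(open("file_name.lm", encoding="utf-8", errors="replace"))`; the encoding is an assumption about how the model was written.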
Where is my mistake?
Please don't ignore this post. Is it a bug, or have I made a mistake somewhere?
The precompiled binaries are too old and do not support the bin trie format. You can compile the latest sources yourself or wait until I upload updated binaries.
I have updated the precompiled version; it should work fine now.
Where can I get that precompiled version?
Last edit: Vel 2015-12-10