I have a problem with sphinx_lm_convert.
I cannot convert a DMP language model file to an ARPA model. I tried, but it
just hangs after writing the 1-grams.
Anyway, I want to write a converter, but I can't understand the DMP file format.
I want to read the DMP file, but I don't know how. Any thoughts?
Amin
I assume you have followed the LM building tutorial at
http://cmusphinx.sourceforge.net/wiki/tutoriallm. Note
that sphinx_lm_convert is included in sphinxbase, so I would try recompiling
and rebuilding sphinxbase first. Do you have any problem converting from ARPA
format to DMP format?
I'm sorry to hear it didn't work. I just tried the HUB4 trigram LM in DMP
format, which I think is pretty large, on my end; it converted to a 612 MB
ARPA file without problems.
I tried HUB4. The good news is that it finishes the 2-grams.
The bad news is that it hangs in the middle of the 3-grams. I don't know
what's wrong and certainly need some help!
I assume you have followed the LM building tutorial at
http://cmusphinx.sourceforge.net/wiki/tutoriallm. Note
that sphinx_lm_convert is included in sphinxbase, so I would try recompiling
and rebuilding sphinxbase first. Do you have any problem converting from ARPA
format to DMP format?
I did rebuild sphinxbase. I think the problem comes from the size of my DMP
file (it's wsj.5000).
I think the part where it hangs is ngram_model_mgrams.
Anyway, I don't have an ARPA file that big; for small ARPA files it works fine.
I'm sorry to hear it didn't work. I just tried the HUB4 trigram LM in DMP
format, which I think is pretty large, on my end; it converted to a 612 MB
ARPA file without problems.
Well, that's good news.
I will try that to see whether it works on my system too. Maybe the problem
is with the WSJ DMP file.
Thanks
I tried HUB4. The good news is that it finishes the 2-grams.
The bad news is that it hangs in the middle of the 3-grams. I don't know
what's wrong and certainly need some help!
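To see exactly how far a conversion got before it stalled, the partially written ARPA file can be compared against its own header. The following is a minimal sketch, assuming the output follows the usual ARPA text layout (a `\data\` header with `ngram N=count` lines, then `\N-grams:` sections); the function name `arpa_progress` is made up for this example and is not part of sphinxbase.

```python
import re
from collections import Counter


def arpa_progress(lines):
    """Return {order: (declared, written)} for a possibly truncated ARPA file."""
    declared = {}        # counts announced in the \data\ header
    written = Counter()  # entries actually found in each \N-grams: section
    current = None       # order of the section currently being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r"ngram\s+(\d+)\s*=\s*(\d+)", line)
        section = re.fullmatch(r"\\(\d+)-grams:", line)
        if header:
            declared[int(header.group(1))] = int(header.group(2))
        elif section:
            current = int(section.group(1))
        elif line.startswith("\\"):
            current = None  # \data\ or \end\ marker
        elif current is not None:
            written[current] += 1
    return {n: (declared[n], written[n]) for n in sorted(declared)}


# Tiny hand-made sample: the 2-gram section stops one entry short.
sample = [
    "\\data\\", "ngram 1=3", "ngram 2=2", "",
    "\\1-grams:", "-1.0 a -0.5", "-1.0 b -0.5", "-1.0 c", "",
    "\\2-grams:", "-0.3 a b",
]
print(arpa_progress(sample))  # {1: (3, 3), 2: (2, 1)}
```

On a real file, passing an open file object works the same way, for example `arpa_progress(open("myfile.lm", errors="replace"))`. Keep in mind that the converter may buffer its output, so the file on disk can lag behind what has actually been processed.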
What arguments did you use? I usually use
sphinx_lm_convert -i myfile.dmp -ifmt dmp -o myfile.lm -ofmt arpa
Did you use the same?
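When a run may hang, it can help to wrap the command in a script that gives up after a time limit. Here is a rough sketch using only the flags quoted above (`-i`, `-ifmt`, `-o`, `-ofmt`); the helper names `build_convert_command` and `run_with_timeout` and the one-hour limit are assumptions for illustration, and `sphinx_lm_convert` is assumed to be on the PATH.

```python
import subprocess


def build_convert_command(src, dst, ifmt="dmp", ofmt="arpa",
                          exe="sphinx_lm_convert"):
    """Assemble the argument list for the conversion command shown above."""
    return [exe, "-i", src, "-ifmt", ifmt, "-o", dst, "-ofmt", ofmt]


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None


if __name__ == "__main__":
    cmd = build_convert_command("myfile.dmp", "myfile.lm")
    print(" ".join(cmd))
    # Uncomment to actually run the conversion with a one-hour limit:
    # print(run_with_timeout(cmd, 3600))
```

A `None` result only says the limit was reached; it cannot tell a genuinely stuck process from one that is just slow on a large model.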
Yes.
Can you inspect the memory/CPU usage while it is hanging? You might also try
it on a different computer if possible.
CPU usage remains high (100%) while it hangs.
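Constant 100% CPU means the process is still computing rather than waiting on something, so one extra check is whether the output file is still growing. This is only a sketch under that assumption; `watch_growth` and `is_stalled` are hypothetical helper names, and because output may be buffered, a flat size over a short window is a hint, not proof of a hang.

```python
import os
import time


def watch_growth(path, polls=5, interval_s=2.0):
    """Return the file sizes (in bytes) observed over `polls` checks."""
    sizes = []
    for i in range(polls):
        # A file that does not exist yet is counted as size 0.
        sizes.append(os.path.getsize(path) if os.path.exists(path) else 0)
        if i < polls - 1:
            time.sleep(interval_s)
    return sizes


def is_stalled(sizes):
    """True if the size never changed across the observed checks."""
    return len(set(sizes)) == 1


if __name__ == "__main__":
    observed = watch_growth("myfile.lm", polls=5, interval_s=2.0)
    print(observed, "stalled" if is_stalled(observed) else "still growing")
```

Run it in a second terminal while the conversion is in progress, pointing it at the output file given to `-o`.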
Problem solved.
I tried it using Cygwin, and it worked fine.
awesome :)