Hello,
I am trying to combine two language models with lm_combine:
cmuclmtk/bin/lm_combine -lm1 $lm/lm1.arpa -lm2 $lm/lm2.arpa -lm $lm/test.arpa
and I get the error: rr_iopen: None of '' '.Z' or '.gz' exist
Am I doing something wrong or am I misunderstanding?
Thanks in advance,
Leila
That error message isn't very helpful, is it? We'll fix that at some point...
What you're missing is a weights file. It has the format:
lm1.arpa WEIGHT1
lm2.arpa WEIGHT2
The first field is the path to the language model, while the second is the interpolation weight to give it (the weights should sum to one). To get the optimum interpolation weights, you can use the 'interpolate' program with some held-out data. To do that, you first have to use 'evallm' to generate probability streams (with 'interpolate -probs') from the held-out data, one for each language model.
Or, you can use the Perl script 'ngram_interp', which does this for you.
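For anyone curious what the weight estimation amounts to, here is a rough Python sketch. It is not the toolkit's code: it assumes each probability stream has already been read into a list of per-word probabilities, and it uses plain EM for linear interpolation, which may differ in detail from what 'interpolate' does. The function names and the numbers in the demo are made up for illustration.

```python
def estimate_weights(streams, iterations=50):
    """Estimate linear interpolation weights with EM.

    streams[i][t] is the probability that model i assigns to the
    t-th word of the held-out data (hypothetical in-memory form of
    the probability streams).
    """
    n_models = len(streams)
    n_words = len(streams[0]) if streams else 0
    if n_words == 0 or any(len(s) != n_words for s in streams):
        raise ValueError("need non-empty streams of equal length")

    # Start from uniform weights.
    weights = [1.0 / n_models] * n_models
    for _ in range(iterations):
        counts = [0.0] * n_models
        for t in range(n_words):
            mix = sum(w * s[t] for w, s in zip(weights, streams))
            if mix == 0.0:
                continue  # no model gives this word any probability
            # E-step: share of this word credited to each model.
            for i in range(n_models):
                counts[i] += weights[i] * streams[i][t] / mix
        total = sum(counts)
        if total == 0.0:
            break  # nothing to learn from; keep the current weights
        # M-step: renormalise so the weights sum to one.
        weights = [c / total for c in counts]
    return weights


def format_weights_file(model_paths, weights):
    """Return weights-file text: one 'PATH WEIGHT' pair per line."""
    if len(model_paths) != len(weights):
        raise ValueError("one weight per language model is required")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("interpolation weights should sum to one")
    return "".join(f"{p} {w:.6f}\n" for p, w in zip(model_paths, weights))


if __name__ == "__main__":
    # Made-up per-word probabilities for two models on four words.
    demo_streams = [
        [0.20, 0.10, 0.40, 0.05],
        [0.10, 0.30, 0.05, 0.10],
    ]
    demo_weights = estimate_weights(demo_streams)
    print(format_weights_file(["lm1.arpa", "lm2.arpa"], demo_weights), end="")
```

The printed text has the same two-column layout as the weights file described above, so it could be saved to a file and used as the weights file for lm_combine.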