So I have pocketsphinx set up, and it works fine for the goforward.raw file. Here is my setup line:
config = cmd_ln_init(NULL, ps_args(), TRUE,
                     "-hmm", MODELDIR "/en-us/en-us",
                     "-lm", MODELDIR "/en-us/en-us.lm.dmp",
                     "-dict", MODELDIR "/en-us/cmudict-en-us.dict",
                     NULL);
I tried it on this file (https://dl.dropboxusercontent.com/u/3865748/sx412.raw) and the result should be something like:
2130 8146 gwen
8146 16919 planted
16919 22433 green
22433 29409 beans
29409 31289 in
31289 33754 her
33754 42722 vegetable
42722 49406 garden
I don't get anything like that, but the speaker is British, so I assumed that was why.
Now for this file (https://dl.dropboxusercontent.com/u/3865748/214.raw) the speaker is American and it should be:
"We'll plant roses this spring"
But pocketsphinx gets:
Recognized: if and moonlight in need who him cash move
INFO: ngram_search.c(1030): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(1033): bestpath 0.00 wall 0.000 xRT
'<s> 8.250 8.270 0.999000
if 8.280 10.220 0.964733
<sil> 10.230 10.400 0.723802
and 10.410 11.130 0.038393
moonlight 11.140 11.890 0.013351
<sil> 11.900 11.970 0.793559
in 11.980 12.610 0.965119
need 12.620 12.910 0.083027
who 12.920 13.200 0.178744
<sil> 13.210 13.230 0.695975
him 13.240 14.010 0.804910
cash 14.020 14.580 0.139551
<sil> 14.590 14.920 0.991535
move 14.930 16.430 0.488369
<sil> 16.440 16.970 0.997603
</s> 16.980 17.280 1.000000'
Any idea what I'm doing wrong? I didn't expect it to be perfect, as it has no real context to go on, but something seems to be going wrong.
The input file format must be 16 kHz, 16-bit mono. Your raw file 214.raw is sampled at 48 kHz.
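The segment times in the log above run to about 17 seconds for a five-word sentence, which would be consistent with 48 kHz audio being read as if it were 16 kHz (everything stretched by a factor of three). As a rough illustration of the conversion that is needed, here is a minimal sketch in C that reduces 48 kHz samples to 16 kHz by averaging each group of three. The function name downsample_48k_to_16k is hypothetical and not part of pocketsphinx, and the sketch assumes the data is already 16-bit signed mono. Averaging is only a crude low-pass filter, so a dedicated resampling tool will likely give better recognition results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of pocketsphinx.
 * Assumes 16-bit signed mono PCM sampled at 48 kHz.
 * Averages each group of 3 input samples into 1 output sample,
 * giving 16 kHz output. Leftover samples at the end (fewer than 3)
 * are dropped. Returns the number of samples written to out. */
size_t downsample_48k_to_16k(const int16_t *in, size_t n_in, int16_t *out)
{
    size_t n_out = n_in / 3;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n_out; i++) {
        /* Sum in 32 bits so three 16-bit samples cannot overflow. */
        int32_t sum = (int32_t)in[3 * i] + in[3 * i + 1] + in[3 * i + 2];
        out[i] = (int16_t)(sum / 3);
    }
    return n_out;
}
```

The output buffer needs room for at least n_in / 3 samples. The converted samples, rather than the original 48 kHz data, are what should be passed to the decoder.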
Last edit: Benjamin Gorman 2015-07-22