I have got pocketsphinx 0.5 running on Windows Mobile 5.0. However, I'm getting only one 'hypothesis' no matter how long I speak into the PDA. In fact, I'm only getting the words "THE", "SAY", "TURN" and "EIGHT", although I didn't say any of these during the 'utterance_loop'.
BTW:
- I am using the code from continuous.c
- Sampling rate: 16000
- HMM (wsj1) and LM (turtle), which come with version 0.5
I noticed that the number of recognised words on my PDA is far smaller than when I run pocketsphinx_continuous.exe in cmd. Here is the output (if this is not too much info):
INFO: ....\src\libsphinxbase\feat\cmn_prior.c(122): cmn_prior_update: from < 266240.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: ....\src\libsphinxbase\feat\cmn_prior.c(140): cmn_prior_update: to < 48.43 -5.48 -1.47 -2.56 -1.41 -3.03 -0.77 -1.19 0.16 -0.49 1.40 0.51 0.90 >
INFO: ......\src\libpocketsphinx\ngram_search_fwdtree.c(1471): 403 words recognized (7/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdtree.c(1473): 49502 senones evaluated (812/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdtree.c(1475): 15112 channels searched (247/fr), 3819 1st, 4734 last
INFO: ......\src\libpocketsphinx\ngram_search_fwdtree.c(1479): 697 words for which last channels evaluated (11/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdtree.c(1482): 1508 candidate words for entering last phone (24/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdflat.c(829): 296 words recognized (5/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdflat.c(831): 9212 senones evaluated (151/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdflat.c(833): 3519 channels searched (57/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdflat.c(835): 454 words searched (7/fr)
INFO: ......\src\libpocketsphinx\ngram_search_fwdflat.c(837): 87 word transitions (1/fr)
WARNING: "......\src\libpocketsphinx\ngram_search.c", line 965: </s> not found in last frame, using EIGHT instead
INFO: ......\src\libpocketsphinx\ngram_search.c(1007): lattice start node <s>.0 end node EIGHT.3
I have been debugging the code on my PDA for some time now and can't figure out why I am not getting good recognition. I need someone to advise me on how to proceed from here. What sampling rate should I use? Do I need to create a new LM based on the PDA? Why am I getting only one word printed out every time?
Sorry for so many questions. I just wanted to bring everything to the table.
Many thanks
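As a rough cross-check on the log above: each statistics line gives a total and a per-frame figure, so dividing one by the other estimates how many frames the decoder actually saw. The sketch below is only illustrative; the helper names are made up, and the 100 frames/s rate is an assumption (the usual default), not something the log states. With the senone line (49502 total, 812/fr) it gives about 61 frames, or roughly 0.6 s of audio, which would be consistent with only a short piece of speech reaching the decoder.

```c
/* Hypothetical helpers for reading the decoder statistics by hand. */

/* Estimate the frame count from a "total (N/fr)" pair in the log.
 * The per-frame figure in the log is a whole number, so this is approximate. */
static double frames_from_stats(double total, double per_frame)
{
    return per_frame > 0.0 ? total / per_frame : 0.0;
}

/* Convert a frame count to seconds for a given frame rate.
 * Assumption: 100 frames per second unless configured otherwise. */
static double frames_to_seconds(double frames, double frame_rate)
{
    return frame_rate > 0.0 ? frames / frame_rate : 0.0;
}
```

For example, frames_to_seconds(frames_from_stats(49502.0, 812.0), 100.0) comes out at about 0.61.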
Thanks for the assurance that I'm being watched. I've not met any major issues since getting this to work. I'm now trying to make a more robust continuous (listen/decode) loop. Do you have any tips on how to manage the 4 sec recording time imposed by pocketsphinx? How can I adjust this, making it shorter or longer?
Thanks
Drew
Hmm, what is the 4 second recording time you speak of? You should be able to feed data of any length (within reason) to pocketsphinx. However the continuous audio segmentation code might be cutting you off for whatever reason.
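To illustrate feeding audio of arbitrary length, here is a minimal sketch of a loop that hands a buffer to a decoder in fixed-size blocks. Everything in it is hypothetical: feed_fn stands in for whatever call passes raw samples to pocketsphinx, and count_samples is just a dummy callback so the loop can be tried on its own. It is not the code from continuous.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: passes one block of 16-bit samples to the
 * decoder. Returns a negative value on failure. */
typedef int (*feed_fn)(void *user, const int16_t *samples, size_t n_samples);

/* Dummy callback for experimenting: only counts the samples it receives. */
static int count_samples(void *user, const int16_t *samples, size_t n_samples)
{
    (void)samples;
    *(size_t *)user += n_samples;
    return 0;
}

/* Feed a buffer of any length in blocks of at most `block` samples.
 * Returns the number of blocks delivered, or -1 on bad arguments or if
 * the callback reports a failure. */
static long feed_in_blocks(const int16_t *audio, size_t total,
                           size_t block, feed_fn feed, void *user)
{
    long blocks = 0;
    size_t pos = 0;

    if (block == 0 || feed == NULL)
        return -1;
    while (pos < total) {
        size_t n = (total - pos < block) ? total - pos : block;
        if (feed(user, audio + pos, n) < 0)
            return -1;
        pos += n;
        ++blocks;
    }
    return blocks;
}
```

The loop itself places no limit on the total length; any cut-off would have to come from the code that decides when an utterance starts and ends.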
Also, great work - could you post the details of what you had to do to get it to compile for Windows Mobile? I'm preparing a 0.5.1 release to fix various issues that people had, and it would be great to have WinCE support active in that again. I have Visual Studio 2005 and the WinMo SDK but I couldn't figure out how to add the appropriate configurations to the project files, so I haven't tested it (also I don't have a WinMo device).
Of course I'd really like to have Symbian and iPhone support at the same time :) We do actually have PocketSphinx running on the iPhone, but it requires a few modifications. S60 support is just a matter of figuring out the build system, actually.
This line seems quite suspect:
INFO: ....\src\libsphinxbase\feat\cmn_prior.c(122): cmn_prior_update: from < 266240.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: ....\src\libsphinxbase\feat\cmn_prior.c(140): cmn_prior_update: to < 48.43 -5.48 -1.47 -2.56 -1.41 -3.03 -0.77 -1.19 0.16 -0.49 1.40 0.51 0.90 >
The second set of values looks fine, which means that the acoustic front-end is probably working okay, but the first one is very wrong - note the 266240.
I think that maybe the configuration parsing code has a bug on WinCE which is causing the -cmninit parameter to be set to something bogus.
Sampling rate of 16000 is fine but 8000 will also work.
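One way to catch a bogus initial mean like the 266240 above while debugging is a small sanity check on the vector before it is used. The sketch below is an assumption-laden illustration: cmn_vector_plausible is a made-up helper, not part of sphinxbase, and the limit passed in is an arbitrary threshold chosen by the caller (the "to" vector in the log stays well under 100 in magnitude).

```c
#include <stddef.h>

/* Hypothetical debugging helper (not a sphinxbase function).
 * Returns 1 if every component lies within [-limit, +limit], else 0.
 * NaN components fail both comparisons and are therefore rejected. */
static int cmn_vector_plausible(const float *v, size_t n, float limit)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        if (!(v[i] <= limit && v[i] >= -limit))
            return 0;
    }
    return 1;
}
```

With a limit of, say, 1000, the "from" vector in the log would be flagged and the "to" vector would pass.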
"...acoustic front-end is probably working okay, but the first one is very wrong - note the 266240."
Thanks David for the tips.
I will investigate where the 266240 is coming from in the first place. I don't have a good idea of where to start, but I will give it a go.
On the other hand, there are a few functions in sphinxbase which I have been rewriting for the WinCE API. Any tips on how to handle getenv(), isatty() and errno.h? I haven't given these a great deal of attention because I wasn't anticipating any unusual errors. Or is it a misconception that the code should work as well on WinCE as it does on Win32? Pocketsphinx_continuous.exe works OK in the Win32 console, so shouldn't it on WinCE too? My reasoning is that you've already adapted the code by "ifdef'ing" it with the WIN32/_WIN32_WCE directives.
I wish I could find someone who has managed to run pocketsphinx 0.5 better than I have, for comparison. Or is it too soon since its release?
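For getenv() and isatty(), one possible approach is a thin compatibility layer selected by the same kind of #ifdef. The sketch below is only an assumption about how this could be done, not what sphinxbase does: compat_getenv and compat_isatty are made-up names, and the WinCE branch simply pretends there is no environment and no terminal. errno is not covered here.

```c
#include <stddef.h>

#if defined(_WIN32_WCE)
/* Assumption: no environment variables and no console on the device,
 * so report "not set" and "not a terminal". */
static const char *compat_getenv(const char *name)
{
    (void)name;
    return NULL;
}
static int compat_isatty(int fd)
{
    (void)fd;
    return 0;
}
#elif defined(_WIN32)
#include <stdlib.h>
#include <io.h>
/* Desktop Windows: forward to the C runtime. */
static const char *compat_getenv(const char *name) { return getenv(name); }
static int compat_isatty(int fd) { return _isatty(fd) != 0; }
#else
#include <stdlib.h>
#include <unistd.h>
/* POSIX systems: forward to the standard functions. */
static const char *compat_getenv(const char *name) { return getenv(name); }
static int compat_isatty(int fd) { return isatty(fd) != 0; }
#endif
```

Callers would then use compat_getenv() and compat_isatty() everywhere and treat a NULL or 0 result as the normal case on the device.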
Hi,
Just to update everyone.
I've finally managed to fix my problems. I am now getting pretty good recognition. In fact, it is almost as good as running on the PC. So, WinCE doesn't suck after all.
I would be interested to hear from anyone about any specific issues that I should look out for when using pocketsphinx 0.5 on a PPC. I wonder if David will be interested to hear about this progress on WinCE, or whether he is way ahead of everyone :). Any advice will be appreciated.
Didn't have to change anything 'much' - "Sampling rate of 16000 is fine but 8000 will also work."
> I wonder if David will be interested to hear about this progress on WinCE or he is way ahead of everyone
We are all listening