Good morning, I'm a researcher at the Visilab Research Center,
University of Messina, Italy.
We're working on a port of PocketSphinx to Nanodesktop
for the PSP platform.
We've recompiled your library and adapted it
to run under nd. Now we're running some tests: the first is to
launch the following command line:
pocketsphinx_continuous -fwdflat no -bestpath no -lm ms0:/ndsphinxpackage/model/lm/tidigits/tidigits.lm -dict ms0:/ndsphinxpackage/model/lm/tidigits/tidigits.dic -hmm ms0:/ndsphinxpackage/model/hmm/tidigits -samprate 11025 -nfft 2048 -wlen 0.0250 -mmap no
We expected that if we say the words one, two, three into the
PSP microphone, the system would recognize them. Is that
right?
Instead, the system doesn't recognize any word. I'm investigating whether the trouble is in the routines that manage
the interface with the nd microphone API or
in the recognition engine.
Is there an option that feeds input to the engine
from a wave file instead of from a microphone? If such an
option exists, I could verify whether the trouble is in my
microphone routines or in the engine.
How does PocketSphinx manage different sample rates?
For example, if the microphone supports only
44100 Hz and the user passes -samprate 11025, does the library perform the downsampling automatically,
or must this be done by the ad_ routines?
Here is the video that shows what happens:
http://rapidshare.com/files/164815686/PocketSphinx_Log_1.avi.html
Thanks again...
It's really hard to say anything sensible from this video, though the (null) looks suspicious, of course. Again, try batch recognition first. The ctl file should be the list of files to decode, one per line, without extension. Look in the sources for examples; there are many of them.
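To illustrate the control-file format described above, here is a minimal C sketch that turns a list of recording names into ctl entries by stripping the extension and writing one entry per line. The helper names ctl_entry and write_ctl are hypothetical and not part of PocketSphinx, and the sketch assumes that entries are resolved relative to the directory given with -cepdir, with the -cepext suffix appended.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy `filename` into `out` without its final
 * extension, e.g. "one.wav" -> "one". A dot only counts as an extension
 * separator if it sits inside the last path component and is not that
 * component's first character.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int ctl_entry(const char *filename, char *out, size_t outsz)
{
    const char *slash = strrchr(filename, '/');
    const char *dot = strrchr(filename, '.');
    size_t len = strlen(filename);

    if (dot != NULL && dot != filename &&
        (slash == NULL || dot > slash + 1))
        len = (size_t)(dot - filename);
    if (len + 1 > outsz)
        return -1;
    memcpy(out, filename, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: write one ctl entry per line to `path`.
 * Names that do not fit the internal buffer are skipped.
 * Returns the number of entries written, or -1 on I/O failure. */
int write_ctl(const char *path, const char *const *wavs, int n)
{
    char entry[256];
    int i, written = 0;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        if (ctl_entry(wavs[i], entry, sizeof entry) == 0) {
            fprintf(fp, "%s\n", entry);
            written++;
        }
    }
    if (fclose(fp) != 0)
        return -1;
    return written;
}
```

For recordings test/one.wav and test/two.wav decoded with -cepdir test -cepext .wav, passing the names one.wav and two.wav to write_ctl would produce a test.ctl containing the two lines "one" and "two".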
> -samprate 11025 -nfft 2048 -wlen 0.0250
There is no resampling from 44.1k or 11k; you need to do this yourself. Also, don't change nfft and wlen; they don't work this way.
> Is there an option that feeds input to the engine from a wave file instead of from a microphone?
Did you notice pocketsphinx_batch?
> does the library perform the downsampling automatically,
No, you need to resample to 8k yourself.
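Since the decoder does no resampling itself, the conversion has to happen before the audio reaches it. The following is a rough sketch in C, not PocketSphinx code: it downsamples mono 16-bit PCM by averaging the input samples that fall into each output sample's time window, which acts as a crude low-pass filter. The function name downsample_avg is hypothetical, and a real port would probably want a proper anti-aliasing filter instead of a plain average.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: downsample mono 16-bit PCM from in_rate to
 * out_rate (out_rate must not exceed in_rate). Output sample i is the
 * mean of the input samples in [i*in_rate/out_rate, (i+1)*in_rate/out_rate).
 * Returns the number of samples written to `out` (at most out_cap),
 * or 0 if the rates are invalid. */
size_t downsample_avg(const int16_t *in, size_t n_in,
                      unsigned in_rate, unsigned out_rate,
                      int16_t *out, size_t out_cap)
{
    size_t n_out, i, j;

    if (in_rate == 0 || out_rate == 0 || out_rate > in_rate)
        return 0;

    /* Number of complete output samples the input can produce. */
    n_out = (size_t)(((uint64_t)n_in * out_rate) / in_rate);
    if (n_out > out_cap)
        n_out = out_cap;

    for (i = 0; i < n_out; i++) {
        size_t start = (size_t)(((uint64_t)i * in_rate) / out_rate);
        size_t end = (size_t)(((uint64_t)(i + 1) * in_rate) / out_rate);
        int64_t sum = 0;

        if (end > n_in)     /* defensive clamp */
            end = n_in;
        for (j = start; j < end; j++)
            sum += in[j];
        out[i] = (int16_t)(sum / (int64_t)(end - start));
    }
    return n_out;
}
```

With a 44100 Hz microphone, calling downsample_avg(buf, n, 44100, 8000, out, cap) on each captured block before handing it to the decoder would yield 8 kHz audio; a target of 16000 works the same way.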
Thanks for your answers.
There are some things I didn't understand:
a) I used -nfft 2048 -wlen 0.0250 because the program gave
an error with the values from the predefined examples (the example used a sample rate of 8000 Hz,
which my microphone doesn't support).
Are you saying that the program cannot work with -nfft 2048 and -wlen 0.0250?
b) So Sphinx can work only at 8 kHz?
It cannot work at 11025 Hz?
> Are you saying that the program cannot work with -nfft 2048 and -wlen 0.0250?
It can, but this nfft value is not correct and can reduce recognition accuracy. The proper values are taken from the feat.params file in the model folder.
> It cannot work at 11025 Hz?
The decoder can only handle 16 or 8 kHz.
Thanks for your help.
I've written a routine that does the downsampling in
software. I've added to the code a call to
ndHAL_SND_WriteToChannel so that I can hear exactly
what is passed to the decoder.
It all seems OK.
But the recognition is unsuccessful. The routine returns a score
of -1 (strange).
Can I publish a video that shows you what happens?
For my lab a port of Sphinx would be very useful,
but we aren't able to get it working.
Please help us.
> Can I publish a video that shows you what happens?
A video is useless. Publish the recognition log and the recording you are trying to recognize.
Which option can I pass to pocketsphinx_continuous to obtain the log?
The -logfn option doesn't seem to work for
pocketsphinx_continuous (only for batch).
And if I want to decode a .wav file, can you
write me a command line that I can pass to
the nd pseudoExec routine?
> Which option can I pass to pocketsphinx_continuous to obtain the log?
The log is dumped to the console.
> And if I want to decode a .wav file, can you write me a command line that I can pass to the nd pseudoExec routine?
I don't know what pseudoExec is. To run it on the host you can use
pocketsphinx_batch -adcin yes \
-ctl test.ctl \
-cepdir test \
-cepext .wav \
-samprate 16000 \
-lm ${LMFILE} \
-dict ${DICT} \
-hmm ${HMM}
I suggest you try the same on the host first before moving to the PSP.
We are working hard on it, but we need help.
We desperately need help...
You wrote an option
-ctl test.ctl
for pocketsphinx_batch. Can you tell me what
the content of this file is?
What do I have to write in this file?
A second thing: there is trouble with the log. Nanodesktop
doesn't yet support redirecting (2>) the terminal output to a file,
so we have tried to use a native Sphinx function to redirect the
stderr stream to a file.
But for some strange reason (I think the trouble is in
the fflush routine of our OS), the log written to
disk is not complete.
So I'm compressing a video that shows the content of the
screen when the system runs the pocketsphinx_continuous
program.
I know that a video isn't as useful as a log, but PLEASE
take a moment to look at it.
In a few minutes I'll post the link to the video.
Thank you very much for your help.
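One thing that may be worth trying for the incomplete log, assuming Nanodesktop's C library provides the standard freopen and setvbuf calls: redirect stderr to a file and make the stream unbuffered, so that each message is written out at once instead of waiting for a later fflush. The function name below is hypothetical, and whether this cures the truncation on nd is an untested assumption.

```c
#include <stdio.h>

/* Hypothetical helper: send everything written to stderr into `path`
 * and disable stdio buffering on the stream, so no log text is left
 * sitting in a buffer if the program stops unexpectedly.
 * Returns 0 on success, -1 on failure. */
int redirect_stderr_unbuffered(const char *path)
{
    /* Reopen stderr on the given file, truncating any old content. */
    if (freopen(path, "w", stderr) == NULL)
        return -1;

    /* setvbuf has to be called before any other operation on the
     * freshly reopened stream; _IONBF selects unbuffered mode. */
    if (setvbuf(stderr, NULL, _IONBF, 0) != 0)
        return -1;

    return 0;
}
```

The call would go at the very start of the program, before the decoder is initialized, with a path on the memory stick chosen by the port.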