CMU Sphinx / Forums / Help: HELP: PocketSphinx under Nanodesktop

pegasus2000 - 2008-11-15

Good morning, I'm a researcher at the Visilab Research Center,
University of Messina, Italy.

We're working on a port of PocketSphinx to Nanodesktop
for the PSP platform.

We've recompiled your library and adapted it in order
to run under nd. Now we're running some tests: the first is to
launch the following command line

pocketsphinx_continuous -fwdflat no -bestpath no -lm ms0:/ndsphinxpackage/model/lm/tidigits/tidigits.lm -dict ms0:/ndsphinxpackage/model/lm/tidigits/tidigits.dic -hmm ms0:/ndsphinxpackage/model/hmm/tidigits -samprate 11025 -nfft 2048 -wlen 0.0250 -mmap no

We expected that if we say the words one, two, three into the
PSP microphone, the system would recognize them. Is that
right?

Instead, the system doesn't recognize any word. I'm investigating whether the trouble is in the routines that manage
the interconnection with the nd microphone API, or whether it
is in the recognition engine.

Is there an option that feeds input to the engine
from a wave file instead of from a microphone? If this
option exists, I could verify whether the trouble is in my routines
for the microphone or in the engine.

How does PocketSphinx manage different sample rates?

For example, if the microphone supports only a frequency of
44100 Hz, and the user passes the option -samprate 11025, does the library perform the downsampling automatically,
or must this be done by the ad_ routines?

- pegasus2000 - 2008-11-18
  
  Here is the video that shows what happens:
  
  http://rapidshare.com/files/164815686/PocketSphinx_Log_1.avi.html
  
  Thanks again...
  
  - Nickolay V. Shmyrev - 2008-11-18
    
    It's really hard to say anything sensible from this video, though the (null) looks suspicious of course. Again, try batch recognition first. The ctl file should be the list of files to decode, one per line, without extension. Look in the sources for examples, there are many of them.
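
The description of the ctl file above can be sketched as a small host-side helper. This is only an illustration in Python, not part of PocketSphinx: the function name `write_ctl` and the directory layout are assumptions. It lists every `.wav` file under a directory, one entry per line, with the extension removed, which is the format described in this reply.

```python
from pathlib import Path


def write_ctl(wav_dir, ctl_path, ext=".wav"):
    """Write a control file listing the audio files under wav_dir.

    Each line is a file path relative to wav_dir with the extension
    removed, one entry per line. Returns the list of entries written.
    """
    wav_dir = Path(wav_dir)
    entries = sorted(
        p.relative_to(wav_dir).with_suffix("").as_posix()
        for p in wav_dir.rglob("*" + ext)
    )
    # One utterance id per line, each terminated by a newline.
    Path(ctl_path).write_text("".join(e + "\n" for e in entries))
    return entries
```

With a directory `test` containing `one.wav` and `two.wav`, calling `write_ctl("test", "test.ctl")` writes the two lines `one` and `two`, which would pair with `-cepdir test -cepext .wav` on the batch command line.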
    
- Nickolay V. Shmyrev - 2008-11-15
  
  > -samprate 11025 -nfft 2048 -wlen 0.0250
  
  There is no built-in resampling from 44.1 kHz or 11 kHz. You need to do this yourself. Also, don't change nfft and wlen, they don't work this way.
  
  > Is there an option that feeds input to the engine
  from a wave file instead of from a microphone?
  
  Did you notice pocketsphinx_batch?
  
  > does the library perform the downsampling automatically,
  
  No, you need to resample to 8 kHz yourself.
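
Since the decoder does no rate conversion, the capture routine has to hand it audio that is already at the model's rate. Below is a minimal sketch of one way to do that, written in Python for readability rather than in the C of the ad_ routines. The function name `resample` and the `half_width` parameter are illustrative assumptions, not PocketSphinx API. It low-pass filters and resamples in one step with a Hann-windowed sinc kernel, so that content above the new Nyquist frequency is attenuated instead of aliasing into the speech band.

```python
import math


def resample(samples, src_rate, dst_rate, half_width=16):
    """Band-limited resampling with a Hann-windowed sinc kernel.

    samples   : sequence of numbers (e.g. floats or 16-bit PCM values)
    src_rate  : sample rate of the input, in Hz (integer)
    dst_rate  : desired sample rate of the output, in Hz (integer)
    half_width: kernel half-width, in zero crossings of the sinc
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    # Cutoff relative to the source rate: never above the lower Nyquist.
    cutoff = min(1.0, dst_rate / src_rate)
    # Kernel support in source samples; it widens when downsampling.
    support = half_width / cutoff
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for n in range(n_out):
        t = n * src_rate / dst_rate  # output position in source samples
        lo = max(0, math.ceil(t - support))
        hi = min(len(samples) - 1, math.floor(t + support))
        acc = 0.0
        for k in range(lo, hi + 1):
            x = (t - k) * cutoff
            sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
            window = 0.5 * (1.0 + math.cos(math.pi * (t - k) / support))
            acc += samples[k] * cutoff * sinc * window
        out.append(acc)
    return out
```

For example, `resample(block, 44100, 16000)` turns one block captured at 44100 Hz into a block at 16000 Hz. The output values are floats, so they would still need rounding and clipping to 16-bit integers before being passed on.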
  
  - pegasus2000 - 2008-11-16
    
    Thanks for your answers.
    
    There are some things I didn't understand:
    
    a) I used -nfft 2048 -wlen 0.0250 because the program gave
    an error with the values provided by the predefined examples (the example used a frequency of 8000 Hz,
    but my microphone doesn't support it).
    
    Are you saying that the program cannot work with -nfft 2048 and -wlen 0.0250?
    
    b) So Sphinx can work only at a frequency of 8 kHz?
    It cannot work at 11025 Hz?
    
    - Nickolay V. Shmyrev - 2008-11-16
      
      > Are you saying that the program cannot work with -nfft 2048 and -wlen 0.0250?
      
      It can, but this nfft value is not correct and can reduce the accuracy of the recognition. The proper values are taken from the feat.params file in the model folder.
      
      > It cannot work at 11025 Hz?
      
      The decoder can only handle 16 kHz or 8 kHz.
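
To see how the sample rate, -wlen and -nfft interact, the arithmetic can be sketched as follows. This assumes that the front end needs the FFT size to be a power of two at least as large as the number of samples in one analysis window (-wlen seconds at -samprate Hz). The helper name `min_nfft` is made up for this illustration and is not part of the library.

```python
def min_nfft(samprate, wlen):
    """Return (window_samples, nfft) for one analysis window.

    window_samples is the number of samples in wlen seconds of audio;
    nfft is the smallest power of two that is >= window_samples.
    """
    window_samples = int(samprate * wlen)
    nfft = 1
    while nfft < window_samples:
        nfft *= 2
    return window_samples, nfft


if __name__ == "__main__":
    # Compare the rates mentioned in this thread for a 25 ms window.
    for rate in (16000, 11025, 44100):
        print(rate, min_nfft(rate, 0.025))
```

Under that assumption, a 25 ms window would only need 2048 FFT points at 44100 Hz. At 16000 Hz or 11025 Hz, 512 points already cover the window, which is why the values shipped with the model are the ones to use once the audio is at the model's rate.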
      
- pegasus2000 - 2008-11-16
  
  Thanks for your help.
  
  I've written a routine that does the downsampling in
  software. I've added to the code a call to
  ndHAL_SND_WriteToChannel so that I can hear exactly
  what is passed to the decoder.
  
  It all seems OK.
  
  But the recognition is unsuccessful. The routine returns a score
  of -1 (strange).
  
  Can I publish a video that shows you what happens?
  
  For my lab a port of Sphinx would be very, very useful,
  but we aren't able to get it working.
  
  Please, help us.
  
  - Nickolay V. Shmyrev - 2008-11-16
    
    > Can I publish a video that shows you what happens?
    
    A video is useless. Publish the recognition log and the recording you are trying to recognize.
    
    - pegasus2000 - 2008-11-16
      
      Which option can I pass to pocketsphinx_continuous to obtain the log?
      
      The -logfn option doesn't seem to work for
      pocketsphinx_continuous (only for batch).
      
      And... if I want to decode a .wav file, can you
      write me a command line that I can pass to
      the nd pseudoExec routine?
      
      - Nickolay V. Shmyrev - 2008-11-16
        
        > Which option can I pass to pocketsphinx_continuous to obtain the log?
        
        The log is dumped to the console.
        
        > And... if I want to decode a .wav file, can you
        write me a command line that I can pass to
        the nd pseudoExec routine?
        
        I don't know what pseudoExec is. For running it on the host you can use
        
        pocketsphinx_batch -adcin yes \
        -ctl test.ctl \
        -cepdir test \
        -cepext .wav \
        -samprate 16000 \
        -lm ${LMFILE} \
        -dict ${DICT} \
        -hmm ${HMM}
        
        I suggest you try the same on the host first before moving to the PSP.
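
Before running the batch command above on the host, it can help to confirm that each recording really has the format the command line claims. The sketch below uses Python's standard `wave` module. The function name `wav_format_problems` is hypothetical, and the check for mono 16-bit PCM is an assumption about what the decoder expects, not something stated in this thread.

```python
import wave


def wav_format_problems(path, expected_rate=16000):
    """Return a list of human-readable format problems for a WAV file.

    An empty list means the file is mono, 16-bit, at expected_rate Hz.
    """
    problems = []
    with wave.open(str(path), "rb") as w:
        if w.getframerate() != expected_rate:
            problems.append(
                "sample rate is %d Hz, expected %d Hz"
                % (w.getframerate(), expected_rate)
            )
        if w.getnchannels() != 1:
            problems.append("%d channels, expected mono" % w.getnchannels())
        if w.getsampwidth() != 2:
            problems.append(
                "%d-bit samples, expected 16-bit" % (8 * w.getsampwidth())
            )
    return problems
```

A file that passes this check with `expected_rate=16000` matches the `-samprate 16000` value in the command above. A file recorded at 44100 Hz would be reported and would need resampling first.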
        
- pegasus2000 - 2008-11-17
  
  We are working hard on it, but we need help.
  We desperately need help...
  
  You wrote an option
  
  -ctl test.ctl
  
  for pocketsphinx_batch. Can you tell me what
  the content of this file is?
  
  What do I have to write in this file?
  
  A second thing: there is trouble with the log. Nanodesktop
  doesn't yet support redirecting the terminal output (2>) to a file,
  so we are trying to use a native Sphinx function to redirect the
  stderr stream to a file.
  
  But, for some strange reason (I think that the trouble is in
  the fflush routine of our OS), the log that is written to the
  disk is not complete.
  
  So, I'm compressing a video that shows the content of the
  screen when the system runs the pocketsphinx_continuous
  program.
  
  I know that a video isn't as useful as a log, but PLEASE
  take a moment to look at it.
  
  Within a few minutes, I'll post the link to the video.
  
  Thank you very much for your help.
  