Voice driven audio player with Pocketsphinx

2008-08-03
2012-09-22
  • Arnau Alemany

    Arnau Alemany - 2008-08-03

    Hello all,

    I'm developing a small voice-driven player for disabled people as my final
    project at university.

    Using the documentation, forums, etc., I've been able to build a usable
    environment with an iPaq h3650, Pocketsphinx 0.5 and Familiar Linux 0.8.4.

    I've built my own language model with lmtool and I'm using the RM1 audio base,
    and speech recognition is pretty functional. When it recognizes the word START,
    it starts playing an audio book.

    My problem is that I don't know how to discard noise, incorrect words or even
    the audio book itself, so there is no way I can say STOP to stop playback.

    My questions are:

    - When it starts to play, the app goes crazy and tries to recognize
      every word of its own playback. I read that AudioTool from Sphinx4 can
      handle that by using two different channels for input and output. Is there
      a way Pocketsphinx can ignore what is being played?
    
    - Whatever word you say, the app recognizes one word of the vocabulary. I
      wonder if it's possible to define an error threshold so the recognizer can
      discard invalid words.
    
      For example, if I say GET OUT, the recognizer returns START or STOP, but I
      want it to recognize that it's not a word from the custom dictionary.
    
    - Last question: is it possible to limit the listening time? I only want to
      recognize some predefined words (commands), not dictation or long
      phrases.
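The second and third questions can both be approximated outside the decoder. The sketch below is only an illustration under stated assumptions, not part of Pocketsphinx: it assumes the application receives a hypothesis string plus some integer score where higher means better, and that it counts the audio samples it has read. The names accept_command, listening_budget_spent, COMMANDS and min_score are hypothetical. Since the language model only contains the command words, the decoder will still map GET OUT onto one of them, so the score test is the only part that can reject it, and a raw decoder score may turn out to be a poor confidence estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command list; a real one would mirror the lmtool vocabulary. */
static const char *const COMMANDS[] = { "START", "STOP" };
#define NUM_COMMANDS (sizeof COMMANDS / sizeof COMMANDS[0])

/*
 * Return the matching command, or NULL when the hypothesis should be
 * discarded. `score` is assumed to be "higher is better"; `min_score` is a
 * tuning value found by experiment, not a Pocketsphinx setting.
 */
const char *accept_command(const char *hyp, int score, int min_score)
{
    size_t i;

    if (hyp == NULL || score < min_score)
        return NULL;
    for (i = 0; i < NUM_COMMANDS; i++) {
        if (strcmp(hyp, COMMANDS[i]) == 0)
            return COMMANDS[i];
    }
    return NULL; /* multi-word or unknown hypothesis */
}

/*
 * Return 1 once an utterance has used up its listening budget, else 0.
 * The comparison is done in (samples * 1000) to avoid integer division;
 * long long keeps the products from overflowing a 32-bit long.
 */
int listening_budget_spent(long samples_read, long sample_rate, long max_ms)
{
    assert(sample_rate > 0 && max_ms >= 0);
    return (long long)samples_read * 1000 >= (long long)sample_rate * max_ms;
}
```

In a capture loop, the application would add the number of samples returned by each audio read to samples_read, end the utterance as soon as listening_budget_spent(...) returns 1, and then pass the resulting hypothesis through accept_command before acting on it.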
    

    I've searched the documentation but I haven't found any clue on how to modify
    this behaviour. I'm sure there's a way to do this, but I don't know where to
    start looking.

    Thank you,
    Arnau

     
    • Andrew Kalonga

      Andrew Kalonga - 2008-08-08

      Hie Arnau

      Please read my recent post: http://sourceforge.net/forum/message.php?msg_id=5154975

      Drew

       
    • Andrew Kalonga

      Andrew Kalonga - 2008-08-04

      Hie Arnau

      "....- Whatever word you say, the app recognizes one word of the vocab, i wonder
      if it's possible to define an error threshold so the recognizer could
      discard invalid words.

      For example, if i say GET OUT recognizer recognizes START or STOP but i
      want it to recognize it's not a word from the custom dictionary.... "

      I have also managed to run version 0.5 on a Mio 701e PPC (using Windows, however). Refer to the topic "PocketSphinx 0.5 running on WinCE (WM 5.0 )" if you haven't read it.

      Well, what I was going to say is that I get the same behaviour. Pocketsphinx does not recognise the words I speak. It gives me a different "word" altogether (just 1 word). Is this what you meant by the above quote?

      I've got a question for you too - are you using code from pocketsphinx_continuous project?
      I have unusual output after ps_init():
      INFO: ....\src\libsphinxbase\feat\cmn_prior.c(122): cmn_prior_update: from < 266240.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >

      What are you using to open the sound card?
      - ad_open_dev(cmd_ln_str_r(config, "-adcdev"), (int)cmd_ln_float32_r(config, "-samprate")), or
      - ad_open_sp(16000)?

      Someone mentioned it has to do with "feature extraction" - do you know exactly what this means?

      As for your problem, why don't you assign keywords to start/stop the recognition process? Of course, these will have to be in the dictionary (not what you want), for example "Reco Activate" / "Reco Deactivate". Then use these in your code to add a loop around the continuous listen-and-decode for loop in continuous.c. Effectively, it is a copy of the whole recognition loop, but it will only listen for the Activate/Deactivate words. This is long-winded, but with a bit of chiselling I think it can work.
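The activate/deactivate idea can be sketched as a small gate that sits between the decoder and the player. This is a minimal illustration, not code from continuous.c: instead of duplicating the recognition loop, it folds the outer and inner loops into one state variable. The names reco_state, reco_gate_step and pass_through are hypothetical, and the upper-case spelling of the two phrases is an assumption about how the dictionary would list them.

```c
#include <assert.h>
#include <string.h>

typedef enum { RECO_IDLE, RECO_ACTIVE } reco_state;

/* Phrases from the suggestion above; they must also be in the dictionary. */
#define ACTIVATE_PHRASE   "RECO ACTIVATE"
#define DEACTIVATE_PHRASE "RECO DEACTIVATE"

/*
 * Advance the gate by one decoded utterance and return the new state.
 * *pass_through is set to 1 only when `hyp` should be handed to the
 * application as a command (gate open, and not a gate phrase itself).
 */
reco_state reco_gate_step(reco_state state, const char *hyp, int *pass_through)
{
    assert(hyp != NULL && pass_through != NULL);
    *pass_through = 0;

    if (state == RECO_IDLE) {
        /* Outer loop: ignore everything except the activation phrase. */
        return strcmp(hyp, ACTIVATE_PHRASE) == 0 ? RECO_ACTIVE : RECO_IDLE;
    }
    /* Inner loop: commands flow through until the gate is closed again. */
    if (strcmp(hyp, DEACTIVATE_PHRASE) == 0)
        return RECO_IDLE;
    *pass_through = 1;
    return RECO_ACTIVE;
}
```

The application would call reco_gate_step once per decoded utterance and act on the hypothesis only when pass_through comes back as 1. Anything decoded while the gate is idle, including the audio book's own speech, is dropped unless it happens to match the activation phrase.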

      Drew

       
    • Arnau Alemany

      Arnau Alemany - 2008-08-05

      Hello Drew,
      Thank you for your reply, I've read your post "PocketSphinx 0.5 running on WinCE (WM 5.0 )" and I had the same problem some months ago when I was trying to build the system on a Windows Mobile environment (WM 5.0 and WM 6.0).

      You can find some questions I asked in the forum, but I can't help you fix this. I think Pocketsphinx hasn't been fully tested on those platforms. However, I read it was working well on Linux, so I gave it a try.

      I don't know what "feature extraction" exactly means in this context but I liked your idea to start/stop the recognition. However the problem I have is that I don't know how to recognize only one word and discard the others (listen to one word only).

      I would like to know whether I can improve accuracy by training my grammar against the RM1 audio base with SphinxTrain; maybe that way I could recognize specific words.

      Thank you,
      Arnau

       
