How to bulk test phoneme recognition with pocketsphinx_continuous

Help
Toine db
2016-01-27
2016-02-24
  • Toine db

    Toine db - 2016-01-27

    The phoneme recognition test with pocketsphinx_continuous works fine, but only for one file at a time.
    http://cmusphinx.sourceforge.net/wiki/phonemerecognition

    Any suggestions on how to approach a bulk test for this?
    (Read a whole directory, or a file listing the raw file locations?)

    PS: I'm using Windows as my OS.

     
  • Nickolay V. Shmyrev

    Use pocketsphinx_batch with a control file (-ctl file.ctl), as shown at the end of the adaptation tutorial:

    http://cmusphinx.sourceforge.net/wiki/tutorialadapt#testing_the_adaptation
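A control file for -ctl is a plain text list of file IDs, one per line; combined with -cepdir and -cepext, each ID resolves to an audio file (for example, the ID file1 with -cepdir testwavs and -cepext .wav resolves to testwavs/file1.wav). The following is a minimal Python sketch for generating such a list from a directory. The function name build_ctl and its arguments are hypothetical helpers for illustration, not part of pocketsphinx.

```python
from pathlib import Path


def build_ctl(audio_dir, ctl_path, ext=".wav"):
    """Write one file ID per line to ctl_path and return the list of IDs.

    A file ID is the path of an audio file relative to audio_dir,
    with the extension removed and forward slashes as separators.
    (build_ctl is a hypothetical helper, not a pocketsphinx tool.)
    """
    root = Path(audio_dir)
    file_ids = sorted(
        p.relative_to(root).with_suffix("").as_posix()
        for p in root.rglob("*" + ext)
        if p.is_file()
    )
    # newline="\n" keeps plain LF line endings, also on Windows.
    with open(ctl_path, "w", newline="\n") as ctl:
        for file_id in file_ids:
            ctl.write(file_id + "\n")
    return file_ids
```

Calling build_ctl("testwavs", "test/testfileids.txt") (the test directory must already exist) would produce a list that can then be passed to pocketsphinx_batch with -cepdir testwavs -cepext .wav -ctl test/testfileids.txt.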

     
    • Toine db

      Toine db - 2016-02-05

      Thanks,

      But batch gives different results than pocketsphinx_continuous. Why could that be?

      I'm using the following commands:

      pocketsphinx_continuous
      -hmm model/en-us/en-us
      -allphone model/en-us/en-us-phone.lm.bin
      -beam 1e-20
      -pbeam 1e-20
      -lw 2.0
      -infile testwavs/file1.wav
      -allphone_ci yes

      Result: SIL AY AE

      Batch command:
      pocketsphinx_batch
      -adcin yes
      -hmm model/en-us/en-us
      -allphone model/en-us/en-us-phone.lm.bin
      -beam 1e-20
      -pbeam 1e-20
      -lw 2.0
      -cepdir testwavs
      -cepext .wav
      -ctl test/testfileids.txt
      -allphone_ci yes

      Result: SIL NG HH

      FYI: without -adcin I get these errors:
      INFO: batch.c(729): Decoding 'file1'
      ERROR: "batch.c", line 207: File length mismatch: 0x52494646 != 0xd7a, maybe it's not MFCC file
      ERROR: "batch.c", line 422: Failed to read MFCC from the file 'testwavs/file1.wav'

       

      Last edit: Toine db 2016-02-05
    • Toine db

      Toine db - 2016-02-09

      @Nickolay, any thoughts on my problem?

       
      • Nickolay V. Shmyrev

        Hi Toine

        I'm sorry, batch and continuous process audio differently. Batch treats the audio as a whole and applies cepstral mean normalization (CMN, roughly a volume normalization) over the entire file. Continuous processes audio frame by frame and normalizes using only the frames it has already seen. This is not optimal, as has been discussed many times on the forum, and unfortunately it needs a proper initial CMN estimate. You can set the -cmninit parameter to the values you see in the batch log output, and you will get similar results from batch and continuous. That's why we recommend decoding longer files with continuous.

        We will work on a solution to make continuous more accurate from the start; it's just not there yet.
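The difference described above can be illustrated with a small Python sketch. This is a deliberately simplified model, not pocketsphinx's actual CMN code: the function names, the prior_weight parameter, and the toy two-dimensional frames are all assumptions made for illustration. It only shows why a poor initial estimate hurts the first frames of a short file, and why a good initial value brings the two modes closer together.

```python
def batch_cmn(frames):
    """Subtract the mean of the whole utterance from every frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def live_cmn(frames, init_mean, prior_weight=1.0):
    """Subtract a running mean that only knows the frames seen so far.

    The running mean starts at init_mean (playing the role of -cmninit)
    and is updated after each frame. prior_weight is a hypothetical knob
    for how strongly the initial value counts.
    """
    total = [m * prior_weight for m in init_mean]
    weight = prior_weight
    normalized = []
    for frame in frames:
        mean = [t / weight for t in total]
        normalized.append([x - m for x, m in zip(frame, mean)])
        # Only now does the current frame contribute to the estimate.
        total = [t + x for t, x in zip(total, frame)]
        weight += 1.0
    return normalized


frames = [[40.0, 1.0], [42.0, -1.0], [38.0, 0.0], [40.0, 0.0]]
print(batch_cmn(frames)[0])               # first frame, whole-file mean
print(live_cmn(frames, [0.0, 0.0])[0])    # first frame, poor initial value
print(live_cmn(frames, [40.0, 0.0])[0])   # first frame, good initial value
```

With the good initial value, the first live frame matches the batch result; with the poor one it is far off, which is the kind of mismatch that can change the decoded phones on a short file.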

         
        • Toine db

          Toine db - 2016-02-11

          Thanks for the reply Nickolay,

          Do I understand correctly that batch is the closest thing to pocketsphinx output on phones?

          In my phone apps I currently have a custom, self-built detection system that detects a specific type of sound.
          When that sound is detected, it is sent to pocketsphinx for recognition, so pocketsphinx is started and stopped again and again.

          Is pocketsphinx able to normalize volume in this phone scenario?
          (or are the processed frames cleared after I stop the recognition in between)

          Hope to hear from you

           
          • Nickolay V. Shmyrev

            "Do I understand correctly that batch is the closest thing to pocketsphinx output on phones?"

            No, ps_process_raw calls do continuous processing.

            "When that sound is detected, it is sent to pocketsphinx for recognition, so pocketsphinx is started and stopped again and again."

            It is OK to stop and restart, just do not reinit the decoder.

            "Is pocketsphinx able to normalize volume in this phone scenario?
            (or are the processed frames cleared after I stop the recognition in between)"

            Volume is reset when you reinit the decoder or when you call ps_start_stream.

             
            • Toine db

              Toine db - 2016-02-12

              Thanks for the clear answer.

              Now I know, functionally, how to work with the decoder, but I'm not really sure how to implement it.

              I do not understand what you are trying to say with the following comment:

              "No, ps_process_raw calls do continuous processing"
              Do I need to use batch or continuous to get results representative of the phone?

              Currently I'm using ps_start_utt and ps_end_utt in between, but I'm not sure whether this reinits the decoder and resets the volume... the part that I really want to....

              Can you give me a hint about which method(s) I need to use to pause/restart pocketsphinx without triggering a reset?

              I ask because I'm thinking of creating the following system:
              1. In the background I want to keep PocketSphinx running/listening/decoding incoming audio, to keep the volume/CMN at a good level.
              2. When my custom system detects my sound I want to
              ...2.1 Pause the background pocketsphinx
              ...2.2 Clear any current recognition (without clearing or reinitializing the volume)
              ...2.3 Request recognition for my detected sound
              ...2.4 Continue the background pocketsphinx

              Last question:
              Does -cmninit work as a kickstart, setting an initial value that PocketSphinx then adjusts during decoding, or does it set a constant level for PocketSphinx?

              PS: I think I wrote the Windows Phone example so that it reinits the decoder again and again. I'll check this later and possibly send a pull request with adjustments.

              Hope to hear from you. Sorry for the many questions about this topic, but I want to get it as good as possible. Thanks again for the last reply.

               

              Last edit: Toine db 2016-02-12
            • Toine db

              Toine db - 2016-02-23

              Nickolay,

              Sorry to bother you, but could you give a short answer to the 3 questions so I can improve the Windows Phone example?
              (The 3 questions are at the bottom of this thread.)

              I'm planning to adjust two parts:
              + Pause (PocketSphinx in a kind of idle mode) without resetting the whole decoder, which is probably the current issue
              + Add nbest as a return value

              Hope to hear from you,

              Toine

               

              Last edit: Toine db 2016-02-23
              • Nickolay V. Shmyrev

                1. What do you mean by this: "No, ps_process_raw calls do continuous processing"? Do I need to use batch or continuous to get results representative of the phone?

                Both types of processing give you representative results. You use batch for batch testing and continuous for decoding on the phone.

                2. What command do I need to use to pause and restart the decoder, without losing the calibrated CMN?

                ps_end_utt stops the decoder and ps_start_utt starts the search. The CMN estimate is kept between them.

                3. Is -cmninit a kickstart or set to be a constant?

                The initial value is used only for the first utterance; for the following utterances CMN is recalculated.
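The behaviour described in answers 2 and 3 can be modelled with a small Python sketch. It is a toy model under stated assumptions, not the pocketsphinx implementation: the class CmnState and its methods are hypothetical names that only mirror the roles of ps_start_utt and ps_end_utt, and holding the mean fixed within an utterance is a simplification.

```python
class CmnState:
    """Toy model of a CMN estimate that survives across utterances."""

    def __init__(self, init_mean):
        # Plays the role of -cmninit: used until the first utterance ends.
        self.mean = list(init_mean)
        self._sum = [0.0] * len(self.mean)
        self._count = 0

    def start_utt(self):
        # Starting an utterance clears the per-utterance statistics,
        # but keeps the current mean estimate.
        self._sum = [0.0] * len(self.mean)
        self._count = 0

    def process(self, frame):
        # Normalize with the current estimate and collect statistics.
        self._sum = [s + x for s, x in zip(self._sum, frame)]
        self._count += 1
        return [x - m for x, m in zip(frame, self.mean)]

    def end_utt(self):
        # Recalculate the estimate from the utterance that just ended.
        if self._count:
            self.mean = [s / self._count for s in self._sum]


state = CmnState([40.0])
state.start_utt()
state.process([42.0])
state.process([44.0])
state.end_utt()
state.start_utt()
print(state.process([43.0]))   # uses the mean learned from utterance 1

fresh = CmnState([40.0])       # like reinitializing the decoder
fresh.start_utt()
print(fresh.process([43.0]))   # falls back to the initial value
```

In this model, stopping and restarting between utterances keeps what was learned, while creating a new object (the analogue of reinitializing the decoder) discards it and starts again from the initial value.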

                 
                • Toine db

                  Toine db - 2016-02-24

                  Thanks for the answers, I'll adjust the Windows Phone example accordingly.

                   
  • Toine db

    Toine db - 2016-02-17

    @Nickolay, the text was probably a little too long. I hope you can still answer the questions I have (mainly about the mechanism).

    In short:
    1. What do you mean by this: "No, ps_process_raw calls do continuous processing"? Do I need to use batch or continuous to get results representative of the phone?
    2. What command do I need to use to pause and restart the decoder, without losing the calibrated CMN?
    3. Is -cmninit a kickstart or set to be a constant?

    Hope to hear from you.

     
