Multiple Keyword spotting

Help
2014-05-12
2014-05-28
  • Peetambar Singh

    Peetambar Singh - 2014-05-12

    I have some audio clips obtained by decompressing and splitting an audio-video stream.
    These audio clips can have many variations in accent, language, ambient sound etc.

    I need to spot multiple keywords from these clips and also wish to locate the time where the keyword has been uttered.

    Can it be accomplished using pocketsphinx?

    Regards
    PK-Singh

     
    • Nickolay V. Shmyrev

      Yes, you can do it with pocketsphinx. There is a kws search mode: pocketsphinx_continuous -infile file.wav -kws keyphrase.file. The keyphrase file contains key phrases, one per line.

      To access this API you need to check out pocketsphinx from subversion.
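As a minimal sketch of the setup described above, the following Python snippet writes a keyphrase file with one phrase per line and assembles the matching argument list. The helper names and the example phrases are hypothetical, and the binary name `pocketsphinx_continuous` is assumed from the commands used elsewhere in this thread. The snippet only builds the command; it does not run the decoder.

```python
from pathlib import Path
import tempfile


def write_keyphrase_file(path, phrases):
    """Write one key phrase per line, skipping blank entries."""
    lines = [p.strip() for p in phrases if p.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def build_kws_command(wav_path, keyphrase_path, binary="pocketsphinx_continuous"):
    """Return the argument list for a keyword-spotting run (not executed here)."""
    return [binary, "-infile", str(wav_path), "-kws", str(keyphrase_path)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        kw_file = Path(tmp) / "keyphrase.file"
        # Example phrases are placeholders, not taken from a real setup.
        write_keyphrase_file(kw_file, ["oscar", "delta foxtrot", ""])
        print(kw_file.read_text(encoding="utf-8"), end="")
        print(" ".join(build_kws_command("file.wav", kw_file)))
```

The returned list can be passed to `subprocess.run` once the binary and an acoustic model are available locally.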

       
      Last edit: Nickolay V. Shmyrev 2014-05-12
      • Ali Hassan

        Ali Hassan - 2014-05-12

        Hi,
        thank you so much for your quick response. I'm using pocketsphinx under Windows 7. I tried the above-mentioned command but it didn't work. Here are the parameters I set; please let me know what I'm doing wrong.

        Command: pocketsphinx_continuous.exe -argfile argFile.txt -infile .\test_files\my_audio.wav

        argFile:

        -dict ./model/lm/cmu07a.dic
        -hmm ./model/hmm/en_broadcastnews_16k_ptm256_5000
        -lm ./model/lm/en-70k.dmp
        -kws ./model/keyphrase.file
        -kws_threshold 0.95

         
      • AnatolyB

        AnatolyB - 2014-05-13

        | To access this API you need to checkout pocketsphinx from subversion.

        Could you please specify the version with kws which should be checked out? I tried the latest from the trunk but it appears to be too sensitive to all the background noise (the earlier version did not have such problem)

         
        • Nickolay V. Shmyrev

          Could you please specify the version with kws which should be checked out?

          Latest one

          but it appears to be too sensitive to all the background noise (the earlier version did not have such problem)

          Provide the information to reproduce your problem - audio file, keyword and so on.

           
          • AnatolyB

            AnatolyB - 2014-05-14

            I actually tried it in continuous mode first and compared the latest version with 0.8. With the same background noise, the older version reacted only to rather loud speech, while the newer one responded to almost every minor sound, which means every couple of seconds. Is there any way to control the sensitivity in the latest version?

             
            • Nickolay V. Shmyrev

              I'm not sure what you mean by that. This thread is about keyword spotting and keyword activation, which are not available in 0.8.

               
              • AnatolyB

                AnatolyB - 2014-05-28

                Sorry, I'll try to describe my problem more properly now.
                I'm testing the latest version of pocketsphinx from the trunk with the common Russian language model (voxforge-ru-0.2). I have a sample audio file of a child saying "как тебя зовут?" ("what is your name?"). Ordinary processing in continuous mode gives me "как тебя золотой". But when I switch to keyword search and try to find any part of this phrase, I always get a 'null' answer (with any kws threshold). The audio file and keyfile are attached.

                 
                • bic-user

                  bic-user - 2014-05-28

                  I succeeded in detecting keywords in your audio file with -kws_threshold 1e-50

                   
                  • Nickolay V. Shmyrev

                    Right, voxforge-ru is not a very accurate model, and it is not expected to be accurate for children's voices. Another factor in accuracy is parameter estimation. Your utterance is very short and you are trying to detect a keyword right at the beginning, so there is not enough time for the recognizer to estimate the channel parameters. If you make the utterance longer or just add -cmninit 12.85,0,0.07,-0.27,-0.30,0.07,-0.37,-0.13,-0.16,-0.16,-0.25,-0.10,-0.20 to the command-line arguments, you can lower the threshold to 1e-25. It will then also recognize "как тебя зовут" properly with the lm.
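To illustrate why an initial value helps on very short utterances, here is a simplified sketch of running cepstral mean normalization. It is not pocketsphinx's actual implementation: the update rule, the `prior_frames` weight and the function names are assumptions made for illustration. Only the 13 values come from the `-cmninit` argument quoted above.

```python
CMNINIT = "12.85,0,0.07,-0.27,-0.30,0.07,-0.37,-0.13,-0.16,-0.16,-0.25,-0.10,-0.20"


def parse_cmninit(value):
    """Split a comma-separated -cmninit string into a list of floats."""
    return [float(x) for x in value.split(",")]


def running_cmn(frames, initial_mean, prior_frames=100):
    """Subtract a running mean from each cepstral frame.

    The mean starts at initial_mean and is weighted as if prior_frames
    frames had already been observed (an illustrative assumption).
    """
    mean = list(initial_mean)
    count = prior_frames
    normalized = []
    for frame in frames:
        # Normalize with the estimate available before this frame arrives.
        normalized.append([c - m for c, m in zip(frame, mean)])
        count += 1
        mean = [m + (c - m) / count for c, m in zip(frame, mean)]
    return normalized


if __name__ == "__main__":
    init = parse_cmninit(CMNINIT)
    print(len(init))  # number of cepstral coefficients supplied
    # A frame equal to the initial mean is normalized to all zeros.
    print(running_cmn([init], init)[0])
```

With a poor starting mean, the first frames of a short clip are normalized against the wrong reference, which is the situation the reply describes for a keyword at the very start of the audio.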

                     
                    Last edit: Nickolay V. Shmyrev 2014-05-28
  • Peetambar Singh

    Peetambar Singh - 2014-05-12

    This is good news, but I am very new to ASR and have some doubts:

    1. Will the default acoustic model be able to handle different accents and different noise conditions?
    2. My application needs to spot English words spoken within mixed-language speech.
      For example, a person is speaking in Arabic but in between says some keywords or phrases in English. Can this also be handled with the existing acoustic model?
    3. For the above scenario, do I have to use a language model, or can the goal also be achieved with an FSG?
     
    • Nickolay V. Shmyrev

      Will the default acoustic model be able to handle different accents and different noise conditions?

      Yes

      My application needs to spot English words spoken within mixed-language speech. For example, a person is speaking in Arabic but in between says some keywords or phrases in English. Can this also be handled with the existing acoustic model?

      Yes, but it's better to train a new model

      For the above scenario, do I have to use a language model, or can the goal also be achieved with an FSG?

      You need neither LM nor FSG, it's a third search mode - keyword spotting. It looks for certain words from a list.

       
  • Peetambar Singh

    Peetambar Singh - 2014-05-12

    Is it possible to timestamp the occurrence of a keyword in an utterance?

    I have seen some discussions of keyword spotting based on grammar constructs. Is this third search mode better in terms of accuracy (false acceptance/rejection)?

     
  • Peetambar Singh

    Peetambar Singh - 2014-05-13

    I have given the following command line for keyword spotting, audio input is a microphone.
    I have observed a lot of false rejections. There is only one word in the keyphrase file,
    OSCAR, and its pronunciation in the dictionary file is
    OSCAR AO S K ER

    pocketsphinx_continuous.exe -hmm hub4wsj_sc_8k -kws keyphrase.file -dict keyphrase.dic -samprate 16000

    Do I need to tune some other parameters? What is the significance of the kws_threshold parameter?

     
    • Nickolay V. Shmyrev

      I have observed a lot of false rejections. There is only one word in the keyphrase file

      Provide the data to reproduce your problem.

      What is the significance of the kws_threshold parameter?

      The threshold controls the false alarm rate. However, it's better to provide the data first, as there might be other issues.

       
  • Peetambar Singh

    Peetambar Singh - 2014-05-14

    Hi Nicole,

    Following is the link for the database on which I am trying out kws_search.

    I have used the following command line
    pocketsphinx_continuous -hmm hub4wsj_sc_8k -samprate 16000 -kws keyphrase.file -dict kws.dic -infile oscar.wav (similarly for other wav files).

    Is the above command line correct?

    https://www.dropbox.com/s/orn9fa2v65lz008/debug_kws.zip

    keyphrase.file contains the following keyphrases -- OSCAR, CHARLIE, FOXTROT, DELTA, DECEMBER.
    kws.dic is the dictionary file. I am using the hub4wsj_sc_8k model as available with the pocketsphinx release. I am using the latest svn version of pocketsphinx.

    The archive contains 7 audio files:
    oscar.wav -- 5/6 utterances of OSCAR
    delta.wav -- 5/6 utterances of DELTA
    december.wav -- 5/6 utterances of DECEMBER
    charlie.wav -- 5/6 utterances of CHARLIE
    foxtrot.wav -- 5/6 utterances of FOXTROT
    oscar_december_charlie.wav -- a mix of OSCAR, DECEMBER and CHARLIE
    delta_foxtrot_oscar_december_charlie.wav -- a mix of DELTA, FOXTROT, OSCAR, DECEMBER and CHARLIE

    My observation is that keyword search mode is working well for the word DECEMBER and FOXTROT while zero recognition for other words in the database.
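A small sketch of how the same command line could be generated for every file in the test set. The flags and file names are copied from this post; the helper name is hypothetical, and the commands are only printed, not executed.

```python
import shlex

# File names as listed in the post above.
WAV_FILES = [
    "oscar.wav",
    "delta.wav",
    "december.wav",
    "charlie.wav",
    "foxtrot.wav",
    "oscar_december_charlie.wav",
    "delta_foxtrot_oscar_december_charlie.wav",
]


def kws_command(wav_file):
    """Build the keyword-spotting argument list for one audio file."""
    return [
        "pocketsphinx_continuous",
        "-hmm", "hub4wsj_sc_8k",
        "-samprate", "16000",
        "-kws", "keyphrase.file",
        "-dict", "kws.dic",
        "-infile", wav_file,
    ]


if __name__ == "__main__":
    for wav in WAV_FILES:
        print(shlex.join(kws_command(wav)))
```

Generating the commands from one function keeps every run on identical settings, so differences between keywords are easier to attribute to the audio or the model.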

     
  • Nickolay V. Shmyrev

    My observation is that keyword search mode is working well for the word DECEMBER and FOXTROT while zero recognition for other words in the database.

    Well, that's about it, you can do:

    1) Use more accurate models en-us or en-us-semi

    2) Use longer words, it's recommended to use words of 4 syllables for activation

    3) Tune the threshold for every word. A per-keyword threshold can be specified in the keyword list file. For example, for charlie a more reasonable threshold is 1e-10.

     
  • Peetambar Singh

    Peetambar Singh - 2014-05-15

    Hi Nicole ,

    Thanks for the quick response,

    Could you please specify the format of the keyword list file?

    For example, should I write the threshold value as
    CHARLIE 1e-10 on a single line?

    Is there any rule of thumb for choosing a suitable value for kws_threshold?

     
  • Nickolay V. Shmyrev

    Could you please specify the format of the keyword list file.

    Threshold separated by /:

      charlie / 1e-10
    

    Is there any rule of thumb for choosing a suitable value for kws_threshold?

    Ideally, the threshold should be computed on a test set.
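One way to read "computed on a test set" is a simple sweep: run the detector at several candidate thresholds over labelled clips and keep the value with the fewest errors. The sketch below assumes a hypothetical `run_detector(audio, threshold)` callable that wraps whatever decoder is in use. The equal weighting of misses and false alarms is also an assumption, and the stand-in detector in the demo uses made-up scores that do not represent pocketsphinx's scoring.

```python
def pick_threshold(candidates, run_detector, test_set):
    """Return (threshold, misses, false_alarms) with the fewest total errors.

    run_detector(audio, threshold) is a hypothetical callable returning the
    keywords reported for one clip. test_set is a list of
    (audio, expected_keywords) pairs.
    """
    best = None
    for threshold in candidates:
        misses = 0
        false_alarms = 0
        for audio, expected in test_set:
            found = set(run_detector(audio, threshold))
            misses += len(set(expected) - found)
            false_alarms += len(found - set(expected))
        score = misses + false_alarms
        if best is None or score < best[0]:
            best = (score, threshold, misses, false_alarms)
    if best is None:
        raise ValueError("no candidate thresholds given")
    return best[1:]


if __name__ == "__main__":
    # Stand-in detector: each fake clip maps keywords to an invented score,
    # and a keyword is reported when its score reaches the threshold.
    def fake_detector(clip, threshold):
        return {word for word, score in clip.items() if score >= threshold}

    test_set = [
        ({"charlie": 1e-12, "oscar": 1e-30}, {"charlie"}),
        ({"charlie": 1e-8}, {"charlie"}),
    ]
    print(pick_threshold([1e-5, 1e-10, 1e-20, 1e-40], fake_detector, test_set))
```

Because thresholds can be set per keyword, the same sweep could be run separately for each keyword.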

     
  • Peetambar Singh

    Peetambar Singh - 2014-05-23

    Hi Nicole,

    When I specify the thresholds in the keyword list file as suggested earlier in the thread:

    CHARLIE / 1e-10
    OSCAR / 1e-10

    I get the following error:
    ERROR: "kws_search.c", line 158: The word '/' is missing in the dictionary

     
    • bic-user

      bic-user - 2014-05-23

      try:
      CHARLIE /1e-10/
      OSCAR /1e-10/
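A short sketch of the `KEYPHRASE /threshold/` layout suggested here, with a check for words missing from the dictionary. The helper names are hypothetical and the parsing rule is a guess rather than the logic in `kws_search.c`. It does, however, reproduce the reported symptom: a line written as `CHARLIE / 1e-10` is treated as three words, so `/` is looked up in the dictionary.

```python
import re

# Matches "KEYPHRASE /threshold/" with the threshold enclosed in slashes.
_LINE = re.compile(r"^(?P<phrase>.*?)\s*/(?P<threshold>[^/\s]+)/\s*$")


def format_keyword_list(thresholds):
    """Render {phrase: threshold} as lines like 'CHARLIE /1e-10/'."""
    return "".join(f"{phrase} /{value:g}/\n" for phrase, value in thresholds.items())


def parse_keyword_line(line):
    """Return (phrase, threshold); threshold is None when not given."""
    match = _LINE.match(line.strip())
    if match:
        return match.group("phrase"), float(match.group("threshold"))
    return line.strip(), None


def missing_words(lines, dictionary_words):
    """List key phrase words with no exact-match dictionary entry."""
    known = set(dictionary_words)
    missing = []
    for line in lines:
        phrase, _ = parse_keyword_line(line)
        missing += [w for w in phrase.split() if w not in known]
    return missing


if __name__ == "__main__":
    print(format_keyword_list({"CHARLIE": 1e-10, "OSCAR": 1e-10}), end="")
    print(missing_words(["CHARLIE /1e-10/"], ["CHARLIE", "OSCAR"]))
    print(missing_words(["CHARLIE / 1e-10"], ["CHARLIE", "OSCAR"]))
```

Running a check like this before starting the decoder catches both malformed threshold syntax and genuinely missing pronunciations.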

       
  • manjusha bangale

    Please specify how to create the keyword list file. Is it a text file?
    What else is required for this? Please explain in detail.
    I have also created a language model for some specific words.
    The recognizer sends data in onPartialResult but null in the onResult method.

     
  • manjusha bangale

    Hello sir, I have created a grammar file for specific commands such as "start bike", "stop bike", "run engine", "lock bike", etc. These keywords are recognized and returned in the onResult method.
    The problem is that it sometimes sends previously spoken commands.
    For example, I first said "lock bike", "unlock bike", then "start bike", then "run engine", and it sent me "stop bike" or "lock bike" again. When my bike is started it may send "stop bike", which I did not say just now; it is actually a previously spoken command. How can I avoid this? Please help me.

     
    • Arseniy Gorin

      Arseniy Gorin - 2017-03-08

      You should start a new thread rather than commenting in old ones.
      You should also provide your files and the commands you are trying to run.
      Right now it is impossible to understand what you are doing and to help you.

       