
Keyword spotting in noise areas

Help
ArcadeBit
2014-05-26
2014-05-27
  • ArcadeBit

    ArcadeBit - 2014-05-26

    Hi there,

    I am a real newbie when it comes to ASR. To give you as much context as possible, here is the long story: I have a topic for my Bachelor thesis. I have to provide an acoustic method to detect "turmoil-like" situations. Part of the topic is to detect a few keywords like police, help or fire. The keyword spotting takes place inside and near local public transportation (subway). Because of the noisy environment, I am not sure whether it is possible to detect keywords without clear gaps between the words, even if I use a method to reduce the background noise. And it has to run on a Raspberry Pi.

    I played around a bit with pocketsphinx: I created my dictionary and a simple JSGF grammar, and used the acoustic model from VoxForge. The WER was OK. My biggest problem was that different words or noise were often detected as fire, police or help, even with a filler dictionary.

    With all the stuff I have read so far, I am not sure if this is possible for me, I mean in an adequate amount of time or with a satisfactory result.

    Now to my questions:
    How do I implement the noise reduction?
    Would a specific acoustic model reduce the amount of "noise = fire/police/help"?
    Would a specific acoustic model reduce my WER?
    Does the keyword spotting work with noise gaps between words?
    Is there a golden road to my goal?

    Kind regards

     

    Last edit: ArcadeBit 2014-05-26
  • Nickolay V. Shmyrev

    Because of the noisy environment, I am not sure whether it is possible to detect keywords without clear gaps between the words.

    It is possible.

    Part of the topic is to detect a few keywords like police, help or fire.

    For reliable detection, a keyword must have at least 3 syllables. "Fire" is too short for a keyword.

    I played around a bit with pocketsphinx: I created my dictionary and a simple JSGF grammar, and used the acoustic model from VoxForge.

    For keyword spotting there is a dedicated keyword spotting search mode, selected with the "-kws" option. It also has an option (-kws_threshold) to tune the threshold that trades detection rate against false alarm rate.
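
    As a rough illustration of the keyword spotting mode, the sketch below writes a keyword list file and assembles a command line for it. The one-phrase-per-line layout with a threshold between slashes and the flag names (-inmic, -hmm, -dict, -kws) are taken from the pocketsphinx 5prealpha command line as I understand it, so check them against your version. The phrases, thresholds and paths are hypothetical placeholders, not tuned values.

```python
def format_keyword_list(keywords):
    """Return keyword-list text: one 'phrase /threshold/' entry per line.

    keywords: iterable of (phrase, threshold) pairs. The thresholds used in
    the example below are placeholders and must be tuned on a test set.
    """
    lines = [f"{phrase} /{threshold:g}/" for phrase, threshold in keywords]
    return "\n".join(lines) + "\n"


def build_command(list_path, hmm_dir, dict_path):
    """Assemble (but do not run) a pocketsphinx_continuous command line.

    Assumption: flag names as in pocketsphinx 5prealpha; older releases differ.
    """
    return [
        "pocketsphinx_continuous",
        "-inmic", "yes",     # read audio from the microphone
        "-hmm", hmm_dir,     # acoustic model directory
        "-dict", dict_path,  # pronunciation dictionary
        "-kws", list_path,   # keyword list file, enables keyword spotting mode
    ]


if __name__ == "__main__":
    # Hypothetical multi-syllable phrases, following the 3-syllable advice.
    keywords = [("call the police", 1e-20), ("help me please", 1e-15)]
    with open("keyphrase.list", "w") as handle:
        handle.write(format_keyword_list(keywords))
    print(" ".join(build_command("keyphrase.list", "en-us", "my.dict")))
```

    Longer phrases are used here instead of single short words because short keywords such as "fire" are hard to detect reliably.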

    The VoxForge model is too inaccurate. Our most accurate model is the en-us generic acoustic model.

    How do I implement the noise reduction?

    Noise reduction is already implemented in the development version in the Subversion trunk.

    Does the keyword spotting work with noise gaps between words?

    No, gaps are not required; it should work on a continuous stream too.

    Is there a golden road to my goal?

    Create a test set and evaluate on it to find the best operating point.
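
    One way to picture this evaluation is a threshold sweep over a labelled test set. The sketch below is only an illustration under stated assumptions: `detect` is a hypothetical callable that wraps whatever decoder you use and returns True when the keyword is reported for a clip at a given threshold, and the test set is a list of (clip, keyword_present) pairs.

```python
def sweep(detect, test_set, thresholds):
    """Measure detection and false alarm rates for each threshold.

    detect:    hypothetical callable (clip, threshold) -> bool
    test_set:  list of (clip, has_keyword) pairs
    """
    positives = sum(1 for _, has_keyword in test_set if has_keyword)
    negatives = len(test_set) - positives
    rows = []
    for threshold in thresholds:
        hits = sum(1 for clip, has_keyword in test_set
                   if has_keyword and detect(clip, threshold))
        false_alarms = sum(1 for clip, has_keyword in test_set
                           if not has_keyword and detect(clip, threshold))
        rows.append({
            "threshold": threshold,
            "detection_rate": hits / positives if positives else 0.0,
            "false_alarm_rate": false_alarms / negatives if negatives else 0.0,
        })
    return rows


def pick_operating_point(rows, max_false_alarm_rate):
    """Return the row with the best detection rate within the false alarm budget."""
    allowed = [r for r in rows if r["false_alarm_rate"] <= max_false_alarm_rate]
    if not allowed:
        return None
    return max(allowed, key=lambda r: r["detection_rate"])
```

    The false alarm budget is a design choice: for an alarm system in a subway it depends on how many false alerts the operators can tolerate.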

     
  • Yuval Karon

    Yuval Karon - 2017-02-21

    Hello,

    About three years after the original post...

    In another post, about noise robustness, you recommend adding the following to sphinx_train.cfg:
    ~~~~~~~~~~~~
    $CFG_WAVFILE_SRATE = 16000.0;
    $CFG_NUM_FILT = 25; # For wideband speech it's 25, for telephone 8khz reasonable value is 15
    $CFG_LO_FILT = 130; # For telephone 8kHz speech value is 200
    $CFG_HI_FILT = 6800; # For telephone 8kHz speech value is 3500
    $CFG_TRANSFORM = "dct"; # Previously legacy transform is used, but dct is more accurate
    $CFG_LIFTER = "22"; # Cepstrum lifter is smoothing to improve recognition
    $CFG_VECTOR_LENGTH = 13; # 13 is usually enough
    ~~~~~~~~~~~~~~

    Is this a general recommendation for noisy speech?

    The default feature set is 1s_c_d_dd. Would you recommend a different feature set
    for noisy input? Where can I read about the naming of feature sets?

    Thanks,
    Yuval
    
     
    • Nickolay V. Shmyrev

      Is this a general recommendation for noisy speech?

      It is simply a default.

      The default feature set is 1s_c_d_dd. Would you recommend a different feature set
      for noisy input?

      No.

      Where can I read about the naming of feature sets?

      In the source code.
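
      As a hedged reading of the name 1s_c_d_dd: one stream ("1s") made of cepstra ("c"), delta cepstra ("d") and delta-delta cepstra ("dd"), which for a 13-dimensional cepstral vector gives 39 features per frame. The sketch below illustrates that layout in plain Python. The frame offsets (delta from frames t+2 and t-2, delta-delta as the difference of the deltas at t+1 and t-1) follow my understanding of the sphinxbase feature code, and the edge handling by repeating the first and last frame is an assumption of this sketch, so verify both against the source.

```python
def feature_1s_c_d_dd(cepstra):
    """Stack cepstra, deltas and delta-deltas into one vector per frame.

    cepstra: list of frames, each a list of floats (e.g. 13 values).
    Returns a list of frames with three times as many values each.
    """
    n = len(cepstra)

    def frame(t):
        # Assumption of this sketch: clamp indices at the edges.
        return cepstra[min(max(t, 0), n - 1)]

    def diff(a, b):
        return [x - y for x, y in zip(a, b)]

    features = []
    for t in range(n):
        c = list(frame(t))
        d = diff(frame(t + 2), frame(t - 2))       # delta at frame t
        d_next = diff(frame(t + 3), frame(t - 1))  # delta at frame t + 1
        d_prev = diff(frame(t + 1), frame(t - 3))  # delta at frame t - 1
        dd = diff(d_next, d_prev)                  # delta-delta at frame t
        features.append(c + d + dd)
    return features
```

      With $CFG_VECTOR_LENGTH = 13 this yields 13 + 13 + 13 = 39 values per frame.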

       
