Specific word detection from telephonic audio

Help
2016-02-04
2016-03-08
  • Radhika Patel

    Radhika Patel - 2016-02-04

    I have built a new app for detecting a word using pocketsphinx, and I need to adjust the dictionary for it. I created a text file with my keywords and uploaded it to http://www.speech.cs.cmu.edu/tools/lmtool-new.html. I also created the dictionary and converted the .lm file to a .dmp file using the command "pocketsphinx_continuous -inmic yes -lm 8521.lm -dict 8521.dic". Then I used the hmm.en-us-semi language model and set my threshold value to 'HELLO /1e-1/'. It easily detects a human voice but cannot detect a telephonic voice. So I tried the 8 kHz language model as per the solution given in the forum, but got errors about the mdef file and the dmp file. I followed these steps several times and got the same mdef error each time: "ERROR: "acmod.c",Folder '/path/hmm/en-us-8khz' does not contain acoustic model definition 'mdef'" and "error-: new_Decoder returned -1".

    Can anybody help me fix this error and create a dictionary for telephonic word detection? Are there any other changes required?

     

    Last edit: Radhika Patel 2016-02-04
  • Radhika Patel

    Radhika Patel - 2016-02-05

    Thank you for your quick response...

    But now I am facing another problem: how do I list the language model and dictionary in 'assets.list'? Can you please provide the syntax or a screenshot? Currently I have written 'models/hmm/en-us-8khz/README' this way, but I get the error 'E/error-: sync/models/hmm/en-us-8khz/README.md5'. I don't have any idea how to set up files in assets. Also, in the gram file, how should I specify the threshold value for a particular word?

    Please reply as soon as possible.

     
  • Radhika Patel

    Radhika Patel - 2016-02-05

    It's working, thank you :)

    Can you please tell me how to set a threshold value to get high accuracy in word detection? When there is background noise, random words from the dictionary get decoded.

     
  • David

    David - 2016-02-05

    Hi Radhika,

    Find your answer here

     
  • Radhika Patel

    Radhika Patel - 2016-02-08

    Thank you, it works in 80% of cases. When there is background noise, random words from the dictionary still get decoded.

    Can you please tell me what the best model for telephonic audio detection is in CMU Sphinx?

     
  • David

    David - 2016-02-09

    The best model for telephony is mentioned in Post 2. To address accuracy issues you need to send your config and the audio that you try to decode.

     
  • Radhika Patel

    Radhika Patel - 2016-02-15

    Thank you David

    Below is my digit.gram file:
    HELLO /1-e5/
    HI /1-e1/
    I HELP YOU /1-e20/
    MY NAME IS /1-e20/
    GOOD MORNING /1-e20/
    GOOD AFTERNOON /1-e35/
    GOOD EVENING /1-e25/

    All of these phrases work fine except Good Afternoon, which is not detected. Can you please tell me the right threshold value for Good Afternoon?

     
    • Nickolay V. Shmyrev

      The format for float numbers is 1e-35, not 1-e35. No wonder your gram file does not work.
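
The format point above can be checked mechanically before a keyword list is loaded. The following is a minimal sketch, not part of pocketsphinx: the class name `KeywordListCheck` and the regular expression are assumptions made for illustration, and the pattern only approximates what the decoder accepts.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: checks that a keyword-list line looks like "PHRASE /1e-20/". */
final class KeywordListCheck {
    // A phrase, whitespace, then a float (optionally in scientific notation) between slashes.
    private static final Pattern LINE =
            Pattern.compile("\\S.*\\s/\\d+(\\.\\d+)?([eE][+-]?\\d+)?/\\s*");

    static boolean isValidLine(String line) {
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "GOOD AFTERNOON /1-e35/",   // malformed: "1-e35" is not a float
                "GOOD AFTERNOON /1e-35/");  // well-formed
        for (String line : lines) {
            System.out.println((isValidLine(line) ? "ok      " : "invalid ") + line);
        }
    }
}
```

Running a check like this over every line of the gram file would flag the `1-e35` style entries before they reach the recognizer.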

       
  • Radhika Patel

    Radhika Patel - 2016-02-16

    Thank you Nickolay, it's working.

    Can I tune my threshold value at runtime? For example, if I set the threshold value "HELLO /1-e5/" in my gram file, can I increase or decrease it at runtime based on user input? Is that possible?

     
    • Nickolay V. Shmyrev

      1e-5, not 1-e5. At runtime you can remove the current search and add a new one.
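
One way to act on the advice above is to regenerate the keyword list with the new thresholds and register it again. This is a rough sketch under assumptions: `KeywordListWriter` is a hypothetical helper, thresholds are restricted to the form `1e-N`, and the recognizer calls shown in the trailing comment are not executed here. Whether re-adding a search under the same name replaces the old one, or whether it has to be removed first, should be verified against the pocketsphinx-android version in use.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: rebuilds a keyword list from phrase -> exponent pairs. */
final class KeywordListWriter {
    /** Formats one line per phrase, e.g. "HELLO /1e-5/" for exponent 5. */
    static String format(Map<String, Integer> exponents) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : exponents.entrySet()) {
            sb.append(e.getKey()).append(" /1e-").append(e.getValue()).append("/\n");
        }
        return sb.toString();
    }

    /** Writes the formatted list to the given file, overwriting it. */
    static void write(File file, Map<String, Integer> exponents) throws IOException {
        try (Writer w = new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)) {
            w.write(format(exponents));
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> exponents = new LinkedHashMap<>();
        exponents.put("HELLO", 5);
        exponents.put("GOOD AFTERNOON", 30);
        File file = File.createTempFile("keywords", ".gram");
        write(file, exponents);
        System.out.print(format(exponents));
        // Assumed follow-up inside the Android app (not executed here):
        //   recognizer.stop();
        //   recognizer.addKeywordSearch(DIGITS_SEARCH, file);
        //   recognizer.startListening(DIGITS_SEARCH);
    }
}
```

Keeping the thresholds in a map makes it straightforward to change one value from user input and write the file again.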

       
  • Radhika Patel

    Radhika Patel - 2016-02-16

    HELLO 1e-5/
    HI /1e-1/
    I HELP YOU /1e-20/
    MY NAME IS /1e-20/
    GOOD MORNING /1e-20/
    GOOD AFTERNOON /1e-30/
    GOOD EVENING /1e-25/

    How do I tune these 7 keyphrases?

     
    • Nickolay V. Shmyrev

      Tutorial says:

      Threshold must be tuned to balance between false alarms and missed detections, the best way to tune threshold is to use a prerecorded audio file.

      http://cmusphinx.sourceforge.net/wiki/tutoriallm
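
The trade-off quoted above can be made concrete by decoding the same prerecorded file at several thresholds and counting both kinds of error for each phrase. The sketch below is only an illustration: `DetectionScore` is a hypothetical helper, the phrase lists are made-up sample data, and comparing occurrence counts ignores where in the audio each detection happened.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: counts missed detections and false alarms for one phrase. */
final class DetectionScore {
    /** Returns {missed, falseAlarms}, comparing how often the phrase was spoken vs. detected. */
    static int[] score(String phrase, List<String> expected, List<String> detected) {
        int want = Collections.frequency(expected, phrase);
        int got = Collections.frequency(detected, phrase);
        int missed = Math.max(0, want - got);
        int falseAlarms = Math.max(0, got - want);
        return new int[] {missed, falseAlarms};
    }

    public static void main(String[] args) {
        // Made-up sample data: what was said in the recording vs. what one decoder run reported.
        List<String> expected = List.of("HELLO", "GOOD AFTERNOON", "HELLO");
        List<String> detected = List.of("HELLO", "HI", "HI");
        for (String phrase : List.of("HELLO", "HI", "GOOD AFTERNOON")) {
            int[] s = score(phrase, expected, detected);
            System.out.println(phrase + ": missed=" + s[0] + ", falseAlarms=" + s[1]);
        }
    }
}
```

Many false alarms for a phrase suggest making its threshold stricter, while many misses suggest loosening it; which direction that means numerically is best confirmed on the recording itself.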

       
  • Radhika Patel

    Radhika Patel - 2016-02-17

    Thank you Nickolay

    I read the suggested tutorial. Now I have a question about building the language model: I have the word "hello" in my .dic file, and there are 2 pronunciation options available for it:
    HELLO HH AH L OW
    HELLO(2) HH EH L OW
    If I want to add a 3rd pronunciation for hello, is it possible to add it in the .dic file? Mainly, is the .dic file editable by us or not?

    And another question is: which library is better for telephony use, 'en-us-ptm' or 'cmusphinx-en-us-ptm-8khz-5.2'? I got the same result with both, so can you tell me which one is best?

    Can anybody help me as soon as possible?

     

    Last edit: Radhika Patel 2016-02-18
    • Nickolay V. Shmyrev

      Mainly, is the .dic file editable by us or not?

      It is editable.

      And another question is: which library is better for telephony use, 'en-us-ptm' or 'cmusphinx-en-us-ptm-8khz-5.2'?

      For telephony you need to use 8khz.
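
On the dictionary question, the entries quoted in the thread number alternate pronunciations as `WORD`, `WORD(2)`, so a third one would presumably be `WORD(3)`. The sketch below builds that next line. `DictVariant` is a hypothetical helper, and the phone string `HH EH L AH` is only a placeholder assembled from phones that already appear in the thread, not a recommended pronunciation.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: builds the next alternate-pronunciation line for a .dic file. */
final class DictVariant {
    static String nextVariantLine(List<String> dicLines, String word, String phones) {
        int count = 0;
        for (String line : dicLines) {
            // The first whitespace-separated token is the word, optionally with a "(n)" suffix.
            String head = line.trim().split("\\s+", 2)[0];
            if (head.equals(word) || head.matches(Pattern.quote(word) + "\\(\\d+\\)")) {
                count++;
            }
        }
        String label = count == 0 ? word : word + "(" + (count + 1) + ")";
        return label + " " + phones;
    }

    public static void main(String[] args) {
        List<String> dic = List.of("HELLO HH AH L OW", "HELLO(2) HH EH L OW");
        // Placeholder phones; use only phones that exist in the acoustic model's phone set.
        System.out.println(nextVariantLine(dic, "HELLO", "HH EH L AH"));
    }
}
```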

       

      Last edit: Nickolay V. Shmyrev 2016-02-19
  • Radhika Patel

    Radhika Patel - 2016-02-22

    Thank you,

    I am using the cmusphinx-en-us-ptm-8khz-5.2 library and my thresholds are:
    HELLO /1e-5/
    HI /1e-1/
    I HELP YOU /1e-20/
    MY NAME IS /1e-20/
    WELCOME /1e-15/
    GOOD MORNING /1e-25/
    GOOD AFTERNOON /1e-30/
    GOOD EVENING /1e-28/
    This works fine with all voices except Google Translate's voice, so can you please suggest a solution for this?

     
    • Nickolay V. Shmyrev

      You need to provide data files, command line, logs, results in order to get help on accuracy issues.

       
  • Radhika Patel

    Radhika Patel - 2016-02-23

    I have attached my custom dictionary and gram file, along with my recognizer setup:

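    // defaultSetup() is SpeechRecognizerSetup.defaultSetup() from pocketsphinx-android (static import).
    private void setupRecognizer(File assetsDir) throws IOException {
        Log.e("setupRecognizer", "setupRecognizer");
        // Build the recognizer with the 8 kHz acoustic model and the custom dictionary.
        recognizer = defaultSetup()
                .setAcousticModel(new File(assetsDir, "cmusphinx-en-us-ptm-8khz-5.2"))
                .setDictionary(new File(assetsDir, "6199.dic"))
                .setBoolean("-allphone_ci", true)
                .getRecognizer();
        recognizer.addListener(this);
        // Register the keyword list (one phrase and threshold per line) as a named search.
        File digitsGrammar = new File(assetsDir, "digits.gram");
        recognizer.addKeywordSearch(DIGITS_SEARCH, digitsGrammar);
    }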

    If I increase my threshold value, it also increases false alarms. This works fine with all voices except Google Translate's voice, so can you please suggest a solution for this?

     

    Last edit: Radhika Patel 2016-02-23
    • Nickolay V. Shmyrev

      You need to share the audio file

       
  • Radhika Patel

    Radhika Patel - 2016-02-23

    I have attached a video file showing how I test the app.

    In testing the app, my voice is picked up for most of the key phrases, yet when my client speaks into it, it doesn't pick up any of the key phrases. Can you please tell me why this happens, and what the solution is?

     

    Last edit: Radhika Patel 2016-02-23
    • Nickolay V. Shmyrev

      You need to provide audio file recordings to let me reproduce your problems.

       
  • Radhika Patel

    Radhika Patel - 2016-02-26

    This is my audio recording. Please give me a suggestion as soon as possible.

     

    Last edit: Radhika Patel 2016-02-29
    • Radhika Patel

      Radhika Patel - 2016-03-08

      Hi Nickolay, did you get a chance to reproduce our problem?

       
