
How to get Amplitude of existing language/acoustic models?

Toine db
2015-11-02
2016-01-24
  • Toine db

    Toine db - 2015-11-02

    In short: what was the sound volume (amplitude) of the recordings used to train the English acoustic and language models?

    I want to do some tests with the English acoustic and language models, and I want to get my microphone recordings as close as possible to the training recordings, but I don't know at what level the training data was recorded.

    Is there some way to find out the average amplitude of the training audio for these two models?

    PS: I'm talking about the English models at http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/

     

    Last edit: Toine db 2015-11-04
    • Toine db

      Toine db - 2015-11-12

      Does nobody have any clue what the best volume level is for recognition with the English models?

       
      • Nickolay V. Shmyrev

        It should not matter, amplitude is normalized during training.

         
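        A minimal sketch of what that normalization means, using plain NumPy and toy data (nothing CMUSphinx-specific): a constant recording level only adds a frame-independent offset to the log/cepstral features, and subtracting the per-utterance mean removes it.

            import numpy as np

            def cmn(cepstra):
                # Subtract each coefficient's mean over all frames of the utterance.
                return cepstra - cepstra.mean(axis=0, keepdims=True)

            # Toy cepstral features: 200 frames x 13 coefficients.
            rng = np.random.default_rng(0)
            frames = rng.normal(size=(200, 13))

            # Recording the same speech at a different constant level shifts each
            # log-domain coefficient by a frame-independent offset (a gain of 0.1,
            # i.e. 20 dB quieter, mostly lands in C0).
            offset = np.zeros(13)
            offset[0] = 2.0 * np.log(0.1)
            quieter = frames + offset

            # After CMN both versions are identical: absolute amplitude cancels out.
            print(np.allclose(cmn(frames), cmn(quieter)))   # True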
        • Toine db

          Toine db - 2015-11-13

          OK, great news.

          But I’m confused.
          You mentioned (a long time ago) that the amplitudes of training and recording were too far apart, in this thread message: https://sourceforge.net/p/cmusphinx/discussion/help/thread/9e85fd2f/#a09b/818b/309f/1e60/642f/eee6/d7ac/73d9/df30/0cf9/6d9b/737b/7cc2/cfc3/d580/0ec8

          Did you mean something else?
          (Do I need to normalize, and to what level? Is it the training recordings that I need to normalize, or is that nowadays done automatically by the Sphinx training?)

          PS: that thread was about working with a custom-built training setup, not with a ready-to-use model from Sphinx.

           
  • Nickolay V. Shmyrev

    Custom training was not normalized; US English model training uses CMN, so it is normalized.

     
    • Toine db

      Toine db - 2015-11-18

      OK, good to know.

      So am I right in saying that the normalization is handled by CMN during training?

      And when I use CMN, I do not need to use any gain control on any device/mic input during recognition?

       

      Last edit: Toine db 2015-11-18
    • Toine db

      Toine db - 2016-01-11

      Hi Nickolay,

      Sorry it took a while, but I'm still working on this issue... or rather, this question about the CMN mechanism.

      I do not understand what you were trying to say about CMN when you said:

      US English model training uses CMN, so it is normalized.

      I thought the model gets normalized when you set CMN='None', so you don't need to use any gain control. Or am I mistaken?

      PS: how is the CMN used in the Phonetic US English model?

      I hope to hear from you and finally understand CMN; I think it is a major factor in my recognition results.

       
      • Toine db

        Toine db - 2016-01-18

        @Nickolay, if you have time, could you please give your opinion about my two questions?

        It would be very helpful for me in continuing with the Phonetic add-on.

         

        Last edit: Toine db 2016-01-18
        • Nickolay V. Shmyrev

          I thought the model gets normalized when you set CMN='None', so you don't need to use any gain control. Or am I mistaken?

          CMN=none means disabling cepstral mean normalization. If you disable normalization, you need gain control. Gain control is another form of normalization, which complements or substitutes for CMN.
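
          A rough sketch of where that setting lives, assuming the pocketsphinx-python bindings of that era (Decoder.default_config() and set_string); the model and dictionary paths are placeholders:

              # Assumes the old pocketsphinx-python bindings; paths are placeholders.
              from pocketsphinx.pocketsphinx import Decoder

              config = Decoder.default_config()
              config.set_string('-hmm', '/path/to/en-us')                # acoustic model
              config.set_string('-lm', '/path/to/en-us.lm.bin')          # language model
              config.set_string('-dict', '/path/to/cmudict-en-us.dict')  # dictionary

              # Leaving '-cmn' at its default keeps cepstral mean normalization on.
              # Setting it to 'none' disables CMN, and then gain control on the
              # input signal becomes important.
              # config.set_string('-cmn', 'none')

              decoder = Decoder(config)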

          PS: how is the CMN used in the Phonetic US English model?

          The acoustic model used in LVCSR and in phonetic mode is the same; there is nothing specific about CMN there.

          So am I right in saying that the normalization is handled by CMN during training?

          Yes

          And when I use CMN, I do not need to use any gain control on any device/mic input during recognition?

          CMN does not work very well on very short samples. Gain control works faster on short samples. Gain control improves CMN if implemented properly.
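
          A minimal sketch of the kind of simple gain control meant here (plain NumPy; the function name and target level are made up for illustration): it rescales even a very short buffer immediately, whereas CMN needs enough frames to estimate a stable mean.

              import numpy as np

              def apply_gain_control(samples, target_peak=0.5):
                  # Rescale a (possibly very short) audio buffer to a fixed peak level.
                  # Unlike CMN, this needs no running estimate over many frames.
                  peak = np.max(np.abs(samples))
                  if peak == 0.0:
                      return samples
                  return samples * (target_peak / peak)

              # Example: 0.2 s of audio at 16 kHz, recorded far too quietly.
              rng = np.random.default_rng(1)
              quiet = (0.01 * rng.normal(size=3200)).astype(np.float32)
              print(np.max(np.abs(apply_gain_control(quiet))))   # ~0.5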

           
