
How to get Amplitude of existing language/acoustic models?

Toine db
2015-11-02
2016-01-24
  • Toine db

    Toine db - 2015-11-02

    In short: what was the sound volume (amplitude) of the recordings used to train the English acoustic and language models?

    I want to do some tests with the English acoustic and language models, and I want to get my microphone recordings as close as possible to the training recordings, but I don't know at what level the training data was recorded.

    Is there some way to find out the average amplitude of the training audio for these two models?

    PS: I'm talking about the English models at http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/

     

    Last edit: Toine db 2015-11-04
    • Toine db

      Toine db - 2015-11-12

      Does nobody have any clue what the best volume level is for recognition with the English models?

       
      • Nickolay V. Shmyrev

        It should not matter, amplitude is normalized during training.

         
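        A minimal sketch of what that normalization means, using plain NumPy and toy data (nothing CMUSphinx-specific): a constant recording level only adds a frame-independent offset to the log/cepstral features, and subtracting the per-utterance mean removes it.

            import numpy as np

            def cmn(cepstra):
                # Subtract each coefficient's mean over all frames of the utterance.
                return cepstra - cepstra.mean(axis=0, keepdims=True)

            # Toy cepstral features: 200 frames x 13 coefficients.
            rng = np.random.default_rng(0)
            frames = rng.normal(size=(200, 13))

            # Recording the same speech at a different constant level shifts each
            # log-domain coefficient by a frame-independent offset (a gain of 0.1,
            # i.e. 20 dB quieter, mostly lands in C0).
            offset = np.zeros(13)
            offset[0] = 2.0 * np.log(0.1)
            quieter = frames + offset

            # After CMN both versions are identical: absolute amplitude cancels out.
            print(np.allclose(cmn(frames), cmn(quieter)))   # True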
        • Toine db

          Toine db - 2015-11-13

          OK, great news.

          But I’m confused.
          You mentioned (a long time ago) that the amplitudes of training and recording were too far apart, in this thread message: https://sourceforge.net/p/cmusphinx/discussion/help/thread/9e85fd2f/#a09b/818b/309f/1e60/642f/eee6/d7ac/73d9/df30/0cf9/6d9b/737b/7cc2/cfc3/d580/0ec8

          Did you mean something else?
          (Do I need to normalize, and to what level? Is it the training recordings that I need to normalize, or is that nowadays done automatically by the Sphinx training?)

          PS: that thread was about working with a custom-built training setup, not with a ready-to-use model from Sphinx.

           
  • Nickolay V. Shmyrev

    Custom training was not normalized; US English model training uses CMN, so it is normalized.

     
    • Toine db

      Toine db - 2015-11-18

      OK, good to know.

      So am I right in saying that the normalization is handled by CMN during training?

      And when I use CMN, I do not need to use any gain control on any device/mic input during recognition?

       

      Last edit: Toine db 2015-11-18
    • Toine db

      Toine db - 2016-01-11

      Hi Nickolay,

      Sorry it took a while, but I'm still working on this issue... or rather, this question about the CMN mechanism.

      I do not understand what you were trying to say about CMN when you said:

      US English model training uses CMN, so it is normalized.

      I thought the model gets normalized when you set CMN='None', so you don't need to use any gain control. Or am I mistaken?

      PS: how is the CMN used in the Phonetic US English model?

      I hope to hear from you and finally understand CMN; I think it is a major factor in my recognition results.

       
      • Toine db

        Toine db - 2016-01-18

        @Nickolay, if you have time, could you please give your opinion about my two questions?

        It would be very helpful for me in continuing with the Phonetic add-on.

         

        Last edit: Toine db 2016-01-18
        • Nickolay V. Shmyrev

          I thought the model gets normalized when you set CMN='None', so you don't need to use any gain control. Or am I mistaken?

          CMN=none means disabling cepstral mean normalization. If you disable normalization, you need gain control. Gain control is another form of normalization, which complements or substitutes for CMN.
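
          A rough sketch of where that setting lives, assuming the pocketsphinx-python bindings of that era (Decoder.default_config() and set_string); the model and dictionary paths are placeholders:

              # Assumes the old pocketsphinx-python bindings; paths are placeholders.
              from pocketsphinx.pocketsphinx import Decoder

              config = Decoder.default_config()
              config.set_string('-hmm', '/path/to/en-us')                # acoustic model
              config.set_string('-lm', '/path/to/en-us.lm.bin')          # language model
              config.set_string('-dict', '/path/to/cmudict-en-us.dict')  # dictionary

              # Leaving '-cmn' at its default keeps cepstral mean normalization on.
              # Setting it to 'none' disables CMN, and then gain control on the
              # input signal becomes important.
              # config.set_string('-cmn', 'none')

              decoder = Decoder(config)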

          PS: how is the CMN used in the Phonetic US English model?

          The acoustic model used in LVCSR and in phonetic mode is the same; there is nothing specific about CMN there.

          So am I right in saying that the normalization is handled by CMN during training?

          Yes

          And when I use CMN, I do not need to use any gain control on any device/mic input during recognition?

          CMN does not work very well on very short samples. Gain control works faster on short samples. Gain control improves CMN if implemented properly.
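
          A minimal sketch of the kind of simple gain control meant here (plain NumPy; the function name and target level are made up for illustration): it rescales even a very short buffer immediately, whereas CMN needs enough frames to estimate a stable mean.

              import numpy as np

              def apply_gain_control(samples, target_peak=0.5):
                  # Rescale a (possibly very short) audio buffer to a fixed peak level.
                  # Unlike CMN, this needs no running estimate over many frames.
                  peak = np.max(np.abs(samples))
                  if peak == 0.0:
                      return samples
                  return samples * (target_peak / peak)

              # Example: 0.2 s of audio at 16 kHz, recorded far too quietly.
              rng = np.random.default_rng(1)
              quiet = (0.01 * rng.normal(size=3200)).astype(np.float32)
              print(np.max(np.abs(apply_gain_control(quiet))))   # ~0.5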

           
