accuracy didn't improve on my adaptation set

Help
kk_huk
2016-07-22
2016-08-02
  • kk_huk

    kk_huk - 2016-07-22

    Hi guys,

    I have tried to adapt the en-us acoustic model following the document. I have also mixed my language model with the existing en-us.lm file following this document.

    I then recorded the phrase "close active transaction" and tested it using sphinx4 with the models mentioned above. Sphinx4 returned "close to transaction".
    Footnote: my language model contains the sentence "close active transaction".

    So, to get better accuracy, my acoustic model needed to be adapted, and I adapted it.

    These are my steps:

    1 - Generating acoustic feature files
    2 - Converting the sendump and mdef files
    3 - Accumulating observation counts
    4 - Creating a transformation with MLLR
    5 - MLLR decoding
    6 - Updating the acoustic model files with MAP
    7 - Recreating the adapted sendump file
    8 - Tuning speech recognition accuracy: creating the hyp file
    9 - Tuning speech recognition accuracy: word align script test
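Steps 8 and 9 come down to comparing the recognizer's hypothesis with the reference transcript. A minimal Python sketch of that comparison follows. It is not the word alignment script from the tutorial, and the function name `word_error_rate` is only an illustration; it uses the phrase and the result from the test described above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("close active transaction", "close to transaction")
    print(f"WER: {wer:.1%}")
```

One substituted word out of three gives a word error rate of one third, so on such short commands a single wrong word dominates the score.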

    Default en-us model is phonetically tied. You need to use map adaptation for it, not mllr.

    I skipped steps 4 and 5 because of Nickolay's answer quoted above.

    Every step writes a log file, and none of them shows any error (you can check them).

    I have adapted my acoustic model 10 times using the same recording to get better accuracy. However, the result is the same.

    I have uploaded my models, so you can check them with this link.

    What Am I Missing Here?

     

    Last edit: kk_huk 2016-07-23
    • Nickolay V. Shmyrev

      What Am I Missing Here?

      MAP adaptation requires about 10-20 utterances, you have just 1.

      Adaptation is also not very effective for accented speech. It is better to train a model instead and adapt the dictionary to your pronunciation.

       
      • kk_huk

        kk_huk - 2016-07-25

        Adaptation is also not very effective for accented speech.
        This is really bad news for me. I spent a lot of time adapting the acoustic model automatically. I hoped that adaptation would be effective in getting better accuracy.

        In the "Training Acoustic Model For CMUSphinx" documentation:

        When you don't need to train
        You need to improve accuracy - do acoustic model adaptation instead

        And in the "Adapting the default acoustic model" tutorial:

        you can adapt to your own voice to make dictation good, but you also can adapt to your particular recording environment, your audio transmission channel, your accent or accent of your users.

        I know that, according to the latter tutorial, I need more recordings to adapt to accented speech.

        If you are adapting to a channel, accent or some other generic property of the audio, then you need to collect a little bit more recordings manually.

        Anyway, I need to make progress with adapting the acoustic model, and I really need your help to adapt more effectively.

        MAP adaptation requires about 10-20 utterances, you have just 1.
        Can I use 19 copies of the same recording for that? Does that work, or do the recordings need to be different?

        Edit: I tested 19 copies of the recording, but it didn't change anything until I added an alternative phonetic spelling that fits my accent, like the one below. With that entry it works.
        active(2) AA K T IH V

        adapt the dictionary to your pronunciation.
        Do you mean that I need to manually change each word's phonetic spelling to match the accent? We are talking about hundreds of thousands of words.

        If so, is there any tool you know of that can write these dynamically?

        Thanks.

         

        Last edit: kk_huk 2016-07-25
        • Nickolay V. Shmyrev

          I tested 19 copies of the recording, but it didn't change anything.

          It is not reasonable to copy the same recording; you need different recordings. Also, adaptation works for slight accents; it does not work for a pronunciation as different as yours.

          If so, is there any tool you know of that can write these dynamically?

          There is no such tool, you can develop it yourself.
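Such a tool could start as a small script that applies phone substitution rules to an existing dictionary and emits alternative pronunciations in the `word(2)` form of the `active(2)` entry shown earlier in the thread. A minimal sketch, assuming a single hypothetical rule (AE spoken as AA); the rule table and function names are illustrations and not part of CMUSphinx:

```python
# Hypothetical accent rules: dictionary phone -> phone the speaker produces.
ACCENT_RULES = {"AE": "AA"}


def accent_variant(phones):
    """Return the accented phone list, or None if no rule changes anything."""
    variant = [ACCENT_RULES.get(p, p) for p in phones]
    return variant if variant != phones else None


def add_variants(dict_lines):
    """Return dictionary lines with accented alternatives appended per word."""
    pronunciations = {}  # word -> list of phone lists, in input order
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue
        word = parts[0].split("(")[0]  # "active(2)" -> "active"
        pronunciations.setdefault(word, []).append(parts[1:])

    output = []
    for word, plist in pronunciations.items():
        # Iterate over a copy so that appended variants are not re-processed.
        for phones in list(plist):
            variant = accent_variant(phones)
            if variant is not None and variant not in plist:
                plist.append(variant)
        for n, phones in enumerate(plist, start=1):
            head = word if n == 1 else f"{word}({n})"
            output.append(f"{head} {' '.join(phones)}")
    return output


if __name__ == "__main__":
    for entry in add_variants(["active AE K T IH V", "show SH OW"]):
        print(entry)
```

A blanket rule like this will also generate variants for words the speaker pronounces correctly, so in practice the rules would need to be checked against real recordings before the extended dictionary is used.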

           
          • Nickolay V. Shmyrev

            This is really bad news for me. I spent a lot of time adapting the acoustic model automatically. I hoped that adaptation would be effective in getting better accuracy.

            You could spend much more time going the wrong way if you avoid asking proper questions with the necessary details.

             
            • kk_huk

              kk_huk - 2016-07-26

              It is not reasonable to copy the same recording; you need different recordings.

              Yes, I tested it, and it doesn't work this way.

              Also, adaptation works for slight accents; it does not work for a pronunciation as different as yours.

              Could you please give me a sample or scenario for testing whether the adaptation process works? I need to make sure that I am adapting my acoustic model properly, because so far either my test cases already worked before adaptation, or the process did not change the accuracy at all, as in my first question in this thread.

              Thanks for your response Nickolay,

               
              • Nickolay V. Shmyrev

                If you are unable to record the arctic sentences yourself, you can download them from the festvox website:

                http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_slt_arctic/wav/

                 
                • kk_huk

                  kk_huk - 2016-07-27

                  If you are unable to record the arctic sentences yourself, you can download them from the festvox website:

                  I can record audio properly for my adaptation process. You probably misunderstood me. I just need to make sure whether I am adapting properly or not. I need to show that the adaptation process improves the accuracy.

                  In my adaptation process, it doesn't change anything.

                  For example, I said "just show" and decoded it in sphinx4 using my models and dictionary. I should add that the words were spoken with an en-us accent. Sphinx4 gave me "just shawl" as the result. I checked my dic file to make sure the words were pronounced the right way.
                  I mean, in the "close active transaction" sample, I was saying "AActive" instead of "AE C tive". That is why it didn't get the right result (I guess). So I went through my dic file to check the phonetic spelling of the words.

                  In my dic file:

                  shawl SH AO L
                  show SH OW

                  I realised that I had already pronounced "show" properly, so adaptation should be the way to fix the wrong result and improve the accuracy. I recorded 10 different utterances of "just show" with an en-us accent and adapted my acoustic model with them. I tested after this process, and nothing changed.

                  Oh my gosh, why?

                   

                  Last edit: kk_huk 2016-07-27
                  • Nickolay V. Shmyrev

                    You probably misunderstood me. I just need to make sure whether I am adapting properly or not.

                    You did not adapt properly.

                    I recorded 10 different utterances of "just show" with an en-us accent.

                    You need to record diverse sentences as explained in the tutorial, not the same sentence 10 times.
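One quick sanity check before running adaptation is to count how many distinct sentences and words the transcription file actually holds. A sketch, assuming the `<s> text </s> (fileid)` line layout used in the adaptation tutorial; the function name and the `rec_NN` file ids are made up for illustration:

```python
def summarize_transcription(lines):
    """Count utterances, distinct sentences and distinct words."""
    sentences = []
    for line in lines:
        # Split off the trailing "(fileid)" part, then drop sentence markers.
        text, sep, _fileid = line.rpartition("(")
        if not sep:
            continue  # no "(fileid)" found, so not a transcription line
        words = [w.lower() for w in text.split() if w not in ("<s>", "</s>")]
        if words:
            sentences.append(" ".join(words))
    distinct = set(sentences)
    vocabulary = {w for s in distinct for w in s.split()}
    return {
        "utterances": len(sentences),
        "distinct_sentences": len(distinct),
        "distinct_words": len(vocabulary),
    }


if __name__ == "__main__":
    # Ten recordings of the same phrase, as in the experiment described above.
    same_phrase = [f"<s> just show </s> (rec_{i:02d})" for i in range(10)]
    print(summarize_transcription(same_phrase))
```

For ten recordings of "just show" this reports ten utterances but only one distinct sentence and two distinct words, which shows how little of the model such a set can touch.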

                     
                    • kk_huk

                      kk_huk - 2016-07-27

                      You did not adapt properly.
                      For every step of the adaptation I create a log file. There are no error or warning messages.

                      Could you please go through my adaptation folder? It contains the bat file that I run for the adaptation process, so you can also check my adaptest.bat file.

                      You need to record diverse sentences as explained in the tutorial, not the same sentence 10 times.

                      I really don't get it. Adapting with 10 different recordings of "just show" didn't work, yet I should expect that recordings of 10 diverse sentences will adapt my acoustic model to "just show". How can that be?

                      By the way, my main aim is to improve the accuracy for the sentence "just show". To improve it, I apparently need recordings of sentences other than "just show". I recorded the sentences below, in this order.

                      1. just show
                      2. do sometimes
                      3. evil queen
                      4. ex boyfriend
                      5. fewer errors
                      6. fiber optic
                      7. field position
                      8. fifteen balls
                      9. fifty hour
                      10. grammatical rules

                      After the adaptation process, I tested again. My test recording is in the Test folder, and it was decoded with the adapted acoustic model. The result is still wrong:
                      "just shawl"

                      Please check my folder.

                       

                      Last edit: kk_huk 2016-07-27
                      • Nickolay V. Shmyrev

                        You have several issues:

                        1) You do not have enough data for training. The adaptation set is 20 large sentences, not 10 small ones.
                        2) Your language model is broken. It has a perplexity of 7000 on the test set; even the default model is better than yours. You need to prepare the language model properly, otherwise it will negate any improvement in the acoustic model.
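For context, perplexity is the inverse geometric mean of the probabilities the language model assigns to each word of the test text, so a value of 7000 means the model is on average as uncertain as if it were picking uniformly among 7000 words. A minimal sketch of the arithmetic, assuming base-10 log probabilities as stored in ARPA-format models; the sample values are made up for illustration:

```python
import math


def perplexity(log10_probs):
    """Perplexity from per-word base-10 log probabilities."""
    if not log10_probs:
        raise ValueError("need at least one word probability")
    average = sum(log10_probs) / len(log10_probs)
    return 10 ** (-average)


if __name__ == "__main__":
    # A model that gives every test word a probability of 1/100.
    print(perplexity([-2.0, -2.0, -2.0]))
    # A model that gives every test word a probability of 1/7000.
    print(round(perplexity([math.log10(1 / 7000)] * 5)))
```

The lower the value, the better the model predicts the test sentences; a model that already contains the command phrases should score far below the thousands on a test set of those commands.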

                         
                        • kk_huk

                          kk_huk - 2016-07-28

                          Hi Nickolay,

                          1) You do not have enough data for training. The adaptation set is 20 large sentences, not 10 small ones.

                          As I mentioned before, I just need to improve accuracy for command texts. If I use 20 long sentences, how can that help?

                          2) Your language model is broken. It has a perplexity of 7000 on the test set; even the default model is better than yours. You need to prepare the language model properly, otherwise it will negate any improvement in the acoustic model.

                          The default acoustic model was adapted using large sentences, so if I try to adapt the model using a few small ones, it will not help to get a better result.

                          Am I right? If so, I need an acoustic model that is trained on command texts.

                          Is there any command-and-control acoustic model that I can use for that?

                          By the way, have you gone through my bat file, which contains all the command lines for the adaptation? It was created following the documentation, and I just want to make sure there are no wrong commands in it.

                          Thanks a lot

                           

                          Last edit: kk_huk 2016-08-01
                          • kk_huk

                            kk_huk - 2016-08-02

                            Is there anyone who can help me? I need some help, everyone. I am badly stuck on this adaptation issue and just want some advice.

                             
