About training acoustic model

Help
kk_huk
2016-07-20
2020-07-20
  • kk_huk

    kk_huk - 2016-07-20

    I have followed the tutorial to train my acoustic model. However,
    I am stuck on some steps because they are not clear enough for me.

    You will need to install software as an administrator root.

    Which software? Does it mean that each package, such as sphinxbase, pocketsphinx, etc., should be installed?

    If it does, I have already installed them.

    After you installed the software you may need to update the system configuration so the system will be able to find the dynamic libraries. For example

    I am using cygwin64 and my root folder is inside it. However, I don't know which file the "system configuration" refers to.

    export PATH=/usr/local/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/lib
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
    

    Which file should I add this code block to?

    After that, when I run the command below, it creates two empty files, feat.params and sphinx_train.cfg. I have probably missed something.

    python ../sphinxtrain/scripts/sphinxtrain -t mytest setup
    
     

    Last edit: kk_huk 2016-07-20
  • Arseniy Gorin

    Arseniy Gorin - 2016-07-20

    If it produces some output without throwing an error message, it is likely that you installed things properly.

    It looks like the data files in your "mytest" folder were prepared incorrectly.

    Can you first run the tutorial as written for the an4 folder? If this works and you want to train on your own data, you can then check whether things are fine there.

     
    • kk_huk

      kk_huk - 2016-07-20

      I have tried this command for the an4 folder. The output is the same. It gives me two empty files, feat.params and sphinx_train.cfg, inside the etc folder.

       
      • Arseniy Gorin

        Arseniy Gorin - 2016-07-20

        Are you sure the an4 folder location was specified correctly in this case? Try to find it and pass an absolute path.
        If this does not help, post the full training log so we can trace your problem.

         
        • kk_huk

          kk_huk - 2016-07-21

          The mytest folder is located in the same place as the an4 folder, as you can see in the 3 attachments.

          C:\cygwin64\adaptation\an4>python C:\cygwin64\adaptation\sphinxtrain\scripts/sphinxtrain -t an4 setup

          There is no error message. The output is:

          Sphinxtrain path: C:/cygwin64/adaptation/sphinxtrain
          Sphinxtrain binaries path: C:/cygwin64/adaptation/sphinxtrain/bin/Release/Win32
          Setting up the database an4
          

          and two empty files inside the etc folder.

           

          Last edit: kk_huk 2016-07-21
          • Arseniy Gorin

            Arseniy Gorin - 2016-07-21

            Try running python from C:\cygwin64\adaptation, not from C:\cygwin64\adaptation\an4.
            The training script does not see the an4 folder.

             
            • kk_huk

              kk_huk - 2016-07-21

              I have changed it as you said:

              C:\cygwin64\adaptation>python C:\cygwin64\adaptation\sphinxtrain\scripts/sphinxtrain -t /an4 setup
              

              The output is:

              Sphinxtrain path: C:/cygwin64/adaptation/sphinxtrain
              Sphinxtrain binaries path: C:/cygwin64/adaptation/sphinxtrain/bin/Release/Win32
              Setting up the database /an4
              

              Unfortunately, there are again two empty files, feat.params and sphinx_train.cfg, inside the etc folder.

               
              • Arseniy Gorin

                Arseniy Gorin - 2016-07-21

                I went through the tutorial on cygwin. I see no problem like the one you have.

                1) I noticed in your last command you use "/an4". It should be "an4". And please before running python check if it can be found and has files inside (do "ls an4" or whatever you put in train script option)
                2) Another thing is "system configuration file". It is located in ~/.bashrc (/home/Admin/.bashrc). You should open it with text editor and add the lines from the tutorial.
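To double-check point 2, here is a minimal sketch that reports which of the three variables from the tutorial's export lines are not visible to Python. It assumes you run it with the Cygwin Python from a fresh Cygwin bash shell (so that ~/.bashrc has been read and entries are colon-separated); the name `missing_entries` is made up for this example.

```python
import os

# Directories copied from the export lines in the tutorial.
EXPECTED = {
    "PATH": "/usr/local/bin",
    "LD_LIBRARY_PATH": "/usr/local/lib",
    "PKG_CONFIG_PATH": "/usr/local/lib/pkgconfig",
}


def missing_entries(environ=None):
    """Return the variable names whose value lacks the expected directory."""
    env = os.environ if environ is None else environ
    missing = []
    for name, directory in EXPECTED.items():
        # Assumption: colon-separated entries, as in a Cygwin or Linux shell.
        entries = env.get(name, "").split(":")
        if directory not in entries:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_entries():
        print(f"{name} does not include {EXPECTED[name]}; check ~/.bashrc")
```

If it prints nothing, the lines from ~/.bashrc were picked up by the shell you started Python from.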

                Hope it helps

                 

                Last edit: Arseniy Gorin 2016-07-22
                • kk_huk

                  kk_huk - 2016-07-22

                  Hi Arseniy,

                  1) I noticed in your last command you use "/an4". It should be "an4". And please before running python check if it can be found and has files inside (do "ls an4" or whatever you put in train script option)

                  I tried it, but the result is the same.

                  2) Another thing is "system configuration file". It is located in ~/.bashrc (/home/Admin/.bashrc). You should open it with text editor and add the lines from the tutorial.

                  I have already done as you said. I don't know why but, the result is the same :/

                   
                  • kk_huk

                    kk_huk - 2016-07-28

                    Hi again Arseniy,

                    I have reinstalled all the tools (sphinxbase, pocketsphinx and sphinxtrain) and run the command below in the terminal.

                    python C:\cygwin64\adaptation\sphinxtrain\scripts/sphinxtrain -t an4 setup

                    Finally, feat.params and sphinx_train.cfg are there inside etc. However, after the process there are no folders such as logdir, model_parameters, model_architecture and result, as indicated in the documentation. I only have:

                    etc
                    wav

                    After training other data folders will be created, the database should look like this:

                    etc
                    feat
                    logdir
                    model_parameters
                    model_architecture
                    result
                    wav
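A quick way to compare a database folder against the two listings above is sketched below. The folder names are copied from the documentation excerpt; the function name `missing_folders` is invented for this example. Note that the excerpt describes the second listing as the state after training, so it probably only makes sense to check it after the training run rather than right after setup.

```python
from pathlib import Path

# Folder names taken from the documentation listings quoted above.
BEFORE_TRAINING = ["etc", "wav"]
AFTER_TRAINING = [
    "etc", "feat", "logdir", "model_parameters",
    "model_architecture", "result", "wav",
]


def missing_folders(db_dir, expected):
    """Return the names in `expected` that are not subfolders of db_dir."""
    root = Path(db_dir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # "an4" is the database folder used in this thread; adjust as needed.
    print("missing:", missing_folders("an4", AFTER_TRAINING))
```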

                    You can see my adaptation and etc folder as attachments

                     
                    • Arseniy Gorin

                      Arseniy Gorin - 2016-07-28

                      Sorry, I really could not find out why this does not work for you...

                       
                      • Arseniy Gorin

                        Arseniy Gorin - 2016-07-29

                        So, if I understood correctly after discussing with you, this was essentially a bug in the config file (wav used instead of sph).

                        In the future, please check the log files and provide them so we can find the solution more easily.

                         
  • kk_huk

    kk_huk - 2016-08-01

    Hi Arseniy,

    After running the command below, the bwaccumdir, feat and logdir folders and the an4.html file are created.

    python C:\cygwin64\adaptation\sphinxtrain\scripts/sphinxtrain run

    I have checked the log files, and there are some errors.

    ERROR: "sphinx_fe.c", line 118: Failed to open C:/cygwin64/adaptation/an4/wav/an4test_clstk/fcaw/an406-fcaw-b.wav: No such file or directory

    The an4.html file also has some warning messages. For instance:

    WARNING: Error in 'C:/cygwin64/adaptation/an4/etc/an4_train.fileids', the feature file 'C:/cygwin64/adaptation/an4/feat/an4_clstk/fash/an251-fash-b.mfc' does not exist, or is empty
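Messages like these can be narrowed down by checking whether every entry in the fileids list has an audio file with the extension the trainer is looking for. The sketch below assumes that each line of the .fileids file is a relative path without extension (as the paths in the messages above suggest); `missing_audio` is a name made up for this example.

```python
from pathlib import Path


def missing_audio(fileids_path, wav_dir, extension):
    """Return the ids in the fileids list that have no <id>.<extension> file."""
    wav_root = Path(wav_dir)
    missing = []
    for line in Path(fileids_path).read_text().splitlines():
        utt_id = line.strip()
        if not utt_id:
            continue  # skip blank lines
        if not (wav_root / f"{utt_id}.{extension}").is_file():
            missing.append(utt_id)
    return missing


if __name__ == "__main__":
    # Paths as used in this thread; adjust to your own database folder.
    for ext in ("wav", "sph"):
        gaps = missing_audio("an4/etc/an4_train.fileids", "an4/wav", ext)
        print(f".{ext}: {len(gaps)} ids without an audio file")
```

If every id is missing for one extension and none for another, the extension in the config is the likely culprit.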

    I have attached my training data. Could you please check my training folder and help me fix the problem?

    http://www.filedropper.com/an4_1

     

    Last edit: kk_huk 2016-08-01
    • Arseniy Gorin

      Arseniy Gorin - 2016-08-01

      Sorry, I thought I had sent you the solution in a personal response after you provided the files a few days ago. Just in case, I am repeating it here:
      Your audio files are in sph format (NIST). You should modify your config file:

      Replace
      $CFG_WAVFILE_EXTENSION = 'wav';
      $CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw

      With
      $CFG_WAVFILE_EXTENSION = 'sph';
      $CFG_WAVFILE_TYPE = 'nist'; # one of nist, mswav, raw
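If you are not sure which type your own recordings are, the first bytes of the file usually tell. The sketch below relies on two well-known signatures: NIST SPHERE files start with "NIST_1A", and MS WAV files start with "RIFF" and carry "WAVE" at bytes 8 to 11. The function name `sniff_audio_type` is invented for this example, and headerless raw audio cannot be recognised this way, so anything else is reported as "unknown".

```python
def sniff_audio_type(path):
    """Guess the audio container from the file header.

    Returns "nist", "mswav" or "unknown". The first two match the
    values accepted by $CFG_WAVFILE_TYPE in the config shown above.
    """
    with open(path, "rb") as handle:
        header = handle.read(12)
    if header.startswith(b"NIST_1A"):
        return "nist"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "mswav"
    return "unknown"


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, sniff_audio_type(name))
```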

       
  • kk_huk

    kk_huk - 2016-08-02

    Hi Arseniy,

    I have not received any message from you. There is probably something wrong with this website.

    Anyway, thanks a lot. I hadn't realised that the an4 folder uses a different sound file type until you told me. I quickly changed the config file. Unfortunately, I now get some different errors.

    In the cmd terminal:

    ERROR: This step had 1 ERROR messages and 0 WARNING messages.  Please check the log file for details.
    
    Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    Waiting for C:/cygwin64/adaptation/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    Waiting for C:/cygwin64/adaptation/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    Waiting for C:/cygwin64/adaptation/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    ...
    

    The above message keeps repeating.

    I have also checked the log files for error messages.

    In the an4.cpmeancihmm.log file:

    ERROR: "s3io.c", line 260: Unable to open C:\cygwin64\adaptation\an4\model_parameters\an4.ci_cont_flatinitial\globalmean for reading: No error
    Tue Aug  2 11:30:49 2016
    

    In the an4.cpvarcihmm.log file:

    ERROR: "s3io.c", line 260: Unable to open C:\cygwin64\adaptation\an4\model_parameters\an4.ci_cont_flatinitial\globalvar for reading: No error
    Tue Aug  2 11:30:49 2016
    

    In the an4.makeflatcihmm.log file:

    ERROR: "model_def_io.c", line 413: Unable to open C:/cygwin64/adaptation/an4/model_architecture/an4.ci.mdef for reading: No error
    Tue Aug  2 11:30:46 2016
    

    In the an4.normmeancihmm.log file:

    ERROR: "s3io.c", line 260: Unable to open C:/cygwin64/adaptation/an4/bwaccumdir/an4_buff_1/gauden_counts for reading: No error
    

    In the an4.normvarcihmm.log file:

    ERROR: "s3io.c", line 260: Unable to open C:/cygwin64/adaptation/an4/bwaccumdir/an4_buff_1/gauden_counts for reading: No error
    
     

    Last edit: kk_huk 2016-08-02
    • Nickolay V. Shmyrev

      All your troubles are due to cygwin. You either need to use native Windows sphinxtrain or Linux, I don't think we support Cygwin.

       
  • kk_huk

    kk_huk - 2016-08-03

    All your troubles are due to cygwin. You either need to use native Windows sphinxtrain or Linux, I don't think we support Cygwin.

    Are you sure about that, Nickolay?

    Your colleague Arseniy doesn't agree with you.

    Arseniy Gorin - 2016-07-21
    I went through the tutorial on cygwin. I see no problem like the one you have.

     
    • Arseniy Gorin

      Arseniy Gorin - 2016-08-03

      In general, I agree with Nickolay that Cygwin is kind of not supported and can produce various bugs. My personal opinion is that using Windows for an open-source project is just wrong from many technical, ideological and general-spirit points of view. So I'd suggest installing Linux ASAP for your experiments, which will make life easier and the people around you happier :)

      Nevertheless, it is true that I succeeded in running the an4 example on Cygwin installed (not without pain, though) on a Windows XP virtual machine running under Ubuntu. I strictly followed the tutorial, and downloaded and installed sphinx5-prealpha as described here.

      The only things I had to modify in the config were:
      - replace wav with sph, as I suggested earlier
      - set the language model name to make the decoder happy: $CFG_DB_NAME.ug.lm.DMP
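For anyone who prefers to script the first change instead of editing by hand, here is a rough sketch. It works on the text of sphinx_train.cfg and only touches the two WAVFILE settings quoted earlier in this thread; `set_cfg_value` is a made-up helper, not part of sphinxtrain, and it assumes the value is a quoted string on a single line.

```python
import re


def set_cfg_value(cfg_text, name, value):
    """Replace the quoted value assigned to $name, keeping the rest of the line."""
    pattern = re.compile(
        r"^([ \t]*\$" + re.escape(name) + r"[ \t]*=[ \t]*)(['\"]).*?\2",
        re.MULTILINE,
    )
    # A function replacement avoids backslash surprises in `value`.
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}'{value}'", cfg_text)
    if count == 0:
        raise KeyError(f"${name} not found in config text")
    return new_text


if __name__ == "__main__":
    sample = (
        "$CFG_WAVFILE_EXTENSION = 'wav';\n"
        "$CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw\n"
    )
    patched = set_cfg_value(sample, "CFG_WAVFILE_EXTENSION", "sph")
    patched = set_cfg_value(patched, "CFG_WAVFILE_TYPE", "nist")
    print(patched)
```

Keep a backup of the original cfg before writing the patched text back.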

      Then I ran, from the an4 folder: "sphinxtrain run"

      I attach the output and the full cfg file.

      As you can see, I also get some errors, but they refer to misaligned utterances:

              Convergence Ratio = 0.12839726834477
              Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
              0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 
      ERROR: This step had 2 ERROR messages and 0 WARNING messages.  Please check the log file for details.
              Normalization for iteration: 4
      

      This can happen because of inaccurate initial acoustic models or an imprecise correspondence between the transcript and the audio for those utterances. We can see this by grepping the trainer's log messages:

      grep ERR logdir/50.cd_hmm_tied/an4.*
      logdir/50.cd_hmm_tied/an4.2.2-1.bw.log:ERROR: "baum_welch.c", line 324: mdxs/cen7-mdxs-b ignored
      logdir/50.cd_hmm_tied/an4.2.3-1.bw.log:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
      logdir/50.cd_hmm_tied/an4.2.3-1.bw.log:ERROR: "baum_welch.c", line 324: mdxs/cen7-mdxs-b ignored
      logdir/50.cd_hmm_tied/an4.2.4-1.bw.log:ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
      logdir/50.cd_hmm_tied/an4.2.4-1.bw.log:ERROR: "baum_welch.c", line 324: mdxs/cen7-mdxs-b ignored
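On a setup without grep, roughly the same search can be done with a few lines of Python. This is only a sketch: it assumes the trainer logs end in .log (as the file names in this thread do) and walks the whole logdir tree rather than a single stage folder; `find_errors` is a name chosen for this example.

```python
from pathlib import Path


def find_errors(log_dir, marker="ERROR"):
    """Return (file, line number, text) for every log line containing marker."""
    hits = []
    for log_file in sorted(Path(log_dir).rglob("*.log")):
        # errors="replace" keeps the scan going if a log has odd bytes.
        with open(log_file, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    hits.append((str(log_file), number, line.rstrip()))
    return hits


if __name__ == "__main__":
    for path, number, text in find_errors("logdir"):
        print(f"{path}:{number}:{text}")
```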
      

      Make sure you install the 5-prealpha version of Sphinx. If that does not help, I vote for switching to Linux.

       

      Last edit: Arseniy Gorin 2016-08-03
  • ijazulhassan

    ijazulhassan - 2020-07-20

    Can you please help? My only error concerns the .mfc files: it says the feature .mfc files do not exist or are empty. I am attaching my db with the log and cfg files. Below are the log file and the db.

     

    Last edit: ijazulhassan 2020-07-20
  • ijazulhassan

    ijazulhassan - 2020-07-20

    Here are the other files!

     
