lda.py failed to create LDA transform with status 0

Orest
2015-06-19
2020-01-06
  • Orest

    Orest - 2015-06-19

    Hi, I am trying to train an acoustic model with LDA and MLLT feature transforms, following the guide at http://cmusphinx.sourceforge.net/wiki/ldamllt

    If I set $CFG_LDA_MLLT = 'yes'; training fails. If I leave it set to "no", training on the same database completes successfully.

    This is the log, using a small database for testing:

    sphinxtrain run
    Sphinxtrain path: /opt/sphinxtrain/lib/sphinxtrain
    Sphinxtrain binaries path: /opt/sphinxtrain/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from  segments starting at  (part 1 of 1)
    Extracting features from  segments starting at  (part 1 of 1)
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 129265 words using 40 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 27.0226944444444
            Rule of thumb suggests 3000, however there is no correct answer
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 129262
            Words in filler dictionary: 3
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
        Phase 1: Cleaning up directories:
            accumulator...logs...qmanager...
        Phase 2: Flat initialize
        Phase 3: Forward-Backward
            Baum welch starting for LDA, iteration: 1 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
            Normalization for iteration: 1
            Current Overall Likelihood Per Frame = -143.681494052838
            Baum welch starting for LDA, iteration: 2 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 120 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Normalization for iteration: 2
            Current Overall Likelihood Per Frame = -143.090839027346
            Convergence Ratio = 0.590655025491827
            Baum welch starting for LDA, iteration: 3 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 194 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Normalization for iteration: 3
            Current Overall Likelihood Per Frame = -141.891246586154
            Convergence Ratio = 1.19959244119201
            Baum welch starting for LDA, iteration: 4 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 226 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Normalization for iteration: 4
            Current Overall Likelihood Per Frame = -141.244086324103
            Convergence Ratio = 0.64716026205096
            Baum welch starting for LDA, iteration: 5 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 354 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Normalization for iteration: 5
            Current Overall Likelihood Per Frame = -141.049595544081
            Convergence Ratio = 0.194490780022079
            Baum welch starting for LDA, iteration: 6 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 434 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Normalization for iteration: 6
            Current Overall Likelihood Per Frame = -140.854223619843
            Convergence Ratio = 0.195371924237804
            Baum welch starting for LDA, iteration: 7 (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 476 ERROR messages and 0 WARNING messages.  Please check the log file for details.
            Normalization for iteration: 7
            Current Overall Likelihood Per Frame = -140.771353617286
            Current Overall Likelihood Per Frame = -140.771353617286
            Convergence Ratio = 0.0828700025566889
        Phase 4: LDA transform estimation
            Baum welch starting for LDA, iteration: N (1 of 1)
            0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 500 ERROR messages and 0 WARNING messages.  Please check the log file for details.
    lda.py failed to create LDA transform with status 0
    

    sphinxtrain hangs at that last line, and I have to press Ctrl-C to stop it.
    Things I tried in order to solve the problem:

    • I downloaded sphinxtrain and reinstalled it to make sure I have the latest version
    • I checked that the necessary modules are installed; this is my output:
    python
    Python 2.7.3 (default, Mar 13 2014, 11:03:55) 
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy
    >>> import scipy
    >>> import scipy.optimize
    >>>
    
    • I tried:
    sudo apt-get install python-scipy
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    python-scipy is already the newest version.
    

    I don't understand whether the issue is my installation, my sphinxtrain configuration file, or something else.

    I shared my log folder and sphinx_train.cfg in case that helps: http://www.filedropper.com/folderldatar

     
  • Orest

    Orest - 2015-06-26

    I can't seem to find the solution. Does anyone know what it could be?

     
    • Nickolay V. Shmyrev

      You can run lda.py from the command line and check what is going on. If you still have trouble, you can share your whole model folder.
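One way to follow this suggestion, as a minimal sketch: run the tool by hand and look at how the process ended, not only at what it printed. The helper names below (`describe_exit`, `run_and_report`) are hypothetical and not part of sphinxtrain. The sketch assumes Python 3.5 or later on a POSIX system, where a negative return code from `subprocess` means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally (status 0)"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-returncode, name)
    return "exited with status %d" % returncode


def run_and_report(argv):
    """Run a command, let its output pass through, and describe how it ended."""
    proc = subprocess.run(argv)
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.stderr.write("Usage: %s COMMAND [ARGS...]\n" % sys.argv[0])
        sys.exit(1)
    print(run_and_report(sys.argv[1:]))
```

Saved under a hypothetical name such as check_exit.py, it would be called with the command to test as its arguments. A crash inside a native library then shows up as a signal name instead of a misleading status.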

       
  • Orest

    Orest - 2015-07-06

    Unfortunately I can't share the whole model folder, but I would still appreciate any kind of help.

    My training task for this test is called LDA2. After it failed, LDA2/logdir/01.lda_train/LDA2.lda_train.log contains only a date (for example "Mon Jul 6 12:06:03 2015"), so I don't know what parameters I'm supposed to use. However, I gave it a try:

    I installed sphinxtrain using --prefix=/opt/sphinxtrain

    so if I run

    :/opt/sphinxtrain/lib/sphinxtrain/python/cmusphinx$ ./lda.py 
    
    Usage: ./lda.py OUTFILE ACCUMDIRS...
    

    So I assume "ACCUMDIRS" refers to the "bwaccumdir" folder created during training, and "OUTFILE" refers to an output file that lda.py creates.

    My "bwaccumdir" directory contains only one folder, in this case called "LDA2_buff_1" ($CFG_NPART = 1, $CFG_QUEUE_TYPE = "Queue";)

    So I tried to run lda.py this way:

    :/opt/sphinxtrain/lib/sphinxtrain/python/cmusphinx$ ./lda.py /home/sites/train/outputfileTest /home/sites/train/LDA2/bwaccumdir/LDA2_buff_1
    

    and I get this output:

    Sw:
    [[  2.66700592e+08   2.84648940e+07  -2.14563200e+07 ...,   2.23908725e+06
        2.14067637e+04   1.63672838e+06]
     [  2.84648940e+07   2.00562064e+08   1.03686380e+07 ...,  -1.45862175e+06
        7.75372312e+05  -8.24868125e+05]
     [ -2.14563200e+07   1.03686380e+07   3.06287840e+08 ...,   2.35109800e+06
       -2.00122888e+06  -5.52205820e+04]
     ..., 
     [  2.23908725e+06  -1.45862175e+06   2.35109800e+06 ...,   3.29112608e+08
        3.11835000e+07  -5.43095550e+06]
     [  2.14067637e+04   7.75372312e+05  -2.00122888e+06 ...,   3.11835000e+07
        2.70015584e+08   2.68993420e+07]
     [  1.63672838e+06  -8.24868125e+05  -5.52205820e+04 ...,  -5.43095550e+06
        2.68993420e+07   2.26384096e+08]]
    Sb:
    [[  1.88875598e+08   1.01030293e+08   1.67866634e+07 ...,   8.45230936e+05
        9.28340740e+05   3.17827635e+05]
     [  1.01030293e+08   9.77985688e+07   1.76942269e+07 ...,   1.38454926e+04
        8.15402921e+05  -6.57197107e+04]
     [  1.67866634e+07   1.76942269e+07   9.20856123e+07 ...,   1.78193998e+06
        3.13945561e+04  -5.76040886e+05]
     ..., 
     [  8.45230936e+05   1.38454926e+04   1.78193998e+06 ...,   5.19614766e+05
        4.21012672e+04  -7.28208964e+04]
     [  9.28340740e+05   8.15402921e+05   3.13945561e+04 ...,   4.21012672e+04
        1.26348326e+05   2.24965849e+04]
     [  3.17827635e+05  -6.57197107e+04  -5.76040886e+05 ...,  -7.28208964e+04
        2.24965849e+04   1.00663826e+05]]
    Illegal instruction
    

    If I run the copy of lda.py in the build folder instead, located in:

    /opt/sphinxtrain/lib/sphinxtrain/python/build/lib.linux-x86_64-2.7/cmusphinx
    

    I get the same output. The same training folder completes successfully if I don't use the LDA_MLLT training option.
    Does this information help narrow down the reason for the failure?

     
    • Nickolay V. Shmyrev

      Illegal instruction

      This is the actual error, which you can google for.

      http://mail.scipy.org/pipermail/numpy-discussion/2013-January/065247.html

      suggests that your scipy/numpy installation is broken. Maybe you have some old BLAS/LAPACK packages somewhere.

      You need to verify your installation. It would help if you provided your numpy version and how you installed it.
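A minimal sketch of how the installation could be checked, assuming the crash happens in the linear algebra step that follows the printing of Sw and Sb. LDA is commonly computed from an eigendecomposition of inv(Sw)·Sb, so the sketch runs that operation on small random matrices. That this matches what lda.py does internally is an assumption, and `lapack_smoke_test` is a hypothetical name. If the NumPy build or its BLAS/LAPACK backend is broken for this CPU, the script would be expected to crash in the same way instead of printing a result.

```python
import numpy as np


def lapack_smoke_test(dim=8, seed=0):
    """Invert a matrix and run an eigendecomposition on small random inputs."""
    rng = np.random.RandomState(seed)
    a = rng.randn(dim, dim)
    b = rng.randn(dim, dim)
    # Stand-ins for the scatter matrices: sw is symmetric positive definite,
    # sb is symmetric positive semi-definite. They are NOT real training data.
    sw = a.dot(a.T) + dim * np.eye(dim)
    sb = b.dot(b.T)
    # Same kind of LAPACK-backed calls an LDA estimate typically needs.
    evals, evecs = np.linalg.eig(np.linalg.inv(sw).dot(sb))
    return evals, evecs


if __name__ == "__main__":
    print("numpy version: %s" % np.__version__)
    try:
        import scipy
        print("scipy version: %s" % scipy.__version__)
    except ImportError:
        print("scipy is not importable")
    evals, _ = lapack_smoke_test()
    print("eigendecomposition finished, %d eigenvalues" % len(evals))
```

The version lines it prints are also the kind of detail requested above.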

       
  • Orest

    Orest - 2015-07-06

    Thanks for the help, Nickolay. A dist-upgrade to the latest Debian solved the problem, so I'm not sure what exactly the issue was.

     
  • virginia

    virginia - 2016-01-20

    I also have this issue.

    lda_train.pl makes the call to run lda.py:

       my $rv = RunTool(catfile($ST::CFG_SPHINXTRAIN_DIR, 'python', 'cmusphinx', 'lda.py'),
                        $logfile, 0,
                        $ldafile, @bwaccumdirs);
    

    These do not look like arguments to me.

    So when lda.py checks the arguments:
    if len(sys.argv) < 3:
        sys.stderr.write("Usage: %s OUTFILE ACCUMDIRS...\n" % (sys.argv[0]))
        sys.exit(1)

    There are none, so the file exits, and the call:

    makelda(gauden)

    is never made.

    I can't find any information on what the correct arguments or their order should be for lda.py. Maybe they are encoded somewhere in a python or perl helper file. I don't work with these two languages, and I may just not be seeing them.

    I would really appreciate someone explaining to me what is going on here.

    Thanks, V.

     
    • Nickolay V. Shmyrev

      This piece of code is correct: $ldafile and @bwaccumdirs are properly passed to Python.

      You can try running Python from the command line. Most likely the failure is because scipy is not installed as described in the documentation:

      http://cmusphinx.sourceforge.net/wiki/ldamllt

      Overall, we recommend using Linux.
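To illustrate how the arguments arrive, here is a minimal sketch of the usage check quoted from lda.py, applied to an argument list like the one the Perl call builds: the script path first, then $ldafile, then each entry of @bwaccumdirs. `parse_lda_args` and the file names are hypothetical, and the rest of lda.py is not reproduced.

```python
def parse_lda_args(argv):
    """Return (outfile, accumdirs), or None when too few arguments are given."""
    # argv[0] is the script name, argv[1] is OUTFILE, argv[2:] are ACCUMDIRS.
    if len(argv) < 3:
        return None
    return argv[1], argv[2:]


if __name__ == "__main__":
    # Hypothetical argument lists, for illustration only.
    print(parse_lda_args(["lda.py"]))
    print(parse_lda_args(["lda.py", "out.lda", "bwaccumdir/LDA2_buff_1"]))
```

With one output file and at least one accumulator directory, the list has three or more entries and the usage branch is skipped.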

       
  • Rati Skhirtladze

    I have the same problem:
    "ERROR: lda.py failed to create LDA transform with status 0"

    I checked numpy and scipy; both are OK.
    I tried to run lda.py from cmd, but it does not run:
    Unable to create process using 'C:\Users\Rati Skhirtladze\AppData\Local\Programs\Python\Python38-32\python.exe "C:\Sphinx\sphinxtrain\python\cmusphinx\lda.py" '

    I think the issue is that I am running lda.py on Windows. I tried to install Debian as Orest mentioned, but without success:
    "WslRegisterDistribution failed with error: 0x8007019e
    The Windows Subsystem for Linux optional component is not enabled. Please enable it and try again."

    Your advice would be much appreciated.

     
    • Nickolay V. Shmyrev

      You need to install Linux (not WSL, but real Linux); it's kinda hopeless to make everything work on Windows.

      You should also look into Kaldi; cmusphinx is kinda outdated.

       
  • Rati Skhirtladze

    Thanks for the advice.

     
