CMU Sphinx / Forums / Help: lda.py failed to create LDA transform with status 0

Hi, I am trying to train an acoustic model with LDA and MLLT feature transforms, following the instructions at http://cmusphinx.sourceforge.net/wiki/ldamllt

If I set $CFG_LDA_MLLT = 'yes'; my training fails. If I leave it set to "no", the same training database completes successfully.

This is the log, using a small database for testing:

sphinxtrain run
Sphinxtrain path: /opt/sphinxtrain/lib/sphinxtrain
Sphinxtrain binaries path: /opt/sphinxtrain/libexec/sphinxtrain
Running the training
MODULE: 000 Computing feature from audio files
Extracting features from  segments starting at  (part 1 of 1)
Extracting features from  segments starting at  (part 1 of 1)
Feature extraction is done
MODULE: 00 verify training files
    Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
        Found 129265 words using 40 phones
    Phase 2: Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    Phase 4: Checking number of lines in the transcript file should match lines in fileids file
    Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
        Estimated Total Hours Training: 27.0226944444444
        Rule of thumb suggests 3000, however there is no correct answer
    Phase 6: Checking that all the words in the transcript are in the dictionary
        Words in dictionary: 129262
        Words in filler dictionary: 3
    Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
MODULE: 0000 train grapheme-to-phoneme model
Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
MODULE: 01 Train LDA transformation
    Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
        Baum welch starting for LDA, iteration: 1 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
        Normalization for iteration: 1
        Current Overall Likelihood Per Frame = -143.681494052838
        Baum welch starting for LDA, iteration: 2 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 120 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Normalization for iteration: 2
        Current Overall Likelihood Per Frame = -143.090839027346
        Convergence Ratio = 0.590655025491827
        Baum welch starting for LDA, iteration: 3 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 194 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Normalization for iteration: 3
        Current Overall Likelihood Per Frame = -141.891246586154
        Convergence Ratio = 1.19959244119201
        Baum welch starting for LDA, iteration: 4 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 226 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Normalization for iteration: 4
        Current Overall Likelihood Per Frame = -141.244086324103
        Convergence Ratio = 0.64716026205096
        Baum welch starting for LDA, iteration: 5 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 354 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Normalization for iteration: 5
        Current Overall Likelihood Per Frame = -141.049595544081
        Convergence Ratio = 0.194490780022079
        Baum welch starting for LDA, iteration: 6 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 434 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Normalization for iteration: 6
        Current Overall Likelihood Per Frame = -140.854223619843
        Convergence Ratio = 0.195371924237804
        Baum welch starting for LDA, iteration: 7 (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 476 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Normalization for iteration: 7
        Current Overall Likelihood Per Frame = -140.771353617286
        Convergence Ratio = 0.0828700025566889
    Phase 4: LDA transform estimation
        Baum welch starting for LDA, iteration: N (1 of 1)
        0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
This step had 500 ERROR messages and 0 WARNING messages.  Please check the log file for details.
lda.py failed to create LDA transform with status 0

sphinxtrain hangs at that last line, and I have to press Ctrl-C to stop it.
Things I tried in order to solve the problem:

I downloaded sphinxtrain again and reinstalled it to make sure I have the latest version.
I checked that the necessary Python modules are installed. This is my output:

python
Python 2.7.3 (default, Mar 13 2014, 11:03:55) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import scipy
>>> import scipy.optimize
>>>

I tried:

sudo apt-get install python-scipy
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python-scipy is already the newest version.

I can't tell whether the issue is my installation, my sphinxtrain configuration file, or something else.

I shared my log folder and sphinx_train.cfg if that helps: http://www.filedropper.com/folderldatar

Orest - 2015-06-26

I can't seem to find the solution. Does anyone know what it could be?

- Nickolay V. Shmyrev - 2015-06-26
  
  You can run lda.py from the command line and check what is going on. If you still have trouble, you can share your whole model folder.
  

Orest - 2015-07-06

Unfortunately I can't share the whole model folder, but I would still appreciate any kind of help.

My training task for this test is called LDA2. After it failed, LDA2/logdir/01.lda_train/LDA2.lda_train.log contains only a date when opened in a text editor (for example "Mon Jul 6 12:06:03 2015"), so I don't know what parameters I'm supposed to use. I gave it a try anyway:

I installed sphinxtrain using --prefix=/opt/sphinxtrain

so if I run lda.py without arguments, it prints its usage:

:/opt/sphinxtrain/lib/sphinxtrain/python/cmusphinx$ ./lda.py
Usage: ./lda.py OUTFILE ACCUMDIRS...

So I assume "ACCUMDIRS" refers to the "bwaccumdir" folder created during training, and "OUTFILE" refers to an output file that lda.py creates.

My "bwaccumdir" directory contains only one folder, in this case called "LDA2_buff_1" ($CFG_NPART = 1, $CFG_QUEUE_TYPE = "Queue";)

So I tried running lda.py like this:

:/opt/sphinxtrain/lib/sphinxtrain/python/cmusphinx$ ./lda.py /home/sites/train/outputfileTest /home/sites/train/LDA2/bwaccumdir/LDA2_buff_1

and I get this output:

Sw:
[[ 2.66700592e+08 2.84648940e+07 -2.14563200e+07 ..., 2.23908725e+06 2.14067637e+04 1.63672838e+06]
 [ 2.84648940e+07 2.00562064e+08 1.03686380e+07 ..., -1.45862175e+06 7.75372312e+05 -8.24868125e+05]
 [ -2.14563200e+07 1.03686380e+07 3.06287840e+08 ..., 2.35109800e+06 -2.00122888e+06 -5.52205820e+04]
 ...,
 [ 2.23908725e+06 -1.45862175e+06 2.35109800e+06 ..., 3.29112608e+08 3.11835000e+07 -5.43095550e+06]
 [ 2.14067637e+04 7.75372312e+05 -2.00122888e+06 ..., 3.11835000e+07 2.70015584e+08 2.68993420e+07]
 [ 1.63672838e+06 -8.24868125e+05 -5.52205820e+04 ..., -5.43095550e+06 2.68993420e+07 2.26384096e+08]]
Sb:
[[ 1.88875598e+08 1.01030293e+08 1.67866634e+07 ..., 8.45230936e+05 9.28340740e+05 3.17827635e+05]
 [ 1.01030293e+08 9.77985688e+07 1.76942269e+07 ..., 1.38454926e+04 8.15402921e+05 -6.57197107e+04]
 [ 1.67866634e+07 1.76942269e+07 9.20856123e+07 ..., 1.78193998e+06 3.13945561e+04 -5.76040886e+05]
 ...,
 [ 8.45230936e+05 1.38454926e+04 1.78193998e+06 ..., 5.19614766e+05 4.21012672e+04 -7.28208964e+04]
 [ 9.28340740e+05 8.15402921e+05 3.13945561e+04 ..., 4.21012672e+04 1.26348326e+05 2.24965849e+04]
 [ 3.17827635e+05 -6.57197107e+04 -5.76040886e+05 ..., -7.28208964e+04 2.24965849e+04 1.00663826e+05]]
Illegal instruction

If I try to run lda.py from the build folder (the lda.py situated in):

/opt/sphinxtrain/lib/sphinxtrain/python/build/lib.linux-x86_64-2.7/cmusphinx

I get the same output. The same training folder completes successfully if I don't use the LDA_MLLT training option.
Does this information help narrow down the cause of the failure?
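
A manual run like the one above can be wrapped in a short script that reports how lda.py terminated, which helps tell an ordinary error exit apart from a crash such as "Illegal instruction". This is only a sketch: the helper names describe_exit and run_lda are made up for illustration and are not part of sphinxtrain. It relies on the POSIX convention that Python's subprocess module reports a child killed by signal N as return code -N, and it is written for Python 3, so pass the interpreter that lda.py itself needs through the python argument.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exited with status %d" % returncode


def run_lda(lda_py, outfile, accumdirs, python=sys.executable):
    """Run `python lda.py OUTFILE ACCUMDIRS...` and report how it ended.

    Hypothetical helper: `python` should point at the interpreter
    that lda.py is meant to run under.
    """
    cmd = [python, lda_py, outfile] + list(accumdirs)
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = proc.stdout.decode("utf-8", "replace")
    return proc.returncode, describe_exit(proc.returncode), output


if __name__ == "__main__" and len(sys.argv) >= 4:
    # Usage: python run_lda.py /path/to/lda.py OUTFILE ACCUMDIRS...
    code, summary, output = run_lda(sys.argv[1], sys.argv[2], sys.argv[3:])
    print(output)
    print("lda.py %s (return code %d)" % (summary, code))
```

A result of "killed by SIGILL" would correspond to the "Illegal instruction" message printed by the shell.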
- Nickolay V. Shmyrev - 2015-07-06
  
  Illegal instruction
  
  This is the actual error which you can google for.
  
  http://mail.scipy.org/pipermail/numpy-discussion/2013-January/065247.html
  
  suggests that your scipy/numpy installation is broken. Maybe you have some old BLAS/LAPACK packages somewhere.
  
  You need to verify your installation. You may provide more information about numpy version and the way you installed it.
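
The details requested above (library versions and where they are loaded from) can be gathered with a few lines of Python. This is a minimal sketch: describe_module is a hypothetical helper, while numpy.show_config() is NumPy's own function for printing the BLAS/LAPACK libraries it was built against. Note that a clean import does not prove the installation is sound, since in this thread the imports succeeded and the crash only appeared once lda.py did real computation.

```python
import importlib


def describe_module(name):
    """Return (version, file path) for a module, or (None, None) if it cannot be imported."""
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return None, None
    return getattr(mod, "__version__", None), getattr(mod, "__file__", None)


if __name__ == "__main__":
    for name in ("numpy", "scipy"):
        version, path = describe_module(name)
        # The path shows which of several installed copies is actually in use.
        print("%-6s version=%s path=%s" % (name, version, path))
    try:
        import numpy
        numpy.show_config()  # prints the BLAS/LAPACK build configuration
    except ImportError:
        print("numpy is not importable")
```

An unexpected path (for example a copy under /usr/local shadowing the distribution package) would be one sign of the leftover packages mentioned above.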
  

Orest - 2015-07-06

Thanks for the help, Nickolay. A dist-upgrade to the latest Debian solved the problem, so I'm not sure what exactly caused it.


virginia - 2016-01-20

I also have this issue.

lda_train.pl makes the call to run lda.py:

my $rv = RunTool(catfile($ST::CFG_SPHINXTRAIN_DIR, 'python', 'cmusphinx', 'lda.py'), $logfile, 0, $ldafile, @bwaccumdirs);

These do not look like arguments to me.

So when lda.py checks the arguments:
if len(sys.argv) < 3:
    sys.stderr.write("Usage: %s OUTFILE ACCUMDIRS...\n" % (sys.argv[0]))
    sys.exit(1)

There are none, so the file exits, and the call:

makelda(gauden)

is never made.

I can't find any information on what the correct arguments or their order should be for lda.py. Maybe they are encoded somewhere in a python or perl helper file. I don't work with these two languages, and I may just not be seeing them.

I would really appreciate someone explaining to me what is going on here.

Thanks, V.
- Nickolay V. Shmyrev - 2016-01-23
  
  This piece of code is correct: $ldafile and @bwaccumdirs are properly passed to Python.
  
  You can try running lda.py from the command line. Most likely the failure is because you do not have scipy installed as described in the documentation:
  
  http://cmusphinx.sourceforge.net/wiki/ldamllt
  
  Overall, we recommend using Linux.
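
To make the argument passing concrete: according to the reply above, $ldafile and @bwaccumdirs reach lda.py as ordinary command-line arguments, so sys.argv holds the script name, then OUTFILE, then one or more ACCUMDIRS. The sketch below mirrors the usage check quoted earlier in the thread. parse_lda_args is a hypothetical name used for illustration, not a function in lda.py, and the example paths are the ones from the manual run posted earlier.

```python
def parse_lda_args(argv):
    """Split an argv list into (outfile, accumdirs) as the usage line describes."""
    if len(argv) < 3:
        # Same condition as the quoted check: script name plus at least two arguments.
        raise SystemExit("Usage: %s OUTFILE ACCUMDIRS..." % argv[0])
    return argv[1], argv[2:]


# What sys.argv would look like for the manual run shown earlier in the thread.
example_argv = [
    "./lda.py",
    "/home/sites/train/outputfileTest",
    "/home/sites/train/LDA2/bwaccumdir/LDA2_buff_1",
]
outfile, accumdirs = parse_lda_args(example_argv)
print("OUTFILE:", outfile)
print("ACCUMDIRS:", accumdirs)
```

With fewer than two arguments after the script name, the function exits with the usage message, which matches what lda.py prints when it is run with no arguments.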
  

Rati Skhirtladze - 2020-01-05

I have the same problem:
"ERROR: lda.py failed to create LDA transform with status 0"

I checked numpy and scipy; they are OK.
I tried to run lda.py from cmd, but it does not run:
Unable to create process using 'C:\Users\Rati Skhirtladze\AppData\Local\Programs\Python\Python38-32\python.exe "C:\Sphinx\sphinxtrain\python\cmusphinx\lda.py" '

I think the issue is that I am running lda.py on Windows. I tried to install Debian as Orest mentioned, but without success:
"WslRegisterDistribution failed with error: 0x8007019e
The Windows Subsystem for Linux optional component is not enabled. Please enable it and try again."

Your advice would be much appreciated.

- Nickolay V. Shmyrev - 2020-01-05
  
  You need to install Linux (not WSL, but real Linux); it's kinda hopeless to make everything work on Windows.
  
  You should also look into Kaldi; CMUSphinx is kinda outdated.
  

Rati Skhirtladze - 2020-01-06

Thanks for the advice.
