CMU Sphinx / Forums / Help: Lda Matrix Problem

Marco - 2010-11-30

Hi,
I'm following the tutorial http://cmusphinx.sourceforge.net/wiki/LDAMLLT to train my Italian acoustic model
with LDA and MLLT feature transforms, but I get this error:

MODULE: 06 Train MLLT transformation
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...
Phase 2: Flat initialize
FATAL_ERROR: "main.c", line 98: Failed to read LDA matrix
This step had 1 ERROR messages and 0 WARNING messages. Please check the log
file for details.

I looked at the log file in the logdir:

/home/marco/SpeechRecognition/evalita/bin/init_gau \
-ctlfn /home/marco/SpeechRecognition/evalita/etc/evalita_train.fileids \
-part 1 \
-npart 1 \
-cepdir /home/marco/SpeechRecognition/evalita/feat \
-cepext mfc \
-accumdir /home/marco/SpeechRecognition/evalita/bwaccumdir/evalita_buff_1 \
-agc none \
-cmn current \
-varnorm no \
-feat 1s_c_d_dd \
-ceplen 13 \
-ldafn /home/marco/SpeechRecognition/evalita/model_parameters/evalita.lda \
-ldadim 29

-help no no
-example no no
-moddeffn
-ts2cbfn
-accumdir /home/marco/SpeechRecognition/evalita/bwaccumdir/evalita_buff_1
-meanfn
-fullvar no no
-ctlfn /home/marco/SpeechRecognition/evalita/etc/evalita_train.fileids
-nskip
-runlen
-part 1
-npart 1
-lsnfn
-dictfn
-fdictfn
-segdir
-segext v8_seg v8_seg
-scaleseg no no
-cepdir /home/marco/SpeechRecognition/evalita/feat
-cepext mfc mfc
-silcomp none none
-cmn current current
-varnorm no no
-agc max none
-feat 1s_c_d_dd 1s_c_d_dd
-svspec
-ceplen 13 13
-cepwin 0 0
-ldafn /home/marco/SpeechRecognition/evalita/model_parameters/evalita.lda
-ldadim 29 29
WARN: "s3io.c", line 256: Unable to open
/home/marco/SpeechRecognition/evalita/model_parameters/evalita.lda for
reading; No such file or directory
ERROR: "lda.c", line 63:
s3open(/home/marco/SpeechRecognition/evalita/model_parameters/evalita.lda, rb)
failed; No such file or directory
FATAL_ERROR: "main.c", line 98: Failed to read LDA matrix
Tue Nov 30 19:50:28 201

I have Python working correctly, with the numpy and scipy modules installed. I
looked at the topic https://sourceforge.net/projects/cmusphinx/forums/forum/5471/topic/3846734/index/page/1
and found that this can happen with Ubuntu.
I have Ubuntu 9.04, and I'm using Sphinx 3.0.8 and SphinxTrain 1.0.

Thanks in advance for any suggestion

Marco


Nickolay V. Shmyrev - 2010-12-02

Marco,

The matrix was not created because the previous training stage failed. You need to
check the logs of the previous training stage, not those of the last one.

I also recommend that you follow the recommendation you were already given, as
well as the recommendations in the thread you cited. I don't have much to add.
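
One way to check an earlier stage's log is to count the messages it contains by level. The following is only a minimal sketch: `count_log_errors` is a made-up helper name, not part of SphinxTrain, and the pattern is an assumption based on the message format seen in the log excerpts in this thread.

```python
import re

# Matches messages such as:
#   ERROR: "lda.c", line 63: ...
#   FATAL_ERROR: "main.c", line 98: ...
# The format is assumed from the log excerpts quoted in this thread.
_MESSAGE = re.compile(r'\b(FATAL_ERROR|ERROR|WARN(?:ING)?): "([^"]+)", line (\d+)')


def count_log_errors(log_text):
    """Return a dict mapping each message level to its number of occurrences."""
    counts = {}
    for match in _MESSAGE.finditer(log_text):
        level = match.group(1)
        counts[level] = counts.get(level, 0) + 1
    return counts


sample = '''WARN: "s3io.c", line 256: Unable to open
ERROR: "lda.c", line 63:
FATAL_ERROR: "main.c", line 98: Failed to read LDA matrix
'''
print(count_log_errors(sample))
```

To use it on a real log, read the file's contents into a string and pass that string in; a stage whose log has a non-zero ERROR or FATAL_ERROR count is the one to look at first.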


Marco - 2010-12-02

Hi Nickolay,

I have checked the log file for MODULE: 05, in which I train the LDA
transformation. Of 5862 utterances, 959 have
errors like this:

utt> 5858 noisy2984 1037 0 308 33 ERROR: "backward.c", line 431: final state
not reached

ERROR: "baum_welch.c", line 331: training_raw/noisy2984 ignored

I also get this WARNING: "mod_inv.c", line 257: n_top 8 > n_density 1. n_top
<- 1

Can you help me resolve this error?

Cheers

Marco


Nickolay V. Shmyrev - 2010-12-02

> I have checked log file related to MODULE: 05 in which I Train LDA
> transformation and of 5862 utterances i have 959 ERRORS like this utt> 5858
> noisy2984 1037 0 308 33 ERROR: "backward.c", line 431: final state not reached
> ERROR: "baum_welch.c", line 331: training_raw/noisy2984 ignored

This is a pretty standard warning that the audio you have doesn't match the
transcription.
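
To see which utterances are affected, the ignored IDs can be pulled out of the log text. This is a minimal sketch, assuming the log lines look like the ones quoted in this thread; `ignored_utterances` is a made-up helper name, not part of SphinxTrain.

```python
import re

# Pattern assumed from the quoted message:
#   ERROR: "baum_welch.c", line 331: training_raw/noisy2984 ignored
_IGNORED = re.compile(r'ERROR: "baum_welch\.c", line \d+: (\S+) ignored')


def ignored_utterances(log_text):
    """Return the utterance IDs reported as ignored, in log order."""
    return _IGNORED.findall(log_text)


sample = '''utt> 5858 noisy2984 1037 0 308 33 ERROR: "backward.c", line 431: final state not reached
ERROR: "baum_welch.c", line 331: training_raw/noisy2984 ignored
'''
print(ignored_utterances(sample))
```

Each ID in the result is a candidate for listening to the recording and comparing it with its transcription line.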

> I have also this WARNING: "mod_inv.c", line 257: n_top 8 > n_density 1.
> n_top <- 1

You can ignore this.

The real issue is that it fails to run Python. The Python script's output
should be available in

model_name.lda_train.log

and this log should contain a line

LDA training complete

If you don't have this file, it means you didn't properly copy the Python scripts
to the database folder, or there is some other issue.
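
The check on the log can be written down as a small script. This is a sketch only: the function names `lda_training_completed` and `check_lda_log` are invented for illustration, and the only thing taken from the description above is the completion line itself.

```python
from pathlib import Path


def lda_training_completed(log_text):
    """Return True if any line of the log is exactly the completion message."""
    return any(line.strip() == "LDA training complete"
               for line in log_text.splitlines())


def check_lda_log(path):
    """Classify the log at `path` as "missing", "complete" or "incomplete"."""
    log = Path(path)
    if not log.is_file():
        return "missing"
    text = log.read_text(errors="replace")
    return "complete" if lda_training_completed(text) else "incomplete"


print(lda_training_completed("LDA training complete\n"))
```

A result of "missing" corresponds to the case where the Python script was never run; "incomplete" means it started but the completion line was not written.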

Marco - 2010-12-02

Thanks for the information, Nickolay.

I have checked my file evalita.lda_train.log. The output is:

Thu Dec 2 15:37:26 2010
Thu Dec 2 15:37:26 2010
LDA training complete

It seems Python works.

So you think the problem is that the audio doesn't match the
transcription?

How can I resolve it?

Sorry for asking so many questions.

Marco


Nickolay V. Shmyrev - 2010-12-02

Do the following.

1) Run the LDA training stage.
2) Check whether the matrix file model_parameters/your_name.lda is created.

If it's not created, run

./scripts/lda_train/lda_train.pl

If the file is still not created, run the Python script manually. To get the command to
run, put the following line in the file lda_train.pl:

print join(' ', catfile($ST::CFG_BASE_DIR, 'python', 'sphinx', 'lda.py'), $logfile, 0, $ldafile, @bwaccumdirs), "\n";

Make sure that lda.py is located in training_folder/python/sphinx/lda.py and
not in training_folder/python/cmusphinx/lda.py (see difference between sphinx
and cmusphinx).
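
The location check can also be scripted. The sketch below is an illustration under stated assumptions: `find_lda_script` is a hypothetical helper, and the only paths it knows about are the two named above, relative to the training folder.

```python
import tempfile
from pathlib import Path


def find_lda_script(training_dir):
    """Report where lda.py sits relative to the training folder.

    Returns "ok" if python/sphinx/lda.py exists, "wrong_dir" if it is only
    under python/cmusphinx/, and "missing" otherwise.
    """
    base = Path(training_dir) / "python"
    if (base / "sphinx" / "lda.py").is_file():
        return "ok"
    if (base / "cmusphinx" / "lda.py").is_file():
        return "wrong_dir"
    return "missing"


# Demonstration on a throwaway directory that mimics the wrong layout.
with tempfile.TemporaryDirectory() as tmp:
    wrong = Path(tmp) / "python" / "cmusphinx"
    wrong.mkdir(parents=True)
    (wrong / "lda.py").touch()
    print(find_lda_script(tmp))
```

Pass the path of your own training folder to `find_lda_script`; anything other than "ok" means the script is not where the training stage expects it.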

Berker Batur - 2010-12-06

Hi Marco,

In SphinxTrain there is a python folder.

Go to this directory and type:

python setup.py

Then copy this python folder to your /home/marco/SpeechRecognition/evalita/ directory.
Go to /home/marco/SpeechRecognition/evalita/python/ and run:

python setup.py

Then change the permissions of the python folder:

sudo chmod -R 777 /home/marco/SpeechRecognition/evalita/python/

Now try training. I faced the same error before, then found out that
during training Sphinx looked for the python folder in my an4 directory (your
/home/marco/SpeechRecognition/evalita/ directory), not in the SphinxTrain
folder.
After making the changes above, I successfully trained with LDA & MLLT
feature transforms in Sphinx 3.
Maybe one or more of the steps above are not necessary, but that's how I solved
the problem.

Berker.
