Hi,
I'm trying to build the acoustic model for the AN4 database (targeted for PocketSphinx) as per the tutorial provided at "http://cmusphinx.sourceforge.net/html/tutorial.html". Everything seems to be going fine until the point where make_s2_models.pl is run. At that point I'm getting a FATAL ERROR from mk_s2sendump.c.
End part of an4.html is pasted below.
#####################################################################
MODULE: 90 deleted interpolation (2010-07-07 17:36)
Phase 1: Cleaning up directories: logs...
Phase 2: Doing interpolation...
delint Log File
WARNING: This step had 0 ERROR messages and 6 WARNING messages. Please check
the log file for details.
completed
Phase 3: Dumping senones for PocketSphinx...
mk_s2sendump Log File
completed
MODULE: 99 Convert to Sphinx2 format models (2010-07-07 17:36)
Phase 1: Cleaning up old log files...
Phase 2: Copy noise dictionary
Phase 3: Make codebooks
Log File
mk_s2cb Log File
completed
Phase 4: Make chmm files
mk_s2hmm Log File
completed
Phase 5: Make senone file
Log File
mk_s2sendump Log File
FATAL_ERROR: "........\src\programs\mk_s2sendump\mk_s2sendump.c", line 199:
States(3) != 5
FAILED
#######################################################################
Note: My platform is Windows 7, the SphinxTrain and AN4 tarballs were obtained from the links provided in the above-mentioned tutorial, the Little Endian database for AN4 was selected, and Microsoft Visual C++ 2008 Express was used for compiling SphinxTrain.
What could be causing this behavior?
Thanks,
The tutorial needs a little updating, but in short: **you can skip this step.**
Hi NS,
Thanks,
Few more:
After creating the AN4 acoustic model, I'd like to use it to decode utterances provided in the AN4 database with my regular pocketsphinx_batch setup. Which directory should I take that will have all the HMM parameters (i.e. the directory to be used for the "-hmm" argument in pocketsphinx_batch)?
The SphinxTrain tutorial talks of a filler dictionary along with the regular dictionary, for training as well as for decoding. In my experience with pocketsphinx so far, I'm used to giving only one dictionary file (the argument for "-dict" in pocketsphinx_batch). How do I provide the filler dictionary to pocketsphinx_batch?
Thanks,
Should I not run "\99.make_s2_models\make_s2_models.pl" at all, or do I need to mask off parts within this script?

Yes, just don't run it.

After creating the AN4 acoustic model, I'd like to use it to decode utterances provided in the AN4 database with my regular pocketsphinx_batch setup. Which directory should I take that will have all the HMM parameters (i.e. the directory to be used for the "-hmm" argument in pocketsphinx_batch)? Note: I used the an4.cd_semi_1000 hmm directory and was successful in running pocketsphinx_batch with the language model provided with AN4. I tried decoding some of the test files provided. Accuracy wasn't too good, but the flow worked ;-) Am I picking the right model directory? There are two more directories that have the same or later timestamps (an4.cd_semi_1000-delinterp and an4.cd_semi_1000.s2models....).

You picked the right one.

The SphinxTrain tutorial talks of a filler dictionary along with the regular dictionary, for training as well as for decoding. In my experience with pocketsphinx so far, I'm used to giving only one dictionary file (the argument for "-dict" in pocketsphinx_batch). How do I provide the filler dictionary to pocketsphinx_batch? Note: In the above-mentioned decoder run, I didn't bother to use the filler dictionary. I hope that is OK.

You can provide the filler dictionary with the -fdict option, but actually you shouldn't worry about that. The filler dictionary is automatically placed inside the model (an4.cd_semi_1000/noisedict) and automatically loaded when you provide the model with the -hmm option.
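So a batch decoding run needs only the model directory and the regular dictionary. As a sketch only (every file name below is a placeholder from a hypothetical setup, not a canonical AN4 name):

```shell
# Sketch of a pocketsphinx_batch invocation; all paths are placeholders.
# The noisedict inside the -hmm directory supplies the fillers,
# so no separate -fdict argument is needed.
pocketsphinx_batch \
    -hmm an4.cd_semi_1000 \
    -lm an4.lm.DMP \
    -dict an4.dic \
    -ctl an4_test.fileids \
    -cepdir feat \
    -hyp an4.hyp
```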
Thanks so much NS.
I have another related question:
Suppose I want to create an acoustic model which caters to only a "single person". Will it be OK to train the model with, let's say, 100 sentences spoken by that person? These are short command-and-control type sentences, and the assumption is that "only that person" will use the system and that he/she will only use sentences out of these 100. Will SphinxTrain be able to train the models with the small amount of data in the above scenario?

It's better to adapt a generic model to the specific person in that case. You can't train anything good with 100 sentences.

Can I extend the same thing to cater to, say, 4 persons? That means I'll train the model with 100 sentences spoken by all 4 users, and only they will use the system, by speaking any of these sentences.

Again, this is a case where it's better to use generic model adaptation.
Thanks NS,
I was coming more from the model-size point of view ("adapted generic model" vs. "newly trained, user-specific, limited-vocabulary task model"), but from your comments it looks like the generic one will be much better in accuracy. Thanks for the comments again.
Hi NS,
Just to get a feel for creating an acoustic model, I went ahead and started training one for myself (based on the above-mentioned 100 utterances). This went fine up to the point where "Baum Welch" started. Then the .exe stopped with the Windows message "bw.exe has stopped working".
Could this be because of non-convergence of the algorithm due to the small amount of data, or is something else missing here?
The log file is uploaded at "http://www.mediafire.com/file/wrhteeoxzym/an4.html".
PS: Just to make it work, I also tried to inflate the data by increasing the number of utterances, simply by duplicating the files (and making suitable adjustments in the .fileids and .transcription files) to make the program think that the data had increased (there was no good logic in doing this; I just wanted to see if it made any difference....).
Thanks,
Hi NS,
Tried another run, this time with 100 sentences each from 4 speakers (totalling 0.27 hours of recording). "Baum Welch" failed in iteration 1 in exactly the same way as before (log available at http://www.mediafire.com/file/jizn2xjnwmt/an4.html).
My data is recorded at 16 kHz and is mono audio. Is it insufficient data, or something else?
Regards,
In order to find the reason for your problem you need to check the training logs for the corresponding steps and for earlier steps. Training logs are located in the logdir folder.
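A quick way to triage those logs is to search all of them for error markers at once; a tiny helper like the one below (a hypothetical convenience function, not a SphinxTrain tool) saves opening each file by hand:

```shell
# find_failures DIR: print every line mentioning FATAL or ERROR
# in the log files under DIR, with file name and line number.
find_failures() {
    grep -rn -E "FATAL|ERROR" "$1" || echo "no errors found in $1"
}
# Example: find_failures logdir
```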
Hi NS,
Thanks.
I looked into the logdir directory and tried comparing it with a working an4 training run directory. Here is what I'm seeing:
In my run, I see only two directories created, "05.vector_quantize" and "20.ci_hmm". "05.vector_quantize" looks OK, whereas "20.ci_hmm" has only 4 files:
"an4.makeflat_cihmm.log" ------- looks OK
"an4.make_ci_mdef_fromphonelist.log" ------- looks OK
"an4.1.1-1.bw.log" ------- Doesn't look OK. It terminates abruptly.
"an4.1.1.norm.log" ------- Doesn't look OK. Has the error message "Only 0 parts of 1 of Baum Welch were successfully completed. Parts 1 failed to run!"
Is there something wrong with the format of my .dic, .filler, .phone, .fileids or .transcription files, due to which "an4.1.1-1.bw.log" shows an abrupt termination?
I'm putting some relevant directories of my database at "http://www.mediafire.com/file/t54egzdneid/an4.zip".
Regards,
Your source files are crazy. They are full of Windows-style newlines, empty lines in the dictionary (you are the first who did that), and spaces after phones at the ends of lines. You have two ways to solve this problem:
1) Clean up all whitespace and make all input files have the proper format
2) Download and use the latest SphinxTrain from svn/snapshot. This latest version is more tolerant of whitespace.
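For option 1), the cleanup can be scripted. A small sketch (the function name and the .clean suffix are made up here; adjust to taste) that strips carriage returns, trailing whitespace, and blank lines from a dictionary or transcription file:

```shell
# normalize FILE: write a cleaned copy to FILE.clean with
# Windows CR characters removed, trailing whitespace stripped,
# and blank lines deleted (the exact problems named above).
normalize() {
    tr -d '\r' < "$1" | sed -e 's/[[:space:]]*$//' -e '/^$/d' > "$1.clean"
}
# Example: normalize etc/an4.dic && mv etc/an4.dic.clean etc/an4.dic
```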
Hi NS,
Thanks for the pointers.
As far as the empty lines in the dictionary are concerned, I had put them at places where I had changed pronunciations generated by lmtool or added new pronunciations (I didn't have a way of putting comments there). Since this dictionary works perfectly fine with PocketSphinx, I never really suspected that it could be a problem with SphinxTrain!
I'll do the cleanup and try option 1) suggested by you.
Thanks again,
Hi NS,
I did clean up the setup files (DOS to Unix) and am now able to successfully run the training session. Thanks.
For the 4-person case the acoustic model gives excellent average accuracy when the training set is used as the test set; however, when the acoustic model is trained for only one person (100-odd utterances), accuracy takes a beating even when the training set is used as the test set. Insufficient training data, I suppose!
I remember having seen some write-up on rules of thumb for selecting the number of senones according to the amount of training data, but am not able to locate it now. Could you please point me to the relevant link?
Where can I find the most up-to-date write-up on acoustic model adaptation?
Thanks and regards,
Hi,
You should run the Perl scripts for all the directories in scripts_pl. The file named slave*.pl in every directory should be run with Perl. If you don't have that file, you will have a file named .pl......... Run that .pl file; it will create new files in the directories. These are the feature files that SphinxTrain needs to train the acoustic model.
Hi ramsdoe,
I didn't exactly understand the explanation provided by you. As I mentioned in my post, I've been able to use SphinxTrain successfully for training my acoustic models. What I'm looking for is:
A write-up/tutorial on the procedure for acoustic model adaptation.
Any write-up which describes deciding the number of senones based on the amount of training data (I had seen such a document but am not able to locate it now).
Regards,
Hello,
See http://cmusphinx.sourceforge.net/wiki/acousticmodeladaptation, but note that page will be updated soon. The core idea is that you can combine MLLR with MAP to get the best adaptation results.
See http://cmusphinx.sourceforge.net/wiki/tutorialam#configure_model_type_and_model_parameters
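The senone count that page discusses is set in etc/sphinx_train.cfg. As an illustration only (the value below is a made-up example, not a recommendation; pick it from the data-size table on that page):

```perl
# Fragment of etc/sphinx_train.cfg -- illustrative value only;
# choose the senone count from the training-data-size table in the tutorial.
$CFG_N_TIED_STATES = 200;    # number of senones (tied states)
```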
Thanks NS,
Looks like the link for acoustic model training, "http://cmusphinx.sourceforge.net/html/tutorial.html", is now redirecting to "http://cmusphinx.sourceforge.net/wiki/tutorialam". Does the old document still exist somewhere? It had a very nice and informative appendix for starters.
No. There was nothing important in it that is missing from the new document.