
Fatal Error with SphinxTrain and AN4 database

Help
creative64
2010-07-07
2012-09-22
  • creative64

    creative64 - 2010-07-07

    Hi,

    I'm trying to build the acoustic model for the AN4 database (targeted at
    PocketSphinx) as per the tutorial provided
    at "http://cmusphinx.sourceforge.net/html/tutorial.html". Everything seems to go fine until
    make_s2_models.pl is run. At that point I get a FATAL ERROR from
    mk_s2sendump.c.

    The end of an4.html is pasted below.

    #####################################################################

    MODULE: 90 deleted interpolation (2010-07-07 17:36)
    Phase 1: Cleaning up directories: logs...

    Phase 2: Doing interpolation...

    delint Log File

    WARNING: This step had 0 ERROR messages and 6 WARNING messages. Please check
    the log file for details.

    completed
    Phase 3: Dumping senones for PocketSphinx...

    mk_s2sendump Log File

    completed

    MODULE: 99 Convert to Sphinx2 format models (2010-07-07 17:36)
    Phase 1: Cleaning up old log files...

    Phase 2: Copy noise dictionary

    Phase 3: Make codebooks

    Log File
    mk_s2cb Log File

    completed
    Phase 4: Make chmm files

    mk_s2hmm Log File

    completed
    Phase 5: Make senone file

    Log File
    mk_s2sendump Log File

    FATAL_ERROR: "........\src\programs\mk_s2sendump\mk_s2sendump.c", line 199:

    States(3) != 5

    FAILED

    #######################################################################

    Note: My platform is Windows 7. The SphinxTrain and AN4 tarballs were obtained
    from the links provided in the above-mentioned tutorial, the little-endian
    database for AN4 was selected, and Microsoft Visual C++ 2008 Express
    was used to compile SphinxTrain.

    What could be causing this behavior?

    Thanks,

     
  • Nickolay V. Shmyrev

    The tutorial needs a little update, but in short:

    **you can skip this step**

     
  • creative64

    creative64 - 2010-07-08

    Hi NS,

    Thanks,

    1. Should I not run "\99.make_s2_models\make_s2_models.pl" at all, or do I need to mask off parts of this script?

    A few more:

    2. After creating the AN4 acoustic model, I'd like to use it to decode utterances provided in the AN4 database with my
      regular pocketsphinx_batch setup. Which directory should I take that has
      all the HMM parameters (i.e. the path to
      use for the "-hmm" argument in pocketsphinx_batch)?

    3. The SphinxTrain tutorial talks of a filler dictionary along with the regular dictionary, for training as well as for decoding.
      In my experience with pocketsphinx so far, I'm used to giving only one
      dictionary file (the argument for "-dict" in
      pocketsphinx_batch). How do I provide a filler dictionary to pocketsphinx_batch?

    Thanks,

     
  • creative64

    creative64 - 2010-07-08
    1. Should I not run "\99.make_s2_models\make_s2_models.pl" at all, or do I need to mask off parts of this script?

    2. After creating the AN4 acoustic model, I'd like to use it to decode utterances provided in the AN4 database with my
      regular pocketsphinx_batch setup. Which directory should I take that has
      all the HMM parameters (i.e. the path to
      use for the "-hmm" argument in pocketsphinx_batch)?

    Note: I used the an4.cd_semi_1000 HMM directory and was successful in running
    pocketsphinx_batch with the language model
    provided with AN4. I tried decoding some of the test files provided. Accuracy
    wasn't too good, but the flow worked ;-)

    Am I picking the right model directory? There are 2 more directories that have
    the same or later timestamps
    (an4.cd_semi_1000-delinterp and an4.cd_semi_1000.s2models....)

    3. The SphinxTrain tutorial talks of a filler dictionary along with the regular dictionary, for training as well as for decoding.
      In my experience with pocketsphinx so far, I'm used to giving only one
      dictionary file (the argument for "-dict" in
      pocketsphinx_batch). How do I provide a filler dictionary to pocketsphinx_batch?

    Note: In the above-mentioned decoder run, I didn't bother to use the filler
    dictionary. I hope that is OK.

    Thanks,

     
  • Nickolay V. Shmyrev

    1. Should I not run "\99.make_s2_models\make_s2_models.pl" at all, or do I need
      to mask off parts of this script?

    Yes, just don't run it.

    2. After creating the AN4 acoustic model, I'd like to use it to decode
      utterances provided in the AN4 database with my regular pocketsphinx_batch setup.
      Which directory should I take that has all the HMM parameters (i.e. the
      path to use for the "-hmm" argument in pocketsphinx_batch)? Note: I used the
      an4.cd_semi_1000 HMM directory and was successful in running pocketsphinx_batch
      with the language model provided with AN4. I tried decoding some of the test files
      provided. Accuracy wasn't too good, but the flow worked ;-) Am I picking the
      right model directory? There are 2 more directories that have the same or
      later timestamps (an4.cd_semi_1000-delinterp and
      an4.cd_semi_1000.s2models....)

    You picked the right one.

    3. The SphinxTrain tutorial talks of a filler dictionary along with the regular
      dictionary, for training as well as for decoding. In my experience with
      pocketsphinx so far, I'm used to giving only one dictionary file (the argument
      for "-dict" in pocketsphinx_batch). How do I provide a filler dictionary to
      pocketsphinx_batch?

    You can provide the filler dictionary with the -fdict option, but you
    shouldn't need to worry about that. The filler dictionary is automatically placed
    inside the model (an4.cd_semi_1000/noisedict) and automatically loaded when you
    provide the model with the -hmm option.
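
A minimal Python sketch of the point above, assuming only the three options named in this thread (-hmm, -dict, -fdict). The helper name `batch_model_args` is hypothetical, and a real pocketsphinx_batch call needs further options (which utterances to decode, the language model, where to write results) that are left out here. It adds -fdict only when a filler dictionary is given explicitly, and otherwise checks that the model directory contains the `noisedict` file mentioned above.

```python
from pathlib import Path


def batch_model_args(model_dir, dictionary, filler_dict=None):
    """Build the model-related part of a pocketsphinx_batch command line.

    Hypothetical helper: it only covers -hmm, -dict and -fdict.
    """
    model = Path(model_dir)
    args = ["pocketsphinx_batch", "-hmm", str(model), "-dict", str(dictionary)]
    if filler_dict is not None:
        # Explicit filler dictionary requested by the caller.
        args += ["-fdict", str(filler_dict)]
    elif not (model / "noisedict").is_file():
        # Without -fdict, the filler dictionary is expected inside the model.
        raise FileNotFoundError(f"no noisedict found in {model}")
    return args
```

For example, `batch_model_args("an4.cd_semi_1000", "an4.dic")` returns the argument list without -fdict as long as `an4.cd_semi_1000/noisedict` exists; the dictionary file name here is only a placeholder.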

     
  • creative64

    creative64 - 2010-07-08

    Thanks so much NS.

    I have another related question:

    1. Suppose I want to create an acoustic model which caters to only a "single person". Will it be OK to train the model with,
      let's say, 100 sentences spoken by that person? These are short
      command-and-control type sentences, and the
      assumption is that "only that person" will use the system and that he/she will
      only use sentences out of these 100.

    Will SphinxTrain be able to train the models with the small amount of data
    available in the above scenario?

    2. Can I extend the same thing to cater to, say, 4 persons? That means I'll train the model with 100 sentences
      spoken by all 4 users, and only they will use the system by speaking any of
      these sentences.

    Thanks,

     
  • Nickolay V. Shmyrev

    1. Suppose I want to create an acoustic model which caters to only a
      "single person". Will it be OK to train the model with, let's say, 100 sentences
      spoken by that person? These are short command-and-control type sentences,
      and the assumption is that "only that person" will use the system and that
      he/she will only use sentences out of these 100. Will
      SphinxTrain be able to train the models with the small amount of data available
      in the above scenario?

    It's better to adapt a generic model to the specific person in that case. You
    can't train anything good with 100 sentences.

    2. Can I extend the same thing to cater to, say, 4 persons? That means
      I'll train the model with 100 sentences spoken by all 4 users, and only they
      will use the system by speaking any of these sentences.

    Again, this is a case where it's better to use generic model adaptation.

     
  • creative64

    creative64 - 2010-07-09

    Thanks NS,

    I was coming at this more from the model size point of view (an "adapted generic
    model" vs a "newly trained, user-specific model for a limited-vocabulary task"),
    but from your comments it looks like the generic one will be much better in accuracy.
    Thanks again for the comments.

     
  • creative64

    creative64 - 2010-07-09

    Hi NS,

    Just to get a feel for creating an acoustic model, I went ahead and started
    training one for myself (based on the above-mentioned
    100 utterances). This went fine up to the point where "Baum Welch" started.
    Then the .exe stopped with the Windows message
    "bw.exe has stopped working".

    Could this be because of non-convergence of the algorithm due to the small amount
    of data, or is something else missing here?

    The log file is uploaded at "http://www.mediafire.com/file/wrhteeoxzym/an4.html"

    PS: Just to make it work, I also tried to inflate the data by increasing the
    number of utterances simply by duplicating the files (and making suitable
    adjustments in the .fileids and .transcription files) to make the program think
    that the data had
    increased (there was no good logic in doing this; I just wanted to see if it made
    any difference....)

    Thanks,

     
  • creative64

    creative64 - 2010-07-10

    Hi NS,

    I tried another run, this time with 100 sentences each from 4 speakers
    (totalling 0.27 hours of recording). "Baum Welch"
    failed in iteration 1 exactly the same way as before (log available at
    http://www.mediafire.com/file/jizn2xjnwmt/an4.html).

    My data is recorded at 16 kHz and is mono audio. Is it insufficient data or
    something else?

    Regards,

     
  • Nickolay V. Shmyrev

    To find the cause of your problem, you need to check the training logs
    for the corresponding step and for earlier steps. Training logs are located in
    the logdir folder.
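
One way to do that check is sketched below in Python. It assumes the logs are `*.log` files in per-step subfolders of logdir (such as `20.ci_hmm`) and that problems are marked with the `FATAL_ERROR`, `ERROR` and `WARNING` prefixes seen in the an4.html excerpt earlier in this thread; the function names are hypothetical, and real logs may flag problems differently.

```python
from pathlib import Path

# Markers assumed from the an4.html excerpt; FATAL_ERROR is checked first
# because it also contains the substring "ERROR".
MARKERS = ("FATAL_ERROR", "ERROR", "WARNING")


def classify(line):
    """Return the first marker found in the line, or None."""
    for marker in MARKERS:
        if marker in line:
            return marker
    return None


def scan_logs(logdir):
    """Yield (log path, line number, marker, text) for flagged lines.

    Paths are sorted, so numbered step folders are visited in step order.
    """
    for log in sorted(Path(logdir).rglob("*.log")):
        with log.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                marker = classify(line)
                if marker:
                    yield log, number, marker, line.rstrip()


if __name__ == "__main__":
    for log, number, marker, text in scan_logs("logdir"):
        print(f"{log}:{number}: {text}")
```

A log that simply stops without an error line will not be flagged by this scan, so the last lines of the log for the failing step still need to be read by hand.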

     
  • creative64

    creative64 - 2010-07-12

    Hi NS,

    Thanks.

    I looked into the logdir directory and compared it with that of the working an4
    training run. Here is what I'm seeing:

    • In my run, I see only two directories created: "05.vector_quantize" and "20.ci_hmm".
    • "05.vector_quantize" looks OK, whereas "20.ci_hmm" has only 4 files:
      "an4.makeflat_cihmm.log" ------- looks OK
      "an4.make_ci_mdef_fromphonelist.log" ------- looks OK
      "an4.1.1-1.bw.log" ------- Doesn't look OK. It terminates abruptly.
      "an4.1.1.norm.log" ------- Doesn't look OK. It has the error message "Only 0
      parts of 1 of Baum
      Welch were successfully completed Parts 1 failed to run!"
    • Is there something wrong with the format of my .dic, .filler, .phone, .fileids or .transcription files that causes
      "an4.1.1-1.bw.log" to terminate abruptly?

    I'm putting some relevant directories of my database at
    "http://www.mediafire.com/file/t54egzdneid/an4.zip".

    Regards,

     
  • Nickolay V. Shmyrev

    Your source files are crazy. They are full of Windows-style newlines, empty
    lines in the dictionary (you are the first who did that), and spaces after phones
    at the end of lines. You have two ways to solve this problem:

    1) Clean up all whitespace and make all input files have the proper format
    2) Download and use the latest SphinxTrain from svn/snapshot. This latest version is
    more tolerant of whitespace.
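
A minimal Python sketch of option 1, assuming the input files are UTF-8 (or plain ASCII) text. It converts Windows-style newlines to Unix ones, strips trailing whitespace and drops empty lines; the function names are hypothetical, and it does not check anything else about the format of the files.

```python
from pathlib import Path


def clean_text(raw):
    """Return raw with Unix newlines, no trailing whitespace, no blank lines."""
    lines = raw.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    kept = [line.rstrip() for line in lines if line.strip()]
    return "\n".join(kept) + "\n" if kept else ""


def clean_file(path):
    """Rewrite the file in place if cleaning changes it; return True if changed."""
    path = Path(path)
    # Bytes are used for I/O so that no newline translation happens on Windows.
    original = path.read_bytes().decode("utf-8")
    cleaned = clean_text(original)
    if cleaned == original:
        return False
    path.write_bytes(cleaned.encode("utf-8"))
    return True
```

It would be run on each of the .dic, .filler, .phone, .fileids and .transcription files, ideally on a copy of the originals.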

     
  • creative64

    creative64 - 2010-07-13

    Hi NS,

    Thanks for the pointers.

    As far as the empty lines in the dictionary are concerned, I had put them at places
    where I had
    changed pronunciations generated by lmtool or added new pronunciations (I
    didn't have a way of putting comments
    there). Since this dictionary works perfectly fine with PocketSphinx, I never
    really suspected that it could be a problem
    with SphinxTrain!

    I'll do the cleanup and try option 1) as you suggested.

    Thanks again,

     
  • creative64

    creative64 - 2010-07-23

    Hi NS,

    I cleaned up the setup files (DOS to Unix) and am now able to run the
    training session successfully. Thanks.
    For the 4-person case, the acoustic model gives excellent average accuracy when
    the training set is used as the test set. However,
    when the acoustic model is trained for only one person (100-odd utterances),
    accuracy takes a beating even when the training set is
    used as the test set. Insufficient training data, I suppose!

    1. I remember having seen a write-up on rules of thumb for selecting the number of senones according to the amount of training
      data, but I am not able to locate it now. Could you please point me to the
      relevant link?

    2. Where can I find the most up-to-date write-up on acoustic model adaptation?

    Thanks and regards,

     
  • rams

    rams - 2010-07-23

    hi
    You should run the Perl scripts for all the documents in scripts_pl. The file
    named slave*.pl in every directory should be run with perl. If you don't have that
    file, you will have a file named .pl......... Run the .pl file and it will create
    new files in the directories. These are the feature files that
    SphinxTrain needs to train the acoustic model.........

     
  • creative64

    creative64 - 2010-07-24

    Hi ramsdoe,

    I didn't quite understand the explanation you provided..... As I
    mentioned in my post, I've been able to use SphinxTrain successfully for
    training my acoustic models. What I'm looking for is:

    1. A write-up/tutorial on the procedure for acoustic model adaptation.

    2. Any write-up which describes how to decide the number of senones based on the amount of training data (I had seen such a document
      but am not able to locate it now).

    Regards,

     
  • Nickolay V. Shmyrev

    Does the old document still exist somewhere?

    No.

    It had a very nice and informative appendix for starters.

    There was nothing important in it that is missing from the new document.

     
