sphinxtrain Training failed-all files ignored

2012-01-24
2012-09-22
  • Melvin Jose

    Melvin Jose - 2012-01-24

    Hi,

    I am working on building an LVCSR system for Tamil (an Indian language). I
    have already built an acoustic model with 8 hours of speech data. The
    training was successful, with only 41 out of 350 files ignored in the
    final iteration of Baum-Welch. The model worked fine and gave satisfactory
    results.

    Recently I collected another 5 hours of data. When I added the new data to
    the existing 8 hours and tried training on the combined 13 hours,
    SphinxTrain failed after iteration 5 of Phase 3, ignoring all 627 files.

    I have thoroughly checked all the audio files, the dictionary, the
    transcription and the fileids again and again. As far as I can tell, there
    is no problem with any of them. The wav files are also in the correct
    format.

    I have uploaded the log files of both the models for your reference.

    Model1 (8 hours): http://www.4shared.com/document/QRGWQew6/tamilcontinue-
    8gau.html

    Model2 (13 hours): http://www.4shared.com/document/PjULcutW/tamilnew.html

    I can even upload samples of my data if you need them. I don't understand
    why data that previously trained successfully is now ignored entirely
    after adding the additional data.

    I have been trying to solve this problem for a long time. Any suggestions
    and help will be greatly appreciated.

  • Nickolay V. Shmyrev

    In order for us to help solve your problem, you need to provide the
    database.

  • Melvin Jose

    Melvin Jose - 2012-01-25

    Dear Nickolay,

    I have uploaded my project folder (etc and wav) containing the new data which
    causes these problems. I have sent you the download link via the messaging
    facility available in this forum.

    Please check whether you have received the data and let me know if I need
    to provide anything more. Do you also need samples of the old data that
    trained successfully?

    Thanks,
    Melvin

  • Nickolay V. Shmyrev

    Hello

    Your data consists of long files with quite long silences between
    utterances. I think what happens is that the recognizer fails to find a
    good starting point for the initial phonetic segmentation. You definitely
    want to help it by splitting the audio into utterances. Silence inside
    each utterance should be minimal, and each utterance should include about
    0.2 seconds of silence at the beginning and at the end.
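
    The splitting step described above can be sketched with a simple
    energy-based segmenter. This is a minimal illustration, not what
    SphinxTrain itself does; the thresholds (frame length, RMS cutoff,
    minimum pause) are assumed starting points that would need tuning, and
    the input is assumed to be 16-bit mono PCM samples.

```python
# Illustrative thresholds; tune these for real recordings.
FRAME_MS = 10          # analysis frame length
SILENCE_RMS = 300      # frames quieter than this count as silence
MIN_SILENCE_MS = 500   # a pause this long separates two utterances
PAD_MS = 200           # keep ~0.2 s of silence around each utterance

def rms(samples):
    """Root-mean-square energy of a chunk of samples."""
    if not samples:
        return 0.0
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def split_utterances(samples, rate):
    """Return (start, end) sample indices of utterances in `samples`."""
    frame = rate * FRAME_MS // 1000
    speech = [rms(samples[i:i + frame]) >= SILENCE_RMS
              for i in range(0, len(samples), frame)]
    min_gap = MIN_SILENCE_MS // FRAME_MS
    pad = rate * PAD_MS // 1000
    spans, start, gap = [], None, 0
    for idx, is_speech in enumerate(speech):
        if is_speech:
            if start is None:
                start = idx      # first speech frame of a new utterance
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:   # pause long enough: close the utterance
                end = (idx - gap + 1) * frame
                spans.append((max(start * frame - pad, 0),
                              min(end + pad, len(samples))))
                start, gap = None, 0
    if start is not None:        # speech runs to the end of the file
        spans.append((max(start * frame - pad, 0), len(samples)))
    return spans
```

    Each returned span already includes the ~0.2 s of padding, so the cut
    pieces can be written out directly as one wav file per utterance.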

    The dither option must be enabled during feature extraction to help with
    the zero-energy regions you have in your data.
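
    For reference, dithering just adds roughly half a bit of random noise so
    that no region has exactly zero energy (zero-energy frames otherwise lead
    to log-of-zero problems during feature extraction). In SphinxTrain it is
    enabled by passing -dither yes to sphinx_fe. A minimal sketch of the idea
    itself, assuming 16-bit integer samples:

```python
import random

def add_dither(samples, seed=None):
    """Return a copy of `samples` with uniform +/-1 noise added, clamped to
    the 16-bit range so no sample overflows."""
    rng = random.Random(seed)
    return [max(-32768, min(32767, s + rng.choice((-1, 0, 1))))
            for s in samples]
```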

    You might want to use the long audio aligner framework recently developed
    in sphinx4 to obtain a segmentation for each of your files. You can use
    your initial 8-hour model for this.

  • Melvin Jose

    Melvin Jose - 2012-01-26

    Dear Nickolay,

    Thank you so much for your reply. I repeated the feature extraction step
    with the dither option enabled. This led to a small improvement: 163 of
    263 files were ignored, compared to 238 of 263 previously.

    I removed the long silences between the utterances in 30 files and
    repeated the training, hoping that some of those 30 files would be
    accepted. But there was no change at all; all 30 files were ignored again.

    What are your thoughts on this? Why didn't correcting those files help? I
    need your guidance in this regard.

    I will remove the silences from all the files, train again and report the
    results soon. Also, how do I use the long audio aligner? Is there a
    tutorial for it?

    Thanks,
    Melvin