Sphinx Training Problem

2008-12-17
2012-09-22
  • Sumeet Khullar

    Sumeet Khullar - 2008-12-17

    Hi,

    I am using some recorded wav files with SphinxTrain to train Sphinx for a limited-vocabulary engine. When I run the vector quantization script, I get a lot of warnings (1203 in total) in my log, such as these:

    INFO: main.c(572): -> Aborting k-means, bad initialization
    INFO: kmeans.c(153): km iter [0] 1.000000e+00 ...
    WARNING: "kmeans.c", line 431: Empty cluster 109
    WARNING: "kmeans.c", line 431: Empty cluster 140
    WARNING: "kmeans.c", line 431: Empty cluster 175
    WARNING: "kmeans.c", line 431: Empty cluster 194
    WARNING: "kmeans.c", line 431: Empty cluster 202
    WARNING: "kmeans.c", line 431: Empty cluster 227
    ....
    ....

    and the following errors

    ERROR: "main.c", line 800: Too few observations for kmeans
    ERROR: "main.c", line 1363: Unable to do k-means for state 0; skipping...

    From information on the web, I think the problem is with my audio data. I have 539 frames, so the number of frames should not be the problem.

    Do I need to remove the headers from my wav files using some software? Any other troubleshooting ideas?

    Thanks!

     
    • Nickolay V. Shmyrev

      SphinxTrain determines the amount of data from the size of the extracted features, not from the size of the audio. Most probably you extracted the features incorrectly. For 8 kHz audio you need to edit ./scripts_pl/make_feats.pl and change parameters such as the upper frequency and the number of filters.
      Btw, a reasonable amount of audio starts from one hour, not from one minute. Also, please don't forget to set a smaller number of senones for training, something around 200 instead of 1000.
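
Before changing the feature extraction settings, it can help to confirm what the wav headers actually declare. The following is a minimal sketch using Python's standard `wave` module, not part of SphinxTrain. The helper names `write_silence` and `describe_wav` are hypothetical, and the figure of 100 feature frames per second is an assumption that should be checked against your own front-end configuration.

```python
import os
import tempfile
import wave

# Assumption: the front end produces 100 feature frames per second.
FRAME_RATE = 100


def write_silence(path, seconds, sample_rate):
    """Write a mono 16-bit PCM wav file of silence (used only for the demo)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


def describe_wav(path):
    """Return the header facts that matter when checking training audio."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        duration_s = wav.getnframes() / sample_rate
        return {
            "sample_rate_hz": sample_rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": duration_s,
            "approx_feature_frames": int(duration_s * FRAME_RATE),
        }


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo_path, seconds=2, sample_rate=8000)
    for key, value in describe_wav(demo_path).items():
        print(key, value)
```

If the duration reported here differs greatly from the total that the training scripts report, the mismatch is more likely in the feature extraction step than in the recordings themselves.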

       
    • Nickolay V. Shmyrev

      It says you do not have enough training data for the models you are trying to train.

       
    • suresh chandra sekaran

      This error may occur if there are phones for which the training data does not contain even a single word to model them.
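
One way to check for such phones is to count how often each phone occurs in the training transcripts. The sketch below is illustrative only. It assumes dictionary lines of the form `WORD PH1 PH2 ...` and transcript lines of the form `<s> WORD WORD </s> (utt_id)`. The function names and the sample data are hypothetical.

```python
from collections import Counter


def load_dictionary(lines):
    """Map each word to its list of phones."""
    lexicon = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            lexicon[parts[0]] = parts[1:]
    return lexicon


def phone_counts(transcript_lines, lexicon):
    """Count phone occurrences over all transcript words found in the lexicon."""
    counts = Counter()
    for line in transcript_lines:
        for token in line.split():
            # Skip sentence markers and the trailing "(utt_id)" field.
            if token in ("<s>", "</s>") or token.startswith("("):
                continue
            counts.update(lexicon.get(token, []))
    return counts


def uncovered_phones(phone_list, counts, minimum=1):
    """Return the phones that occur fewer than `minimum` times."""
    return [phone for phone in phone_list if counts[phone] < minimum]


if __name__ == "__main__":
    lexicon = load_dictionary(["ONE W AH N", "TWO T UW"])
    counts = phone_counts(["<s> ONE ONE </s> (utt_001)"], lexicon)
    print(uncovered_phones(["W", "AH", "N", "T", "UW"], counts))
```

In this toy data the word TWO never appears in the transcripts, so its phones T and UW are reported as uncovered. Raising `minimum` gives a rough view of phones that are present but rare.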

       
    • Sumeet Khullar

      Sumeet Khullar - 2008-12-18

      Thanks for your help.

      Somehow, SphinxTrain is severely underestimating the amount of audio data.

      For example, if I put a one-minute recording in my wav directory (with the appropriate transcription file in etc) and run the script verify_all.pl, it shows:

      Total Hours Training: 0.00170299145299145 (0.1 minute or 6 seconds)

      Why is this the case? I have recorded my audio files with Audacity at 8kHz / 16 bit and they are not defective.

      Thanks.
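
To make the size of the gap concrete, the reported total can be converted back to seconds and compared with the known length of the recording. This is a small arithmetic sketch only. The 100 frames per second figure is an assumption, and the function names are hypothetical.

```python
# Assumption: 100 feature frames per second.
FRAME_RATE = 100


def hours_to_seconds(hours):
    """Convert a duration in hours to seconds."""
    return hours * 3600.0


def fraction_of_expected(reported_hours, expected_seconds):
    """Return the reported duration as a fraction of the expected duration."""
    return hours_to_seconds(reported_hours) / expected_seconds


if __name__ == "__main__":
    reported_hours = 0.00170299145299145  # value printed by verify_all.pl
    seconds = hours_to_seconds(reported_hours)
    print(f"reported: {seconds:.2f} s")
    print(f"approx frames: {seconds * FRAME_RATE:.0f}")
    print(f"fraction of one minute: {fraction_of_expected(reported_hours, 60.0):.3f}")
```

The reported total works out to roughly 6.13 seconds, about a tenth of the one-minute recording. This only quantifies the discrepancy and does not by itself explain it.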

       
