Hello,
I have a problem with training module 2. The first iteration seems to run perfectly, but the normalization that follows reports a huge negative likelihood (end of logfile: Current Overall Likelihood Per Frame = -33794.0269704748).
So, I inspected the logfile of the first Baum-Welch iteration.
A typical utterance produces something like:
utt> 64 sr429 261 0 148 116 60 119 1.107095e-41 -5.068369e+01 -1.322844e+04 utt 0.182x 1.076e upd 0.182x 1.073e fwd 0.038x 1.045e bwd 0.144x 1.079e gau 4.007x 1.065e rsts 0.048x 1.139e rstf 0.004x 0.539e rstu -0.000x 0.000e
But the logfile reveals several strange utterances, e.g.
utt> 63 sr428 140 0 64 53 31 62 0.000000e+00 -5.280189e+01 -7.392265e+03 utt 0.106x 1.153e upd 0.106x 1.141e fwd 0.024x 1.071e bwd 0.081x 1.161e gau 1.206x 0.944e rsts 0.020x 1.782e rstf 0.001x 0.949e rstu -0.000x 0.000e
I think the 0.000000e+00 is very suspicious (compared to the e-41). Why aren't utterances that failed excluded automatically? What went wrong?
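In case someone wants to pull out all such utterances automatically, here is a rough Python sketch; the position of the likelihood field is only guessed from the two example lines above, so treat it as an assumption rather than the documented log format:

import sys

# Scan a Baum-Welch logfile for utt> lines whose (assumed) per-utterance
# likelihood field is exactly zero, and print the utterance ids.
zero_utts = []
with open(sys.argv[1]) as log:
    for line in log:
        if not line.startswith("utt>"):
            continue
        fields = line.split()
        try:
            likelihood = float(fields[9])   # e.g. 1.107095e-41 or 0.000000e+00
        except (IndexError, ValueError):
            continue
        if likelihood == 0.0:
            zero_utts.append(fields[2])     # fields[2] looks like the utterance id
print("zero-likelihood utterances:", " ".join(zero_utts))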
Thanks
Andreas
First of all, I want to thank all of you for your help!
It was Chris's suggestion that got my corpus running! (It is still running, but module 2 has now been processed successfully.)
Simply switching on dithering of the audio (I added '-dither yes' as a parameter to the wave2feat call) solved all the problems.
I also tested removing the silences in the whole corpus. This reduced the likelihood after the normalization of the first Baum-Welch pass from about -34000 to about -20000. That value was still far too large, so the second iteration failed.
Now I'm analyzing which part of the corpus caused the problems, or whether it is the corpus as a whole on average. Maybe I should look for contiguous nulls in the audio data, or take a look at the generated cepstra files.
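Just as an idea, here is a small Python sketch of how one could check for those contiguous nulls (it assumes 16-bit mono PCM WAV files, which is my assumption, not something specific to SphinxTrain):

import sys
import wave
import numpy as np

# Report the longest run of exactly-zero samples in each WAV file given
# on the command line, to spot long stretches of pure digital silence.
def longest_zero_run(path):
    with wave.open(path, "rb") as w:
        samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    longest = current = 0
    for s in samples:
        current = current + 1 if s == 0 else 0
        longest = max(longest, current)
    return longest

for path in sys.argv[1:]:
    print(path, longest_zero_run(path))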
Best regards
Andreas
Indeed.
Check the following: are there long silences in the sentences? If there are, think of a way to cut them out. Baum-Welch is easily confused by them.
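For what it's worth, a very crude way to do that is simple frame-energy gating, as in the Python sketch below. This is only an illustration, not necessarily what Arthur means, and the frame length and threshold are guesses that need tuning per corpus:

import numpy as np

# Keep only 10 ms frames whose RMS energy is above a fixed threshold;
# frames below the threshold are treated as silence and dropped.
def trim_silence(samples, rate, frame_ms=10, threshold=100.0):
    frame_len = max(1, int(rate * frame_ms / 1000))
    kept = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len].astype(np.float64)
        if np.sqrt(np.mean(frame ** 2)) >= threshold:
            kept.append(samples[start:start + frame_len])
    return np.concatenate(kept) if kept else samples[:0]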
Arthur
Hi andreas_w
I faced the same problem in my first trials.
As Arthur said, I solved the problem by trimming the silences from the audio files.
There is quite a lot of software that can do it; I myself use Audacity, which is open source and an excellent tool.
Hello Andreas,
just an idea, but have you tried using the -dither flag in the feature extraction with wave2feat? This flag adds 1/2 bit of noise to the silent parts and so prevents divisions by zero...
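Roughly what that does, illustrated in Python below. This is only a sketch of the idea, not wave2feat's actual implementation:

import numpy as np

# Add roughly +/- half a bit of random noise to 16-bit samples so that
# regions of pure digital silence no longer yield zero-energy frames
# (which is what leads to the divisions by zero / log(0) downstream).
def dither(samples, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.integers(-1, 2, size=samples.shape)           # -1, 0 or +1
    return np.clip(samples.astype(np.int32) + noise, -32768, 32767).astype(np.int16)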