CMU Sphinx / Forums / Help: Mean vector with zero entries

osman b - 2008-08-12

Hi,
I am trying to use Sphinx for a Turkish speaker identification task, but I have run into a problem during adaptation. In my speaker identification experiment, I re-estimate the background HMMs using a small amount of adaptation data from each speaker with a few iterations of Baum-Welch. However, when the amount of training data is very small (around 2-3 seconds of speech, so that some phones are never observed and others are observed at most twice in the database) and only the mean vectors are re-estimated, some of the mean vectors in the HMM become zero after re-estimation. I would like to ask about possible reasons and solutions for such mean vectors, since I think they might be hurting identification performance in my experiment.

An example of a problematic mean vector:
density 0 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000

Any suggestions would be helpful.

Thanks

- Nickolay V. Shmyrev - 2008-08-12
  
  Bad things happen. I don't think it's sensible to re-estimate from such a limited amount of data.
  
  Btw, do you know about Alize?
  
  http://old.lia.univ-avignon.fr/heberges/ALIZE/#aboutAlize
  http://old.lia.univ-avignon.fr/heberges/ALIZE/Doc/alize_v1.21.tar.gz
  
- David Huggins-Daines - 2008-08-12
  
  Hi! I think Sphinx, and in fact HMMs in general, are not well suited to this task given the small amount of training data you are dealing with.
  
  If your task is text-dependent I suggest using Dynamic Time Warping on LDA-transformed features (LDA is very helpful here because it not only adds discrimination, it normalizes variances, which is important for DTW).
  
  Another option for text-independent speaker recognition is to use phoneme decoding and language models, as in Qin Jin's thesis: http://www.lti.cs.cmu.edu/Research/Thesis/QinJin.pdf
  
- osman b - 2008-08-13
  
  Thank you for your guidance. I will be looking at these options, too.
  
  In fact, my question was this: is there a command-line option in Sphinx to set the minimum number of training examples required to update model parameters? If the actual number falls below this value, the parameters are not updated and the original parameters are carried over to the new model. (I know that HTK has such an option.) I ask because, as far as I can see in the model definition that Sphinx outputs, the mean vectors of HMM states that are never observed during training become zero.
  
  Do you have any suggestions for overcoming this problem?
  
  Thanks for your help
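
The fallback behaviour asked about above can be sketched outside Sphinx as follows. This is a minimal illustration, not Sphinx's or HTK's implementation: it assumes the per-density accumulators from Baum-Welch are available as arrays (the posterior-weighted feature sums and the occupancy counts), and the function name `update_means` and the threshold `min_count` are hypothetical. A density that is never observed has a zero sum and a zero occupancy, which would be consistent with the all-zero mean vector shown earlier if the trainer writes zeros in that case.

```python
import numpy as np

def update_means(prior_means, weighted_sums, occupancies, min_count=1.0):
    """Re-estimate Gaussian means, keeping the prior mean when data is scarce.

    prior_means:   (n_densities, dim) means of the background model
    weighted_sums: (n_densities, dim) sum over frames of posterior * feature
    occupancies:   (n_densities,)     sum over frames of posterior
    min_count:     hypothetical threshold; below it the prior mean is kept
    """
    prior_means = np.asarray(prior_means, dtype=float)
    weighted_sums = np.asarray(weighted_sums, dtype=float)
    occupancies = np.asarray(occupancies, dtype=float)

    # Start from the background means so unseen densities stay unchanged.
    new_means = prior_means.copy()

    # Update only densities with enough (and strictly positive) occupancy.
    enough = (occupancies >= min_count) & (occupancies > 0)
    new_means[enough] = weighted_sums[enough] / occupancies[enough, None]
    return new_means

# Density 0 was observed (occupancy 5); density 1 was never observed.
prior = [[1.0, 2.0], [3.0, 4.0]]
sums = [[10.0, 20.0], [0.0, 0.0]]
occ = [5.0, 0.0]
print(update_means(prior, sums, occ, min_count=2.0))
```

In this example the first density is re-estimated from its accumulators, while the second keeps its background mean instead of collapsing to zero.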
  