Mean vector with zero entries

Help
osman b
2008-08-12
2012-09-22
  • osman b

    osman b - 2008-08-12

    Hi,
    I am trying to use Sphinx for a Turkish speaker identification task, but I ran into a problem during adaptation. In my experiment, I re-estimate the background HMMs with a few iterations of Baum-Welch, using a small amount of adaptation data from each speaker. When the amount of training data is very small (around 2-3 seconds of speech, so that some phones are never observed and others are observed at most twice) and only the mean vectors are re-estimated, some of the mean vectors in the HMMs become zero after re-estimation. What are the possible causes of, and solutions for, these zero mean vectors? I think they might be hurting identification performance in my experiment.

    An example of a problematic mean vector:
    density 0 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000 0.000e+000

    Any suggestions would be helpful.

    Thanks

     
    • Nickolay V. Shmyrev

      Bad things happen. I don't think it's sensible to re-estimate from such a limited amount of data.

      Btw, do you know about Alize?

      http://old.lia.univ-avignon.fr/heberges/ALIZE/#aboutAlize
      http://old.lia.univ-avignon.fr/heberges/ALIZE/Doc/alize_v1.21.tar.gz

       
    • David Huggins-Daines

      Hi! I think Sphinx, and in fact HMMs in general, are not well suited to this task given the small amount of training data you are dealing with.

      If your task is text-dependent, I suggest using Dynamic Time Warping on LDA-transformed features (LDA is very helpful here because it not only adds discrimination but also normalizes variances, which is important for DTW).

      Another option for text-independent speaker recognition is to use phoneme decoding and language models, as in Qin Jin's thesis: http://www.lti.cs.cmu.edu/Research/Thesis/QinJin.pdf
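
For readers unfamiliar with the Dynamic Time Warping suggestion above, here is a minimal sketch of the basic DTW distance between two feature sequences. It is only an illustration: the function name `dtw_distance` and the toy sequences are hypothetical, the frames are assumed to be already transformed (the LDA step is not shown), and plain Euclidean distance between frames is an assumption.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Return the DTW alignment cost between two sequences of feature vectors.

    Each sequence is a list of equal-length vectors (lists of floats).
    The local cost is the Euclidean distance between frames (an assumption).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Allow a step in either sequence alone, or in both at once.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Toy one-dimensional "features": the second sequence repeats one frame.
template = [[0.0], [1.0], [2.0]]
utterance = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(template, utterance))
```

In a text-dependent setup, one would presumably compare a test utterance against each speaker's stored template and pick the speaker with the smallest distance.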

       
    • osman b

      osman b - 2008-08-13

      Thank you for your guidance. I will be looking at these options, too.

      In fact, my question was this: is there a command-line option in Sphinx to set the minimum number of training examples required to update model parameters? If the actual number falls below this value, the parameters are not updated and the original parameters are carried over to the new model. (I know that HTK has such an option.) I ask because, as I can see in the output model definition from Sphinx, the mean vectors of HMM states that are never observed during training become zero.

      Do you have any suggestions for overcoming this problem?

      Thanks for your help
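
The behaviour asked about above can be sketched outside of any toolkit. The snippet below is not Sphinx or HTK code. It is a minimal illustration, with hypothetical names (`reestimate_means`, `min_occupancy`), of a mean update that falls back to the original mean when a density's accumulated occupancy is below a threshold. It assumes the usual Baum-Welch accumulators are available: a posterior-weighted sum of observations and a total occupancy count per density. A density that was never observed has zero in both, which is one plausible way an all-zero mean could end up in the output.

```python
def reestimate_means(old_means, weighted_sums, occupancies, min_occupancy=1.0):
    """Return updated mean vectors, one per density.

    old_means     -- original mean vectors (list of lists of floats)
    weighted_sums -- posterior-weighted sums of observations per density
    occupancies   -- total posterior occupancy per density
    min_occupancy -- hypothetical threshold below which the old mean is kept
    """
    new_means = []
    for old, wsum, occ in zip(old_means, weighted_sums, occupancies):
        if occ < min_occupancy:
            # Too little evidence: keep the background-model mean.
            new_means.append(list(old))
        else:
            # Standard update: weighted sum divided by occupancy.
            new_means.append([s / occ for s in wsum])
    return new_means


# Toy example: density 0 was never observed, density 1 was seen 4 times.
old_means = [[1.0, 2.0], [3.0, 4.0]]
weighted_sums = [[0.0, 0.0], [10.0, 12.0]]
occupancies = [0.0, 4.0]
updated = reestimate_means(old_means, weighted_sums, occupancies, min_occupancy=2.0)
print(updated)  # [[1.0, 2.0], [2.5, 3.0]]
```

With the threshold in place, the unobserved density keeps its original mean instead of collapsing to zero, while the observed one is updated normally.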

       
