Hello,
I'm trying to evaluate a two-pass system with unsupervised adaptation: first
the means with MLLR, and later the means and variances with CMLLR (inspired by
the Python script "mllr.py" and the paper by M.J.F. Gales, "Maximum Likelihood
Linear Transformations for HMM-Based Speech Recognition").
I got a warning from 'bw' when I collected statistics from the first-pass
hypotheses, about 1000 s of speech per speaker:
"WARNING: "accum.c", line 627: Over 500 senones never occur in the input data. This is normal for CD untied training, but could indicate a serious problem otherwise."
I understand that some senones don't occur, and I hope this is normal (the
adaptation is unsupervised; I don't know which text is spoken).
But I need to be sure in order to (in)validate my implementation of CMLLR in
Python, since the WER is a little higher with CMLLR adaptation (and with MLLR
too).
Thanks in advance for your response,
Stephan
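One way to sanity-check a CMLLR implementation independently of WER is to confirm that the feature-space and model-space forms of the transform give the same likelihood. The sketch below does this for a single diagonal-covariance Gaussian and, as a simplifying assumption, a diagonal transform (real CMLLR uses a full matrix per regression class). All function names are hypothetical and are not taken from mllr.py or SphinxTrain.

```python
import math


def diag_gauss_loglik(x, mean, var):
    """Log-density of x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def cmllr_feature_loglik(x, mean, var, a, b):
    """Feature-space form: transform the observation, add the log-Jacobian."""
    x_t = [ai * xi + bi for ai, xi, bi in zip(a, x, b)]
    log_jac = sum(math.log(abs(ai)) for ai in a)
    return diag_gauss_loglik(x_t, mean, var) + log_jac


def cmllr_model_loglik(x, mean, var, a, b):
    """Model-space form: transform mean and variance, leave x untouched."""
    mean_t = [(m - bi) / ai for m, ai, bi in zip(mean, a, b)]
    var_t = [v / (ai * ai) for v, ai in zip(var, a)]
    return diag_gauss_loglik(x, mean_t, var_t)


# Toy values, chosen only for illustration.
x = [0.3, -1.2]
mean = [0.0, 1.0]
var = [1.5, 0.5]
a = [1.1, 0.8]   # diagonal of the (assumed diagonal) transform
b = [0.2, -0.1]  # bias

feat = cmllr_feature_loglik(x, mean, var, a, b)
model = cmllr_model_loglik(x, mean, var, a, b)
assert abs(feat - model) < 1e-9  # both forms must agree
```

If the two values disagree in a full-matrix implementation, a missing or wrongly signed log-determinant term is a common suspect. An identity transform (a = 1, b = 0) should also leave the likelihood unchanged.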
In trunk it reads:
E_WARN("Over 500 senones never occur in the input data. "
"This is normal for context-dependent untied senone training or for adaptation, "
"but could indicate a serious problem otherwise.\n");
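To judge whether the warning matters for a given speaker, it can help to look at how many senones actually received data. The following is only a rough sketch: it assumes the per-senone accumulated counts have already been loaded into a plain list (reading the 'bw' accumulator files is not shown), and it is not the logic in accum.c. The function names and the threshold parameter are hypothetical.

```python
def unseen_senones(occupancy):
    """Return indices of senones whose accumulated count is zero (or negative)."""
    return [i for i, count in enumerate(occupancy) if count <= 0.0]


def coverage_summary(occupancy, warn_threshold=500):
    """Summarise how many senones were never observed.

    warn_threshold mirrors the "Over 500" in the warning text; it is a
    parameter here only for illustration.
    """
    unseen = unseen_senones(occupancy)
    total = len(occupancy)
    return {
        "total": total,
        "unseen": len(unseen),
        "unseen_fraction": len(unseen) / total if total else 0.0,
        "over_threshold": len(unseen) > warn_threshold,
    }


# Toy example: 4 senones, two of which never received any data.
toy_counts = [0.0, 12.5, 0.0, 3.0]
summary = coverage_summary(toy_counts, warn_threshold=1)
```

With a small amount of unsupervised adaptation data per speaker, a large unseen fraction is expected, which matches the "or for adaptation" wording in the trunk message.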
Ok thanks a lot.
I'll be back with my implementation of CMLLR in Python.
Would you be interested in getting the current code and trying to improve it with me?
We are very interested in contributions!
OK, but it's not really a contribution yet, because I'm not 100% sure of my code
(the WER increases a little after adaptation)...
It's more of an "initialization".
I'll do some more tests and come back.
Bye,
Stephan
Dear Stephan,
The sooner you show the code, the sooner we can make it work. :) It's a
chicken-and-egg problem.
OK, thanks Nickolay. The file can be downloaded here:
http://www.mediafire.com/?588en6s1vphs0bd
Has two-pass decoding been added to Sphinx yet?
By two-pass decoding, I mean that for each utterance we first perform MLLR
adaptation and then decode.
This feature is not supported yet.
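Since the decoder does not do this itself, the per-utterance scheme described in the question would have to be driven from outside. Below is a minimal sketch of that control flow, assuming the caller supplies two hypothetical callables: decode(utterance, means), which returns a hypothesis, and estimate_mllr(utterance, hypothesis, means), which returns a matrix A and bias b. Neither is a Sphinx API. Only the mean update mu' = A*mu + b is standard MLLR.

```python
def apply_mllr_to_means(means, A, b):
    """Return adapted means, mu' = A * mu + b, for every Gaussian mean."""
    adapted = []
    for mu in means:
        adapted.append([
            sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


def two_pass_decode(utterance, means, decode, estimate_mllr):
    """Decode once, adapt the means on that hypothesis, then decode again.

    decode and estimate_mllr are placeholders supplied by the caller.
    """
    first_hyp = decode(utterance, means)               # pass 1: unadapted
    A, b = estimate_mllr(utterance, first_hyp, means)  # unsupervised estimate
    adapted_means = apply_mllr_to_means(means, A, b)
    return decode(utterance, adapted_means)            # pass 2: adapted


# Toy check of the mean update with a 2-dimensional transform.
means = [[1.0, 2.0], [0.0, -1.0]]
A = [[1.0, 0.0], [0.0, 2.0]]
b = [0.5, 0.0]
adapted = apply_mllr_to_means(means, A, b)
```

Because the first-pass hypothesis contains errors, a transform estimated from a single short utterance can be unreliable. That may be one reason to pool data per speaker, as in the setup described at the top of the thread.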