Hello,
I have a few questions about MLLR adaptation with Sphinx 3. I was able to build a continuous model with SphinxTrain using the AN4 database with WER less than 20%. I was also able to train a model with my own small set of data and get around the same WER. My next step was to try to adapt the AN4 model with MLLR.
Here's what I did so far:
1. Single iteration of Baum-Welch with the baseline model using the adaptation data
2. Create an MLLR matrix file using mllr_solve
3. Use mllr_transform to create a new means file using the matrix from step 2.
4. Decode with the adapted means
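For reference, step 3 is just an affine map on each Gaussian mean: the MLLR transform W = [A | b] estimated by mllr_solve is applied as mu' = A*mu + b. A minimal sketch of that math in plain Python (this only illustrates the transform; it does not read Sphinx's binary means or matrix file formats):

```python
# Sketch of the math behind step 3: apply an MLLR transform
# W = [A | b] to every Gaussian mean, mu' = A*mu + b.
# Illustration only -- Sphinx's file formats are not handled here.

def apply_mllr(means, A, b):
    """Return adapted means: mu' = A*mu + b for each mean vector."""
    adapted = []
    for mu in means:
        mu_new = [
            sum(A[i][j] * mu[j] for j in range(len(mu))) + b[i]
            for i in range(len(b))
        ]
        adapted.append(mu_new)
    return adapted

# Toy example: 2-dimensional features, identity rotation plus a shift.
means = [[1.0, 2.0], [3.0, 4.0]]
A = [[1.0, 0.0], [0.0, 1.0]]   # d x d rotation/scaling part
b = [0.5, -0.5]                # d-dimensional bias part

print(apply_mllr(means, A, b))  # [[1.5, 1.5], [3.5, 3.5]]
```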
When I use this method I go from an accuracy of about 46% to about 52%. So I am thinking that there is something wrong with my method. Since I am using a small set of adaptation data, I decided not to use the mk_mllr_class method.
My main question is how do I use the -mllrctlfn parameter when decoding with s3decode? Is it supposed to be a ctl file for my adaptation feature files, or something else?
I get the following error when trying to use the -mllrctlfn parameter.
WARNING no -matchsegfn argument
Error reading MLLR file .....001.mfc
mllr_read_regmat failed
I verified that the 001.mfc file exists, but from the error I don't believe the function actually wants feature files.
Finally, is the method above correct for doing adaptation? I'm still new to the whole training side of things, and I'm hoping I'm on the right track.
Thank you for any help/advice,
Eric
Ok, I figured out the -mllrctlfn parameter. It's for doing the adaptation on-line, meaning you can specify the matrix from mllr_solve for each file you are decoding. It threw me off because I had assumed that each file would use the same matrix for adaptation. So I believe this parameter is not needed if you are using mllr_transform to adapt a new means file. When using the -mllrctlfn parameter and specifying the output matrix for each file, I get approximately the same increase in accuracy.
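To make the two modes concrete, here is a rough Python sketch (file names and the mapping structure are hypothetical; check the decoder's documentation for the actual -mllrctlfn file format): offline adaptation bakes one transform into the means with mllr_transform, while the control-file route looks up a matrix per decoded utterance, and listing the same matrix for every file reproduces the global case.

```python
# Illustration of global vs. per-utterance MLLR matrix selection.
# Names are made up; Sphinx's actual -mllrctlfn format may differ.

GLOBAL_MATRIX = "speaker1.mllr"

# A control-style mapping: utterance id -> MLLR matrix file.
per_utt = {
    "001": "speaker1.mllr",
    "002": "speaker1.mllr",
}

def matrix_for(utt_id, table, default=GLOBAL_MATRIX):
    """Pick the transform to use for one decoded file."""
    return table.get(utt_id, default)

print(matrix_for("001", per_utt))  # per-utterance lookup
print(matrix_for("999", {}))       # falls back to the global matrix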
The question still remains: how much of an increase in accuracy should I be seeing when using MLLR adaptation? Here are some of my results so far.
Word Accuracy:
AN4 Trained Model: ~85%
Own data (subset of AN4) Model: ~70%
AN4 Model tested with my own data: ~45%
AN4 MLLR Adapted Model: ~55%
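One way to put the jump in perspective: going from 45% to 55% word accuracy is a drop in WER from 55% to 45%, i.e. roughly an 18% relative WER reduction. That is a rule-of-thumb ballpark often quoted for a single global MLLR transform, not a Sphinx-specific figure:

```python
# Relative WER reduction implied by the accuracy numbers above.
def relative_wer_reduction(acc_before, acc_after):
    wer_before = 100.0 - acc_before
    wer_after = 100.0 - acc_after
    return (wer_before - wer_after) / wer_before * 100.0

print(round(relative_wer_reduction(45.0, 55.0), 1))  # 18.2
```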
So I know the data I recorded and trained on myself is reasonably good, as I get 70% word accuracy. However, I don't know if I should expect better results from the adaptation than a jump from 45% to 55% word accuracy. Has anyone had better success than I have? I'm thinking that maybe I have done something wrong in my Baum-Welch step.
Thanks for any advice/help,
Eric
PS Sorry for replying to my own post...
Hi Eric,
In your case, I think what you did is correct, and I am glad that you got positive results from speaker adaptation.
We don't have very thorough MLLR adaptation support yet. Sphinx's speaker adaptation routine is still in its infancy: some important features, such as MLLR regression classes and MAP, are not yet implemented. I recommend you read a draft Sphinx 3 document written by me; it will be updated regularly.
http://www-2.cs.cmu.edu/~archan/documentation/chapter9.ps
Arthur
Arthur,
Thank you for the comments and the link. After reading the document I am still unclear on one thing.
After I have done the single iteration of Baum-Welch on the adaptation data, at what point in the sphinx3 training do I use mllr_solve? I have done mllr_solve on the CI models and the CD untied models; however, with any of the CD tied models I try, I get an "n_mgau mismatch" error.
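For anyone hitting the same wall: an "n_mgau mismatch" presumably means the accumulators from the Baum-Welch pass and the means file handed to mllr_solve disagree on the number of Gaussian mean vectors, so checking that count first can localize the problem. A rough sketch of the kind of consistency check involved (pure Python; the real files are binary and the parameter names here are illustrative):

```python
# Illustrative consistency check behind an "n_mgau mismatch"-style error:
# transform estimation needs the accumulator and the model to agree on
# how many mean vectors exist (parameter names here are made up).

def check_compatible(model_n_mgau, accum_n_mgau, feat_dim, matrix_rows):
    if model_n_mgau != accum_n_mgau:
        raise ValueError(
            f"n_mgau mismatch: model has {model_n_mgau}, "
            f"accumulator has {accum_n_mgau}"
        )
    if matrix_rows != feat_dim:
        raise ValueError("MLLR matrix does not match the feature dimension")
    return True

print(check_compatible(model_n_mgau=8, accum_n_mgau=8,
                       feat_dim=13, matrix_rows=13))  # True
```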
Eric