I have created a C++ application to train/adapt the default en-us acoustic model using 20 Arctic speech files that are in wav format. The sample rate is 16000 Hz. I successfully created the .fileids and .transcription files and followed the steps described here: http://cmusphinx.sourceforge.net/wiki/tutorialadapt
The adaptation process seemed to run smoothly, but I could not find any improvement in decoding accuracy. I tried both MLLR and MAP adaptation for the best possible accuracy, but to no avail. Then I noticed something in the MLLR run: it says "Estimation of 0th regression in MLLR failed.. Estimation of 1st regression in MLLR failed.." and so on. I have copied my mllr_solve command-line output and pasted it below for reference:
Current configuration:
[NAME]          [DEFLT]         [VALUE]
-accumdir                       .,
-cb2mllrfn      .1cls.          .1cls.
-cdonly         no              no
-example        no              no
-fullvar        no              no
-help           no              no
-meanfn                         en-us/means
-mllradd        yes             yes
-mllrmult       yes             yes
-moddeffn
-outmllrfn                      mllr_matrix
-varfloor       1e-3            1.000000e-03
-varfn                          en-us/variances

INFO: main.c(382): -- 1. Read input mean, (var) and accumulation.
INFO: s3gau_io.c(169): Read en-us/means [42x3x128 array]
INFO: main.c(397): Reading and accumulating counts from .
INFO: s3gau_io.c(386): Read ./gauden_counts with means with vars [42x3x128 vector arrays]
INFO: main.c(436): -- 2. Read cb2mllrfn
INFO: main.c(455): n_mllr_class = 1
INFO: main.c(475): -- 3. Calculate mllr matrices
INFO: main.c(127):
INFO: main.c(128): ---- mllr_solve(): Conventional MLLR method
INFO: s3gau_io.c(169): Read en-us/variances [42x3x128 array]
INFO: main.c(208): ---- A. Accum regl, regr
INFO: main.c(209): No classes 1, no. stream 3
INFO: main.c(281): ---- B. Compute MLLR matrices (A,B)
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: mllr.c(186): Estimation of 0th regression in MLLR failed
INFO: mllr.c(186): Estimation of 1th regression in MLLR failed
INFO: mllr.c(186): Estimation of 2th regression in MLLR failed
INFO: mllr.c(186): Estimation of 3th regression in MLLR failed
INFO: mllr.c(186): Estimation of 4th regression in MLLR failed
INFO: mllr.c(186): Estimation of 5th regression in MLLR failed
INFO: mllr.c(186): Estimation of 6th regression in MLLR failed
INFO: mllr.c(186): Estimation of 7th regression in MLLR failed
INFO: mllr.c(186): Estimation of 8th regression in MLLR failed
INFO: mllr.c(186): Estimation of 9th regression in MLLR failed
INFO: mllr.c(186): Estimation of 10th regression in MLLR failed
INFO: mllr.c(186): Estimation of 11th regression in MLLR failed
INFO: mllr.c(186): Estimation of 12th regression in MLLR failed
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: mllr.c(186): Estimation of 0th regression in MLLR failed
INFO: mllr.c(186): Estimation of 1th regression in MLLR failed
INFO: mllr.c(186): Estimation of 2th regression in MLLR failed
INFO: mllr.c(186): Estimation of 3th regression in MLLR failed
INFO: mllr.c(186): Estimation of 4th regression in MLLR failed
INFO: mllr.c(186): Estimation of 5th regression in MLLR failed
INFO: mllr.c(186): Estimation of 6th regression in MLLR failed
INFO: mllr.c(186): Estimation of 7th regression in MLLR failed
INFO: mllr.c(186): Estimation of 8th regression in MLLR failed
INFO: mllr.c(186): Estimation of 9th regression in MLLR failed
INFO: mllr.c(186): Estimation of 10th regression in MLLR failed
INFO: mllr.c(186): Estimation of 11th regression in MLLR failed
INFO: mllr.c(186): Estimation of 12th regression in MLLR failed
INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
INFO: mllr.c(186): Estimation of 0th regression in MLLR failed
INFO: mllr.c(186): Estimation of 1th regression in MLLR failed
INFO: mllr.c(186): Estimation of 2th regression in MLLR failed
INFO: mllr.c(186): Estimation of 3th regression in MLLR failed
INFO: mllr.c(186): Estimation of 4th regression in MLLR failed
INFO: mllr.c(186): Estimation of 5th regression in MLLR failed
INFO: mllr.c(186): Estimation of 6th regression in MLLR failed
INFO: mllr.c(186): Estimation of 7th regression in MLLR failed
INFO: mllr.c(186): Estimation of 8th regression in MLLR failed
INFO: mllr.c(186): Estimation of 9th regression in MLLR failed
INFO: mllr.c(186): Estimation of 10th regression in MLLR failed
INFO: mllr.c(186): Estimation of 11th regression in MLLR failed
INFO: mllr.c(186): Estimation of 12th regression in MLLR failed
INFO: main.c(497): -- 4. Store mllr matrices (A,B) to mllr_matrix
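Since my real question is whether accuracy improved at all, I am also trying to score the decoder output numerically instead of eyeballing it. Here is a sketch using the word_align.pl scoring script that sphinxtrain ships; the install path and the hypothesis.txt file name are assumptions on my part, so adjust them to your setup:

```shell
# Sketch: score decoder hypotheses against the reference transcription using
# sphinxtrain's word_align.pl. ALIGN is an assumed install path; hypothesis.txt
# is a placeholder for a file of decoded sentences, one per utterance.
ALIGN=/usr/local/libexec/sphinxtrain/word_align.pl
if [ -f "$ALIGN" ]; then
    perl "$ALIGN" "$PSt_TRANSCRIPTION" hypothesis.txt
else
    echo "word_align.pl not found at $ALIGN; adjust the path for your install"
fi
```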
I am also not quite sure about the model type that I am using. I checked the feat.params file and saw
-feat 1s_c_d_dd
and
-model ptm
in there. So am I using a continuous or a phonetically tied model?
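The -model value can at least be pulled out of feat.params directly. A minimal sketch; the sample feat.params below just mirrors the two lines quoted above, so point the awk at your real model directory instead:

```shell
# Sketch: write a sample feat.params mirroring the quoted values, then read
# the -model setting from it. Replace the sample with your real feat.params.
cat > feat.params <<'EOF'
-feat 1s_c_d_dd
-model ptm
EOF
# Print the second field of the line whose first field is "-model".
model_type=$(awk '$1 == "-model" {print $2}' feat.params)
echo "model type: $model_type"
```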
I included all the adaptation commands in a shell script and have pasted it here for reference. Please note that the script below saves the newly adapted model in a folder of the user's choice, exported into the environment as the PSt_ADAPTED_MODEL_PATH variable. All the wav files, the copied acoustic and language models, and the dictionary are saved and processed in the path given by the user (PSt_TRAINING_WORKSPACE).
#!/bin/bash
#<==============================================================
# This shell script runs configuration commands for training
# the acoustic model of pocketsphinx. Training before decoding
# the speech samples is not mandatory.
#<==============================================================
# Compiling ps trainer package.
cd $PSt_TRAINING_WORKSPACE
echo $"Copying model from pocketsphinx source to the working directory"
cp -a /usr/local/share/pocketsphinx/model/en-us/en-us .
echo $"Copying model: SUCCESSFUL"
echo $"Copying dictionary from pocketsphinx source to the working directory"
cp -a /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict .
echo $"Copying dictionary: SUCCESSFUL"
echo $"Copying lm from pocketsphinx source to the working directory"
cp -a /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin .
echo $"Copying lm: SUCCESSFUL"

# Generating acoustic feature files
sphinx_fe -argfile $PSt_FEAT_PARAMS \
    -samprate 16000 -c $PSt_FILEID \
    -di . -do . -ei wav -eo mfc -mswav yes

# Convert the compressed mdef into mdef.txt
pocketsphinx_mdef_convert -text en-us/mdef en-us/mdef.txt

# Copy bw, map_adapt and mk_s2sendump
cp -a /usr/local/libexec/sphinxtrain/bw .
cp -a /usr/local/libexec/sphinxtrain/map_adapt .
cp -a /usr/local/libexec/sphinxtrain/mk_s2sendump .

# Accumulating observation counts
./bw \
    -hmmdir en-us \
    -moddeffn en-us/mdef.txt \
    -ts2cbfn .ptm. \
    -feat 1s_c_d_dd \
    -svspec 0-12/13-25/26-38 \
    -cmn current \
    -agc none \
    -dictfn cmudict-en-us.dict \
    -ctlfn $PSt_FILEID \
    -lsnfn $PSt_TRANSCRIPTION \
    -accumdir .

# Copy the model once again so that we can overwrite it
cp -a en-us en-us-adapt

# Copy mllr_solve executable first
cp -a /usr/local/libexec/sphinxtrain/mllr_solve .

# Create MLLR matrix for improved accuracy
./mllr_solve \
    -meanfn en-us/means \
    -varfn en-us/variances \
    -outmllrfn mllr_matrix -accumdir .

# Use MAP adaptation technique
./map_adapt \
    -moddeffn en-us/mdef.txt \
    -ts2cbfn .ptm. \
    -meanfn en-us/means \
    -varfn en-us/variances \
    -mixwfn en-us/mixture_weights \
    -tmatfn en-us/transition_matrices \
    -accumdir . \
    -mapmeanfn en-us-adapt/means \
    -mapvarfn en-us-adapt/variances \
    -mapmixwfn en-us-adapt/mixture_weights \
    -maptmatfn en-us-adapt/transition_matrices

# Moving the adapted model, dictionary and language model to the desired location
mkdir $PSt_ADAPTED_MODEL_PATH
cd $PSt_ADAPTED_MODEL_PATH
mkdir en-us
cd en-us
cp -a /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict .
cp -a /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin .
mkdir en-us
cd en-us
cp -a $PSt_TRAINING_WORKSPACE"en-us-adapt/" .

# Come back to Training workspace and delete unnecessary things
cd $PSt_TRAINING_WORKSPACE
# Delete the copied model
rm cmudict-en-us.dict
rm en-us.lm.bin
rm -r en-us
rm -r en-us-adapt
rm bw
rm map_adapt
rm mllr_solve
rm mk_s2sendump
rm gauden_counts
rm mixw_counts
# rm mllr_matrix
rm tmat_counts
find . -type f -name '*.mfc' -delete
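For reference, this is a sketch of the decode command I would expect to exercise the adapted model with. It is echoed rather than executed so the paths can be inspected first; ADAPTED mirrors the directory layout the script above creates, but verify where en-us-adapt actually lands on your machine:

```shell
# Sketch: build the decode command for the MAP-adapted model and echo it for
# inspection. The default path and test.wav are placeholders.
ADAPTED="${PSt_ADAPTED_MODEL_PATH:-./adapted}/en-us"
decode_cmd="pocketsphinx_continuous \
  -hmm $ADAPTED/en-us \
  -lm $ADAPTED/en-us.lm.bin \
  -dict $ADAPTED/cmudict-en-us.dict \
  -infile test.wav"
echo "$decode_cmd"
```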
I also read that for a continuous model we need to include the MLLR matrix on the command line, like -mllr mllr_matrix, but we also need to change the model directory to the MAP-adapted model path, right?
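From what I read, the decode-time MLLR usage for a continuous model would look something like the sketch below. It is only echoed here, not run; the file names match my adaptation script above, and whether -hmm should point at the base or the MAP-adapted model is exactly what I am unsure about:

```shell
# Sketch: decode-time MLLR usage as I understand it; -mllr supplies the
# transform alongside the model directory. Echoed for inspection only.
mllr_cmd="pocketsphinx_continuous \
  -hmm en-us-adapt \
  -mllr mllr_matrix \
  -lm en-us.lm.bin \
  -dict cmudict-en-us.dict \
  -infile test.wav"
echo "$mllr_cmd"
```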
Any help would be highly appreciated!
in there. So am I using a continuous or a phonetically tied model?
The default en-us model is phonetically tied. You need to use MAP adaptation for it, not MLLR.
I also read that for a continuous model we need to include the MLLR matrix on the command line, like -mllr mllr_matrix, but we also need to change the model directory to the MAP-adapted model path, right?
You apply either MAP or MLLR; joint MAP+MLLR is not covered by the tutorial.