CMU Sphinx / Forums / Help: Adapting Acoustic Spanish Model

Good morning,

I'd like to adapt the existing Spanish acoustic model "voxforge-es-0.2" using new recordings of my own. After reading the tutorial, this is what I have understood:

1.- I have to use the acoustic model files (mdef, feat.params, means, ...), plus the ".dic" file and the ".lm" file from VoxForge.

2.- I have to save the new studio recordings in the same folder, together with their corresponding "transcription" and "ids" files.

A) Is this correct, or am I wrong?

After that, I generated the ".mfc" files from the audio files without problems. Then, to collect statistics, I ran:

./bw \
-hmmdir voxforge_es_sphinx.cd_ptm_3000 \
-moddeffn voxforge_es_sphinx.cd_ptm_3000/mdef.txt \
-ts2cbfn .ptm. \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-cmn current \
-agc none \
-dictfn voxforge_es_sphinx.dic \
-ctlfn prueba_train.fileids \
-lsnfn prueba_train.transcription \
-accumdir .

since the "feat.params" file of voxforge is as follows:

-lowerf 130
-upperf 6800
-nfilt 25
-transform dct
-lifter 22
-feat 1s_c_d_dd
-svspec 0-12/13-25/26-38
-agc none
-cmn current
-varnorm no
-model ptm
-cmninit 40,3,3
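
To double-check that the feature flags passed to bw agree with feat.params, something like the following could be used. This is only a minimal sketch: the function names are hypothetical (not part of SphinxTrain), and it only compares the flags that appear both in the bw command and in feat.params above (-feat, -svspec, -cmn, -agc).

```python
def parse_flag_lines(text):
    """Parse lines of the form "-name value" into a dict {"-name": "value"}."""
    params = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Keep only lines that look like "-flag value".
        if len(parts) == 2 and parts[0].startswith("-"):
            params[parts[0]] = parts[1].strip()
    return params


def flag_mismatches(feat_params, command_flags, names):
    """Return {flag: (feat.params value, command value)} for flags that differ."""
    mismatches = {}
    for name in names:
        expected = feat_params.get(name)
        actual = command_flags.get(name)
        if expected != actual:
            mismatches[name] = (expected, actual)
    return mismatches
```

Calling parse_flag_lines on the contents of feat.params and passing the bw flags as a dict such as {"-feat": "1s_c_d_dd", "-cmn": "current"} returns an empty dict when the compared flags agree.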

In the sphinx_train.cfg of voxforge the following variable is used:

$CFG_HMM_TYPE = '.ptm'

During execution, many warnings appear, similar to the ones below:

WARN: Unable to lookup word 'aclaracion' in the dictionary
WARN: "next_utt_states.c": Unable to produce phonetic transcription for the utterance '<s> aclaracion </s>'
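
These warnings suggest that some words in the transcription are not in the dictionary. A minimal sketch for listing them is shown here. It assumes the usual layouts: one "<s> words </s> (utterance_id)" line per utterance in the transcription file, and "word phone phone ..." lines in the dictionary. The function names are hypothetical, and the comparison is exact, so a word written without an accent will not match an accented dictionary entry.

```python
import re


def load_dictionary_words(dict_text):
    """Return the set of words in a Sphinx-style dictionary (first token per line)."""
    words = set()
    for line in dict_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # Alternate pronunciations are written as "word(2)"; strip that suffix.
        words.add(re.sub(r"\(\d+\)$", "", tokens[0]))
    return words


def missing_words(transcription_text, dictionary_words):
    """Return the sorted transcription words that are absent from the dictionary."""
    missing = set()
    for line in transcription_text.splitlines():
        # Drop the trailing "(utterance_id)" if present.
        line = re.sub(r"\([^()]*\)\s*$", "", line)
        for word in line.split():
            if word in ("<s>", "</s>"):
                continue  # sentence markers are not dictionary words
            if word not in dictionary_words:
                missing.add(word)
    return sorted(missing)
```

It could be run on the contents of voxforge_es_sphinx.dic and prueba_train.transcription; every word it returns would need a dictionary entry (or a corrected spelling in the transcription) before running bw again.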

B) Is the process correct, or am I wrong?

Anyway, I continued following the tutorial. Finally, I used the MAP adaptation method:

./map_adapt \
-moddeffn voxforge_es_sphinx.cd_ptm_3000/mdef.txt \
-ts2cbfn .ptm. \
-meanfn voxforge_es_sphinx.cd_ptm_3000/means \
-varfn voxforge_es_sphinx.cd_ptm_3000/variances \
-mixwfn voxforge_es_sphinx.cd_ptm_3000/mixture_weights \
-tmatfn voxforge_es_sphinx.cd_ptm_3000/transition_matrices \
-accumdir . \
-mapmeanfn voxforge_es_sphinx.cd_ptm_3000_adapt/means \
-mapvarfn voxforge_es_sphinx.cd_ptm_3000_adapt/variances \
-mapmixwfn voxforge_es_sphinx.cd_ptm_3000_adapt/mixture_weights \
-maptmatfn voxforge_es_sphinx.cd_ptm_3000_adapt/transition_matrices

It ran with no errors. The output is shown below:

Current configuration:
[NAME] [DEFLT] [VALUE]
-accumdir .,
-bayesmean yes yes
-example no no
-fixedtau no no
-help no no
-mapmeanfn voxforge_es_sphinx.cd_ptm_3000_adapt/means
-mapmixwfn voxforge_es_sphinx.cd_ptm_3000_adapt/mixture_weights
-maptmatfn voxforge_es_sphinx.cd_ptm_3000_adapt/transition_matrices
-mapvarfn voxforge_es_sphinx.cd_ptm_3000_adapt/variances
-meanfn voxforge_es_sphinx.cd_ptm_3000/means
-mixwfn voxforge_es_sphinx.cd_ptm_3000/mixture_weights
-moddeffn voxforge_es_sphinx.cd_ptm_3000/mdef.txt
-mwfloor 0.00001 1.000000e-05
-tau 10.0 1.000000e+01
-tmatfn voxforge_es_sphinx.cd_ptm_3000/transition_matrices
-tpfloor 0.0001 1.000000e-04
-ts2cbfn .ptm.
-varfloor 0.00001 1.000000e-05
-varfn voxforge_es_sphinx.cd_ptm_3000/variances

INFO: s3gau_io.c(169): Read voxforge_es_sphinx.cd_ptm_3000/means [26x3x128 array]
INFO: s3gau_io.c(169): Read voxforge_es_sphinx.cd_ptm_3000/variances [26x3x128 array]
INFO: s3mixw_io.c(117): Read voxforge_es_sphinx.cd_ptm_3000/mixture_weights [3078x3x128 array]
INFO: s3tmat_io.c(118): Read voxforge_es_sphinx.cd_ptm_3000/transition_matrices [26x3x4 array]
INFO: main.c(433): Reading and accumulating observation counts from .
INFO: s3gau_io.c(386): Read ./gauden_counts with means with vars [26x3x128 vector arrays]
INFO: s3mixw_io.c(117): Read ./mixw_counts [3078x3x128 array]
INFO: s3tmat_io.c(118): Read ./tmat_counts [26x3x4 array]
INFO: main.c(78): Estimating tau hyperparameter from variances and observations
INFO: main.c(496): Reading voxforge_es_sphinx.cd_ptm_3000/mdef.txt
INFO: model_def_io.c(573): Model definition info:
INFO: model_def_io.c(574): 26506 total models defined (26 base, 26480 tri)
INFO: model_def_io.c(575): 106024 total states
INFO: model_def_io.c(576): 3078 total tied states
INFO: model_def_io.c(577): 78 total tied CI states
INFO: model_def_io.c(578): 26 total tied transition matrices
INFO: model_def_io.c(579): 4 max state/model
INFO: model_def_io.c(580): 4 min state/model
INFO: main.c(132): Re-estimating mixture weights using MAP
INFO: main.c(201): Re-estimating transition probabilities using MAP
INFO: main.c(534): Re-estimating means using Bayesian interpolation
INFO: main.c(540): Interpolating tau hyperparameter for PTM models
INFO: main.c(542): Re-estimating variances using MAP
INFO: s3gau_io.c(228): Wrote voxforge_es_sphinx.cd_ptm_3000_adapt/means [26x3x128 array]
INFO: s3gau_io.c(228): Wrote voxforge_es_sphinx.cd_ptm_3000_adapt/variances [26x3x128 array]
INFO: s3mixw_io.c(233): Wrote voxforge_es_sphinx.cd_ptm_3000_adapt/mixture_weights [3078x3x128 array]
INFO: s3tmat_io.c(176): Wrote voxforge_es_sphinx.cd_ptm_3000_adapt/transition_matrices [26x3x4 array]
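
According to the log above, map_adapt wrote only means, variances, mixture_weights and transition_matrices into the adapted folder. A minimal sketch for listing files (for example mdef.txt or feat.params) that exist in the original model folder but not in the adapted one is shown here. The function name is hypothetical, and the sketch does not decide which of those files a decoder actually needs.

```python
import os


def files_missing_from_adapted(base_dir, adapted_dir):
    """Return sorted names of regular files in base_dir that adapted_dir lacks."""
    base_files = {
        name
        for name in os.listdir(base_dir)
        if os.path.isfile(os.path.join(base_dir, name))
    }
    adapted_entries = set(os.listdir(adapted_dir))
    return sorted(base_files - adapted_entries)
```

For example, files_missing_from_adapted("voxforge_es_sphinx.cd_ptm_3000", "voxforge_es_sphinx.cd_ptm_3000_adapt") returns an empty list when every file of the original folder also exists in the adapted folder.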

C) Is everything OK? Is the "voxforge_es_sphinx.cd_ptm_3000_adapt" folder my new acoustic model?

Sorry for the inconvenience, and thanks in advance.

Alejandro
