CMU Sphinx / Forums / Help: Force-aligned transcripts

Thanks, I read up on what you suggested, and I am now trying to train a
model with force-aligned transcripts, but I get an error at the
03.force_align step.

These are the last lines in the log (the last INFO message and the failed assertion):

INFO: main_align.c(919): codDef-art001a: 2360 input frames
lt-sphinx3_align: s3_align.c:927: align_build_sent_hmm: Assertion `stail.predlist' failed.

However, an error appears earlier in the log:

ERROR: "main_align.c", line 851: Uttid mismatch: ctlfile = "codDef-art001a"; transcript = "CodDefConsumidor16k/codDef-art001a"

codDef-art001a is the first utterance in my train.fileids file, and this is
not the first time I have used it for training; the other programs that use
it have no problem with it. The fact is, I do not know what is happening.
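
To narrow down the Uttid mismatch above, the IDs in the two files can be compared outside the trainer. The following is only a rough diagnostic sketch, not how sphinx3_align itself does the comparison. It assumes that each control-file line starts with a path such as CodDefConsumidor16k/codDef-art001a, that the ID checked is the last path component (this is inferred from the log, where the ctlfile ID has no directory but the .mfc path does), and that each transcript line ends with its ID in parentheses. The function names and the sample sentence are made up for illustration.

```python
import re

# Assumption: a transcript line ends with "(uttid)".
_UTTID_RE = re.compile(r"\(([^()\s]+)\)\s*$")


def ctl_ids(ctl_lines):
    """Return the last path component of the first field of each control line."""
    ids = []
    for line in ctl_lines:
        fields = line.split()
        if fields:  # skip blank lines
            ids.append(fields[0].rsplit("/", 1)[-1])
    return ids


def transcript_ids(transcript_lines):
    """Return the trailing "(uttid)" of each transcript line, or None if absent."""
    ids = []
    for line in transcript_lines:
        if not line.strip():  # skip blank lines
            continue
        match = _UTTID_RE.search(line)
        ids.append(match.group(1) if match else None)
    return ids


def find_mismatches(ctl_lines, transcript_lines):
    """Pair the two files line by line and list (line_no, ctl_id, transcript_id) that differ."""
    pairs = zip(ctl_ids(ctl_lines), transcript_ids(transcript_lines))
    return [(n, c, t) for n, (c, t) in enumerate(pairs, start=1) if c != t]


if __name__ == "__main__":
    # Hypothetical one-line files that mirror the IDs in the error message.
    ctl = ["CodDefConsumidor16k/codDef-art001a"]
    transcript = ["<s> texto de exemplo </s> (CodDefConsumidor16k/codDef-art001a)"]
    for line_no, ctl_id, trans_id in find_mismatches(ctl, transcript):
        print(f"line {line_no}: ctl={ctl_id!r} transcript={trans_id!r}")
```

If every line is reported with the same directory prefix on the transcript side, the transcript IDs probably carry the directory name while the control-file IDs do not.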

Here is the entire log:

INFO:   Initialization of the log add table
INFO:   Log-Add table size = 29356 x 2 >> 0
INFO:   
INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO:   Reading HMM in Sphinx 3 Model format
INFO:   Model Definition File: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_architecture/db.falign_ci.mdef
INFO:   Mean File: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/means
INFO:   Variance File: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/variances
INFO:   Mixture Weight File: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/mixture_weights
INFO:   Transition Matrices File: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/transition_matrices
INFO: mdef.c(683): Reading model definition: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_architecture/db.falign_ci.mdef
INFO:   Initialization of mdef_t, report:
INFO:   38 CI-phone, 0 CD-phone, 3 emitstate/phone, 114 CI-sen, 114 Sen, 38 Sen-Seq
INFO:   
INFO: kbcore.c(299): Using optimized GMM computation for Continuous HMM, -topn will be ignored
INFO: cont_mgau.c(164): Reading mixture gaussian file '/home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/means'
INFO: cont_mgau.c(423): 114 mixture Gaussians, 1 components, 1 streams, veclen 39
INFO: cont_mgau.c(164): Reading mixture gaussian file '/home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/variances'
INFO: cont_mgau.c(423): 114 mixture Gaussians, 1 components, 1 streams, veclen 39
INFO: cont_mgau.c(524): Reading mixture weights file '/home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/mixture_weights'
WARNING: "cont_mgau.c", line 667: Weight normalization failed for 3 senones
INFO: cont_mgau.c(679): Read 114 x 1 mixture weights
INFO: cont_mgau.c(707): Removing uninitialized Gaussian densities
 0 1 2
WARNING: "cont_mgau.c", line 781: 3 densities removed (3 mixtures removed entirely)
INFO: cont_mgau.c(797): Applying variance floor
INFO: cont_mgau.c(815): 0 variance values floored
INFO: cont_mgau.c(863): Precomputing Mahalanobis distance invariants
INFO: tmat.c(119): Reading HMM transition probability matrices: /home/20102000030/sphinx/acoustic_model/acoustic_model/model_parameters/db.falign_ci_cont/transition_matrices
WARNING: "tmat.c", line 192: Normalization failed for tmat 0 from state 0
WARNING: "tmat.c", line 192: Normalization failed for tmat 0 from state 1
WARNING: "tmat.c", line 192: Normalization failed for tmat 0 from state 2
INFO:   Initialization of tmat_t, report:
INFO:   Read 38 transition matrices of size 3x4
INFO:   
INFO: dict.c(385): Reading main dictionary: /home/20102000030/sphinx/acoustic_model/acoustic_model/falignout/db.falign.dict
INFO: dict.c(388): 12874 words read
INFO: dict.c(393): Reading filler dictionary: /home/20102000030/sphinx/acoustic_model/acoustic_model/falignout/db.falign.fdict
INFO: dict.c(396): 3 words read
INFO: dict.c(429): Added 0 fillers from mdef file
INFO: s3_align.c(1357): logs3(beam)= -460586
ERROR: "main_align.c", line 851: Uttid mismatch: ctlfile = "codDef-art001a"; transcript = "CodDefConsumidor16k/codDef-art001a"
INFO: feat.c(1176): At directory /home/20102000030/sphinx/acoustic_model/acoustic_model/feat
INFO: feat.c(993): Reading mfc file: '/home/20102000030/sphinx/acoustic_model/acoustic_model/feat/CodDefConsumidor16k/codDef-art001a.mfc'[0..-1]
INFO: cmn.c(175): CMN: 11.56 -0.08  0.05  0.25 -0.17 -0.08 -0.11 -0.31 -0.18 -0.29 -0.12 -0.19 -0.18 
INFO: main_align.c(919): codDef-art001a: 2360 input frames
lt-sphinx3_align: s3_align.c:927: align_build_sent_hmm: Assertion `stail.predlist' failed.

Thanks
