
Building the acoustic model process

clbin, 2011-06-01 to 2012-09-22
  • clbin

    clbin - 2011-06-01

    First, prepare the data:
    1. etc/6965.dic:

    0   Z IY R OW
    1   W AH N
    2   T UW
    3   TH R IY
    4   F OW R
    5   F AY V
    6   S IH K S
    7   S EH V AH N
    8   EY T
    9   N AY N
    

    2. etc/6965.filler:

    <s>             SIL
    </s>            SIL
    <sil>           SIL
    
    3. etc/6965.lm
    4. etc/6965.lm.dmp
    5. etc/6965.phone:

      AH
      AY
      EH
      EY
      F
      IH
      IY
      K
      N
      OW
      R
      S
      SIL
      T
      TH
      UW
      V
      W
      Z

    6. etc/6965_train.fileids:

      gen_fest_0001
      gen_fest_0002
      gen_fest_0003
      gen_fest_0004
      gen_fest_0005
      gen_fest_0006
      gen_fest_0007
      gen_fest_0008
      gen_fest_0009
      gen_fest_0010

    7. etc/6965_train.transcription:

    <s> 0 </s> (gen_fest_0001)
    <s> 1 </s> (gen_fest_0002)
    <s> 2 </s> (gen_fest_0003)
    <s> 3 </s> (gen_fest_0004)
    <s> 4 </s> (gen_fest_0005)
    <s> 5 </s> (gen_fest_0006)
    <s> 6 </s> (gen_fest_0007)
    <s> 7 </s> (gen_fest_0008)
    <s> 8 </s> (gen_fest_0009)
    <s> 9 </s> (gen_fest_0010)
    

    8. wav/gen_fest_0001.wav ... wav/gen_fest_0010.wav
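    These files must be mutually consistent: every phone the dictionary uses has to appear in the phone list, and the (uttid) at the end of each transcription line has to match the corresponding line of the fileids file. A small sanity-check sketch (file contents are inlined from the listings above; for real use, read them from etc/):

```python
import re

def check_phone_coverage(dic_text, phone_text):
    """Phones used in the dictionary but missing from the phone list
    (should be empty; SIL may appear only via the filler dictionary)."""
    used = set()
    for line in dic_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:          # word followed by its phones
            used.update(parts[1:])
    return used - set(phone_text.split())

def check_transcripts(fileids_text, transcription_text):
    """1-based line numbers where the trailing (uttid) does not match
    the same line of the fileids file."""
    ids = fileids_text.split()
    bad = []
    for i, line in enumerate(transcription_text.strip().splitlines()):
        m = re.search(r"\(([^)]+)\)\s*$", line)
        if m is None or i >= len(ids) or m.group(1) != ids[i]:
            bad.append(i + 1)
    return bad

dic = "0   Z IY R OW\n1   W AH N\n"
phones = "AH IY N OW R SIL W Z"
fileids = "gen_fest_0001\ngen_fest_0002\n"
trans = "<s> 0 </s> (gen_fest_0001)\n<s> 1 </s> (gen_fest_0002)\n"
print(check_phone_coverage(dic, phones))   # -> set()
print(check_transcripts(fileids, trans))   # -> []
```

    An empty set and an empty list mean the four files agree; this is essentially what the 00.verify stage checks later.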

    Commands run:
    1. perl ../pocketsphinx/scripts/setup_sphinx.pl -task 6965
       perl ../sphinxtrain/scripts_pl/setup_SphinxTrain.pl -task 6965

    After this, 6965 has the following folder structure:

      bin
      bwaccumdir 
      etc
      feat
      logdir
      model_parameters
      model_architecture   
      scripts_pl
      wav
    

    2. Copy this folder from SphinxTrain manually.
    3. Update etc/sphinx_train.cfg:

    # Configuration script for sphinx trainer                  -*-mode:Perl-*-
    
    $CFG_VERBOSE = 1;       # Determines how much goes to the screen.
    
    # These are filled in at configuration time
    $CFG_DB_NAME = "6965";
    $CFG_BASE_DIR = "/home/king/cmuclmtk/6965";
    $CFG_SPHINXTRAIN_DIR = "../sphinxtrain";
    
    # Directory containing SphinxTrain binaries
    $CFG_BIN_DIR = "$CFG_BASE_DIR/bin";
    $CFG_GIF_DIR = "$CFG_BASE_DIR/gifs";
    $CFG_SCRIPT_DIR = "$CFG_BASE_DIR/scripts_pl";
    
    # Experiment name, will be used to name model files and log files
    $CFG_EXPTNAME = "$CFG_DB_NAME";
    
    # Audio waveform and feature file information
    $CFG_WAVFILES_DIR = "$CFG_BASE_DIR/wav";
    $CFG_WAVFILE_EXTENSION = 'wav';
    $CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw
    $CFG_FEATFILES_DIR = "$CFG_BASE_DIR/feat";
    $CFG_FEATFILE_EXTENSION = 'mfc';
    $CFG_VECTOR_LENGTH = 13;
    
    $CFG_MIN_ITERATIONS = 1;  # BW Iterate at least this many times
    $CFG_MAX_ITERATIONS = 10; # BW Don't iterate more than this, somethings likely wrong.
    
    # (none/max) Type of AGC to apply to input files
    $CFG_AGC = 'none';
    # (current/none) Type of cepstral mean subtraction/normalization
    # to apply to input files
    $CFG_CMN = 'current';
    # (yes/no) Normalize variance of input files to 1.0
    $CFG_VARNORM = 'no';
    # (yes/no) Use letter-to-sound rules to guess pronunciations of
    # unknown words (English, 40-phone specific)
    $CFG_LTSOOV = 'no';
    # (yes/no) Train full covariance matrices
    $CFG_FULLVAR = 'no';
    # (yes/no) Use diagonals only of full covariance matrices for
    # Forward-Backward evaluation (recommended if CFG_FULLVAR is yes)
    $CFG_DIAGFULL = 'no';
    
    # (yes/no) Perform vocal tract length normalization in training.  This
    # will result in a "normalized" model which requires VTLN to be done
    # during decoding as well.
    $CFG_VTLN = 'no';
    # Starting warp factor for VTLN
    $CFG_VTLN_START = 0.80;
    # Ending warp factor for VTLN
    $CFG_VTLN_END = 1.40;
    # Step size of warping factors
    $CFG_VTLN_STEP = 0.05;
    
    # Directory to write queue manager logs to
    $CFG_QMGR_DIR = "$CFG_BASE_DIR/qmanager";
    # Directory to write training logs to
    $CFG_LOG_DIR = "$CFG_BASE_DIR/logdir";
    # Directory for re-estimation counts
    $CFG_BWACCUM_DIR = "$CFG_BASE_DIR/bwaccumdir";
    # Directory to write model parameter files to
    $CFG_MODEL_DIR = "$CFG_BASE_DIR/model_parameters";
    
    # Directory containing transcripts and control files for
    # speaker-adaptive training
    $CFG_LIST_DIR = "$CFG_BASE_DIR/etc";
    
    # Decoding variables for MMIE training
    $CFG_LANGUAGEWEIGHT = "11.5";
    $CFG_BEAMWIDTH      = "1e-100";
    $CFG_WORDBEAM       = "1e-80";
    $CFG_LANGUAGEMODEL  = "$CFG_LIST_DIR/$CFG_DB_NAME.lm.DMP";
    $CFG_WORDPENALTY    = "0.2";
    
    # Lattice pruning variables
    $CFG_ABEAM              = "1e-50";
    $CFG_NBEAM              = "1e-10";
    $CFG_PRUNED_DENLAT_DIR  = "$CFG_BASE_DIR/pruned_denlat";
    
    # MMIE training related variables
    $CFG_MMIE = "no";
    $CFG_MMIE_MAX_ITERATIONS = 5;
    $CFG_LATTICE_DIR = "$CFG_BASE_DIR/lattice";
    $CFG_MMIE_TYPE   = "rand"; # Valid values are "rand", "best" or "ci"
    $CFG_MMIE_CONSTE = "3.0";
    $CFG_NUMLAT_DIR  = "$CFG_BASE_DIR/numlat";
    $CFG_DENLAT_DIR  = "$CFG_BASE_DIR/denlat";
    
    # Variables used in main training of models
    $CFG_DICTIONARY     = "$CFG_LIST_DIR/$CFG_DB_NAME.dic";
    $CFG_RAWPHONEFILE   = "$CFG_LIST_DIR/$CFG_DB_NAME.phone";
    $CFG_FILLERDICT     = "$CFG_LIST_DIR/$CFG_DB_NAME.filler";
    $CFG_LISTOFFILES    = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.fileids";
    $CFG_TRANSCRIPTFILE = "$CFG_LIST_DIR/${CFG_DB_NAME}_train.transcription";
    $CFG_FEATPARAMS     = "$CFG_LIST_DIR/feat.params";
    
    # Variables used in characterizing models
    
    $CFG_HMM_TYPE = '.cont.'; # Sphinx III
    #$CFG_HMM_TYPE  = '.semi.'; # PocketSphinx and Sphinx II
    #$CFG_HMM_TYPE  = '.ptm.'; # PocketSphinx (larger data sets)
    
    if (($CFG_HMM_TYPE ne ".semi.")
        and ($CFG_HMM_TYPE ne ".ptm.")
        and ($CFG_HMM_TYPE ne ".cont.")) {
      die "Please choose one CFG_HMM_TYPE out of '.cont.', '.ptm.', or '.semi.', " .
        "currently $CFG_HMM_TYPE\n";
    }
    
    # This configuration is fastest and best for most acoustic models in
    # PocketSphinx and Sphinx-III.  See below for Sphinx-II.
    $CFG_STATESPERHMM = 3;
    $CFG_SKIPSTATE = 'no';
    
    if ($CFG_HMM_TYPE eq '.semi.') {
      $CFG_DIRLABEL = 'semi';
    # Four stream features for PocketSphinx
      $CFG_FEATURE = "s2_4x";
      $CFG_NUM_STREAMS = 4;
      $CFG_INITIAL_NUM_DENSITIES = 256;
      $CFG_FINAL_NUM_DENSITIES = 256;
      die "For semi continuous models, the initial and final models have the same density" 
        if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
    } elsif ($CFG_HMM_TYPE eq '.ptm.') {
      $CFG_DIRLABEL = 'ptm';
    # Four stream features for PocketSphinx
      $CFG_FEATURE = "s2_4x";
      $CFG_NUM_STREAMS = 4;
      $CFG_INITIAL_NUM_DENSITIES = 64;
      $CFG_FINAL_NUM_DENSITIES = 64;
      die "For phonetically tied models, the initial and final models have the same density" 
        if ($CFG_INITIAL_NUM_DENSITIES != $CFG_FINAL_NUM_DENSITIES);
    } elsif ($CFG_HMM_TYPE eq '.cont.') {
      $CFG_DIRLABEL = 'cont';
    # Single stream features - Sphinx 3
      $CFG_FEATURE = "1s_c_d_dd";
      $CFG_NUM_STREAMS = 1;
      $CFG_INITIAL_NUM_DENSITIES = 1;
      $CFG_FINAL_NUM_DENSITIES = 2;
      die "The initial has to be less than the final number of densities" 
        if ($CFG_INITIAL_NUM_DENSITIES > $CFG_FINAL_NUM_DENSITIES);
    }
    
    # Number of top gaussians to score a frame. A little bit less accurate computations
    # make training significantly faster. Uncomment to apply this during the training
    # For good accuracy make sure you are using the same setting in decoder
    # In theory this can be different for various training stages. For example 4 for
    # CI stage and 16 for CD stage
    # $CFG_CI_NTOP = 4;
    # $CFG_CD_NTOP = 16;
    
    # (yes/no) Train multiple-gaussian context-independent models (useful
    # for alignment, use 'no' otherwise) in the models created
    # specifically for forced alignment
    $CFG_FALIGN_CI_MGAU = 'no';
    # (yes/no) Train multiple-gaussian context-independent models (useful
    # for alignment, use 'no' otherwise)
    $CFG_CI_MGAU = 'no';
    # Number of tied states (senones) to create in decision-tree clustering
    $CFG_N_TIED_STATES = 50;
    # How many parts to run Forward-Backward estimatinon in
    $CFG_NPART = 1;
    
    # (yes/no) Train a single decision tree for all phones (actually one
    # per state) (useful for grapheme-based models, use 'no' otherwise)
    $CFG_CROSS_PHONE_TREES = 'no';
    
    # Use force-aligned transcripts (if available) as input to training
    $CFG_FORCEDALIGN = 'no';
    
    # Use a specific set of models for force alignment.  If not defined,
    # context-independent models for the current experiment will be used.
    $CFG_FORCE_ALIGN_MDEF = "$CFG_BASE_DIR/model_architecture/$CFG_EXPTNAME.falign_ci.mdef";
    $CFG_FORCE_ALIGN_MODELDIR = "$CFG_MODEL_DIR/$CFG_EXPTNAME.falign_ci_$CFG_DIRLABEL";
    
    # Use a specific dictionary and filler dictionary for force alignment.
    # If these are not defined, a dictionary and filler dictionary will be
    # created from $CFG_DICTIONARY and $CFG_FILLERDICT, with noise words
    # removed from the filler dictionary and added to the dictionary (this
    # is because the force alignment is not very good at inserting them)
    
    # $CFG_FORCE_ALIGN_DICTIONARY = "$ST::CFG_BASE_DIR/falignout$ST::CFG_EXPTNAME.falign.dict";;
    # $CFG_FORCE_ALIGN_FILLERDICT = "$ST::CFG_BASE_DIR/falignout/$ST::CFG_EXPTNAME.falign.fdict";;
    
    # Use a particular beam width for force alignment.  The wider
    # (i.e. smaller numerically) the beam, the fewer sentences will be
    # rejected for bad alignment.
    $CFG_FORCE_ALIGN_BEAM = 1e-60;
    
    # Calculate an LDA/MLLT transform?
    $CFG_LDA_MLLT = 'no';
    # Dimensionality of LDA/MLLT output
    $CFG_LDA_DIMENSION = 29;
    
    # This is actually just a difference in log space (it doesn't make
    # sense otherwise, because different feature parameters have very
    # different likelihoods)
    $CFG_CONVERGENCE_RATIO = 0.1;
    
    # Queue::POSIX for multiple CPUs on a local machine
    # Queue::PBS to use a PBS/TORQUE queue
    $CFG_QUEUE_TYPE = "Queue";
    
    # Name of queue to use for PBS/TORQUE
    $CFG_QUEUE_NAME = "workq";
    
    # (yes/no) Build questions for decision tree clustering automatically
    $CFG_MAKE_QUESTS = "yes";
    # If CFG_MAKE_QUESTS is yes, questions are written to this file.
    # If CFG_MAKE_QUESTS is no, questions are read from this file.
    $CFG_QUESTION_SET = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.tree_questions";
    #$CFG_QUESTION_SET = "${CFG_BASE_DIR}/linguistic_questions";
    
    $CFG_CP_OPERATION = "${CFG_BASE_DIR}/model_architecture/${CFG_EXPTNAME}.cpmeanvar";
    
    # This variable has to be defined, otherwise utils.pl will not load.
    $CFG_DONE = 1;
    
    return 1;
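    Most of the early failures in threads like this come down to a file named in sphinx_train.cfg not existing where the config says it is. A small sketch (not part of SphinxTrain) that mirrors $CFG_DICTIONARY, $CFG_RAWPHONEFILE, $CFG_FILLERDICT, $CFG_LISTOFFILES, and $CFG_TRANSCRIPTFILE from the config above and reports any missing etc/ files before training starts:

```python
import os

def missing_training_files(base_dir, db_name):
    """Return the etc/ files required by sphinx_train.cfg that are absent."""
    etc = os.path.join(base_dir, "etc")
    required = [
        db_name + ".dic",
        db_name + ".phone",
        db_name + ".filler",
        db_name + "_train.fileids",
        db_name + "_train.transcription",
    ]
    return [f for f in required if not os.path.isfile(os.path.join(etc, f))]

# e.g. missing_training_files("/home/king/cmuclmtk/6965", "6965")
# should return [] before you run the training scripts.
```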
    
    4. Run: perl scripts_pl/make_feats.pl -ctl etc/6965_train.fileids
       This populates the feat directory:

      feat/gen_fest_(0001...0010).mfc
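    Each generated .mfc file can be checked for internal consistency: a Sphinx feature file starts with a 4-byte integer header holding the number of float32 values that follow, written in the byte order of the machine that produced it. A sketch (assuming the default 13-dimensional cepstra configured by $CFG_VECTOR_LENGTH) that derives the frame count and raises if the header disagrees with the file size:

```python
import struct

def mfc_frame_count(data, ceplen=13):
    """Number of frames in a Sphinx .mfc feature file given its raw bytes.
    The header's byte order is machine-dependent, so try both."""
    for fmt in ("<i", ">i"):
        (n,) = struct.unpack(fmt, data[:4])
        if n == (len(data) - 4) // 4:      # header agrees with file size
            return n // ceplen
    raise ValueError("header does not match file size in either byte order")

# Example: a fabricated 2-frame file (2 * 13 float32 values)
fake = struct.pack("<i", 26) + b"\x00" * (26 * 4)
print(mfc_frame_count(fake))  # -> 2
```

    A file whose header does not match its size usually means feature extraction was interrupted or the wav input was malformed.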

    5. Run the full training:

    sudo  perl scripts_pl/RunAll.pl
    
    This runs the following stage scripts in order:

      sudo perl scripts_pl/00.verify/verify_all.pl
      sudo perl scripts_pl/10.vector_quantize/slave.VQ.pl
      sudo perl scripts_pl/20.ci_hmm/slave_convg.pl
      sudo perl scripts_pl/30.cd_hmm_untied/slave_convg.pl
      sudo perl scripts_pl/40.buildtrees/slave.treebuilder.pl
      sudo perl scripts_pl/45.prunetree/slave-state-tying.pl
      sudo perl scripts_pl/50.cd_hmm_tied/slave_convg.pl
      sudo perl scripts_pl/90.deleted_interpolation/deleted_interpolation.pl
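    RunAll.pl is essentially a driver that executes the stage scripts above in sequence and stops at the first failure; running them one by one does the same thing. A sketch of that control flow (illustrative only; it assumes it is launched from the task directory, e.g. /home/king/cmuclmtk/6965):

```python
import subprocess
import sys

# Stage scripts in the order RunAll.pl executes them (from the listing above).
STAGES = [
    "scripts_pl/00.verify/verify_all.pl",
    "scripts_pl/10.vector_quantize/slave.VQ.pl",
    "scripts_pl/20.ci_hmm/slave_convg.pl",
    "scripts_pl/30.cd_hmm_untied/slave_convg.pl",
    "scripts_pl/40.buildtrees/slave.treebuilder.pl",
    "scripts_pl/45.prunetree/slave-state-tying.pl",
    "scripts_pl/50.cd_hmm_tied/slave_convg.pl",
    "scripts_pl/90.deleted_interpolation/deleted_interpolation.pl",
]

def run_stages(stages=STAGES):
    """Run each stage with perl; abort on the first non-zero exit code."""
    for stage in stages:
        print("MODULE:", stage)
        if subprocess.call(["perl", stage]) != 0:
            sys.exit("stage failed: " + stage)
```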

    Now the 6965 directory structure is as follows:

    bin
    bwaccumdir
    denlat
    etc
    feat
    lattice
    logdir
    model_architecture
    model_parameters
    numlat
    pruned_denlat
    python
    qmanager
    result
    scripts_pl
    trees
    wav
    6965.html
    

    model_parameters now contains:

    6965.cd_cont_50
    6965.cd_cont_50_1
    6965.cd_cont_50_2
    6965.cd_cont_initial
    6965.cd_cont_untied
    6965.ci_cont
    6965.ci_cont_flatinitial
    6965.ci_lda
    6965.ci_lda_flatinitial
    6965.ci_semi_flatinitial
    

    OK, I applied the model in the Android program:

            c.setString("-hmm",
                    "/sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50");//hub4opensrc.cd_continuous_8gau
            c.setString("-dict",
                    "/sdcard/Android/data/edu.cmu.pocketsphinx/6965/etc/6965.dic");//cmu07a tidigits.dic
            c.setString("-lm",
                    "/sdcard/Android/data/edu.cmu.pocketsphinx/6965/etc/6965.lm.dmp");
    

    Running the program, I got:

    06-01 09:43:05.594: INFO/ActivityManager(1097): Process edu.cmu.pocketsphinx.demo (pid 2753) has died.
    

    pocketsphinx.log:

    INFO: cmd_ln.c(512): Parsing command line:
    
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ascale     20.0        2.000000e+01
    -aw     1       1
    -backtrace  no      no
    -beam       1e-48       1.000000e-48
    -bestpath   yes     yes
    -bestpathlw 9.5     9.500000e+00
    -bghist     no      no
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     8.0
    -compallsen no      no
    -debug              0
    -dict               
    -dictcase   no      no
    -dither     no      no
    -doublebw   no      no
    -ds     1       1
    -fdict              
    -feat       1s_c_d_dd   1s_c_d_dd
    -featparams         
    -fillprob   1e-8        1.000000e-08
    -frate      100     100
    -fsg                
    -fsgusealtpron  yes     yes
    -fsgusefiller   yes     yes
    -fwdflat    yes     yes
    -fwdflatbeam    1e-64       1.000000e-64
    -fwdflatefwid   4       4
    -fwdflatlw  8.5     8.500000e+00
    -fwdflatsfwin   25      25
    -fwdflatwbeam   7e-29       7.000000e-29
    -fwdtree    yes     yes
    -hmm                
    -input_endian   little      little
    -jsgf               
    -kdmaxbbi   -1      -1
    -kdmaxdepth 0       0
    -kdtree             
    -latsize    5000        5000
    -lda                
    -ldadim     0       0
    -lextreedump    0       0
    -lifter     0       0
    -lm             
    -lmctl              
    -lmname     default     default
    -logbase    1.0001      1.000100e+00
    -logfn              
    -logspec    no      no
    -lowerf     133.33334   1.333333e+02
    -lpbeam     1e-40       1.000000e-40
    -lponlybeam 7e-29       7.000000e-29
    -lw     6.5     6.500000e+00
    -maxhmmpf   -1      -1
    -maxnewoov  20      20
    -maxwpf     -1      -1
    -mdef               
    -mean               
    -mfclogdir          
    -min_endfr  0       0
    -mixw               
    -mixwfloor  0.0000001   1.000000e-07
    -mllr               
    -mmap       yes     yes
    -ncep       13      13
    -nfft       512     512
    -nfilt      40      40
    -nwpen      1.0     1.000000e+00
    -pbeam      1e-48       1.000000e-48
    -pip        1.0     1.000000e+00
    -pl_beam    1e-10       1.000000e-10
    -pl_pbeam   1e-5        1.000000e-05
    -pl_window  0       0
    -rawlogdir          
    -remove_dc  no      no
    -round_filters  yes     yes
    -samprate   16000       1.600000e+04
    -seed       -1      -1
    -sendump            
    -senlogdir          
    -senmgau            
    -silprob    0.005       5.000000e-03
    -smoothspec no      no
    -svspec             
    -tmat               
    -tmatfloor  0.0001      1.000000e-04
    -topn       4       4
    -topn_beam  0       0
    -toprule            
    -transform  legacy      legacy
    -unit_area  yes     yes
    -upperf     6855.4976   6.855498e+03
    -usewdphones    no      no
    -uw     1.0     1.000000e+00
    -var                
    -varfloor   0.0001      1.000000e-04
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wbeam      7e-29       7.000000e-29
    -wip        0.65        6.500000e-01
    -wlen       0.025625    2.562500e-02
    
    INFO: cmd_ln.c(512): Parsing command line:
    \
        -alpha 0.97 \
        -doublebw no \
        -nfilt 40 \
        -ncep 13 \
        -lowerf 133.33334 \
        -upperf 6855.4976 \
        -nfft 512 \
        -wlen 0.0256 \
        -transform legacy \
        -feat 1s_c_d_dd \
        -agc none \
        -cmn current \
        -varnorm no
    
    Current configuration:
    [NAME]      [DEFLT]     [VALUE]
    -agc        none        none
    -agcthresh  2.0     2.000000e+00
    -alpha      0.97        9.700000e-01
    -ceplen     13      13
    -cmn        current     current
    -cmninit    8.0     8.0
    -dither     no      no
    -doublebw   no      no
    -feat       1s_c_d_dd   1s_c_d_dd
    -frate      100     100
    -input_endian   little      little
    -lda                
    -ldadim     0       0
    -lifter     0       0
    -logspec    no      no
    -lowerf     133.33334   1.333333e+02
    -ncep       13      13
    -nfft       512     512
    -nfilt      40      40
    -remove_dc  no      no
    -round_filters  yes     yes
    -samprate   16000       1.600000e+04
    -seed       -1      -1
    -smoothspec no      no
    -svspec             
    -transform  legacy      legacy
    -unit_area  yes     yes
    -upperf     6855.4976   6.855498e+03
    -varnorm    no      no
    -verbose    no      no
    -warp_params            
    -warp_type  inverse_linear  inverse_linear
    -wlen       0.025625    2.560000e-02
    
    INFO: acmod.c(238): Parsed model-specific feature parameters from /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/feat.params
    INFO: feat.c(860): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: mdef.c(520): Reading model definition: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/mdef
    INFO: bin_mdef.c(173): Allocating 356 * 8 bytes (2 KiB) for CD tree
    INFO: tmat.c(205): Reading HMM transition probability matrices: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/transition_matrices
    INFO: acmod.c(117): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/means
    INFO: ms_gauden.c(292): 111 codebook, 1 feature, size: INFO: ms_gauden.c(294):  2x39INFO: ms_gauden.c(295): 
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/variances
    INFO: ms_gauden.c(292): 111 codebook, 1 feature, size: INFO: ms_gauden.c(294):  2x39INFO: ms_gauden.c(295): 
    INFO: ms_gauden.c(356): 2485 variance values floored
    INFO: acmod.c(119): Attempting to use PTHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/means
    INFO: ms_gauden.c(292): 111 codebook, 1 feature, size: INFO: ms_gauden.c(294):  2x39INFO: ms_gauden.c(295): 
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/variances
    INFO: ms_gauden.c(292): 111 codebook, 1 feature, size: INFO: ms_gauden.c(294):  2x39INFO: ms_gauden.c(295): 
    INFO: ms_gauden.c(356): 2485 variance values floored
    INFO: ptm_mgau.c(670): Reading mixture weights file '/sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/mixture_weights'
    INFO: ptm_mgau.c(764): Read 111 x 1 x 2 mixture weights
    INFO: ptm_mgau.c(830): Maximum top-N: 4
    INFO: phone_loop_search.c(105): State beam -230231 Phone exit beam -115115 Insertion penalty 0
    INFO: dict.c(306): Allocating 4109 * 20 bytes (80 KiB) for word entries
    INFO: dict.c(321): Reading main dictionary: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/etc/6965.dic
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(324): 10 words read
    INFO: dict.c(330): Reading filler dictionary: /sdcard/Android/data/edu.cmu.pocketsphinx/6965/model_parameters/6965.cd_cont_50/noisedict
    INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(333): 3 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(404): Allocating 19^3 * 2 bytes (13 KiB) for word-initial triphones
    INFO: dict2pid.c(131): Allocated 4408 bytes (4 KiB) for word-final triphones
    INFO: dict2pid.c(195): Allocated 4408 bytes (4 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(196): ngrams 1=12, 2=20, 3=10
    INFO: ngram_model_dmp.c(242):       12 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(290):       20 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(315):       10 = LM.trigrams read
    INFO: ngram_model_dmp.c(339):        3 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(358):        3 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(378):        2 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(406):        1 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(462):       12 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 10 unique initial diphones
    INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 4 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 4 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 140
    INFO: ngram_search_fwdtree.c(338): after: 10 root, 12 non-root channels, 3 single-phone words
    INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
    

    Sorry for the relatively long-winded post, but I really need help. Thanks in
    advance, everyone. Where is my mistake?

     
  • Nickolay V. Shmyrev

    I already told you what to do in the previous post: try to decode with your
    model on Linux, where you trained it, and see if it works. Then decode on
    the phone.

    The tutorial also explains this step, which you skipped for some reason:

    http://cmusphinx.sourceforge.net/wiki/tutorialam#decoding

     
  • clbin

    clbin - 2011-06-01

    Hello nshmyrev,

    Sorry, that was careless of me; at first I did not understand "try to decode
    with your model in Linux where you trained your model and see if it works.
    Then decode on phone."
    I then ran:

    ./scripts_pl/decode/slave.pl
    

    but I got:

    MODULE: DECODE Decoding using models previously trained
            Decoding 10 segments starting at 0 (part 1 of 1) 
    Could not find executable for /home/king/cmuclmtk/6965/bin/pocketsphinx_batch at /home/king/cmuclmtk/6965/scripts_pl/decode/../lib/SphinxTrain/Util.pm line 299.
            Aligning results to find error rate
    Can't open /home/king/cmuclmtk/6965/result/6965-1-1.match
    word_align.pl failed with error code 65280 at ./scripts_pl/decode/slave.pl line 173.
    

    I thought this step was not necessary, so I skipped it; that was foolish of me.
    Please help me. I will keep trying on my own as well.
    Thanks

     
  • clbin

    clbin - 2011-06-02

    Hi,
    I copied pocketsphinx_batch from pocketsphinx into the bin folder. Then I ran:

    ./scripts_pl/decode/slave.pl
    

    I got:

    king@ubuntu:~/cmuclmtk/6965$ sudo perl ./scripts_pl/decode/slave.pl
    MODULE: DECODE Decoding using models previously trained
            Decoding 10 segments starting at 0 (part 1 of 1) 
            0% 
    WARNING: This step had 0 ERROR messages and 1 WARNING messages.  Please check the log file for details.
            Aligning results to find error rate
            SENTENCE ERROR: 30.0% (3/10)   WORD ERROR RATE: 30.0% (3/10)
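    The WORD ERROR RATE reported above is the standard edit-distance measure: (substitutions + deletions + insertions) divided by the number of reference words. A minimal sketch of that computation (the hypothesis values below are invented for illustration; they are not the actual decoder output):

```python
def wer(ref, hyp):
    """Word error rate: Levenshtein distance between the word sequences
    divided by the number of reference words."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(r)][len(h)] / len(r)

# 3 errors over 10 one-word utterances -> 30% WER, as in the output above
refs = "0 1 2 3 4 5 6 7 8 9"
hyps = "0 1 2 8 4 5 6 1 8 0"   # hypothetical misrecognitions
print(f"{wer(refs, hyps):.1%}")  # -> 30.0%
```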
    

    I think the decoding succeeded, so I tried recognition on Android again, but
    the program still died. Did I get something else wrong?

    Thanks

    6965_test.transcription:

    0 (gen_fest_0001)
    1 (gen_fest_0002)
    2 (gen_fest_0003)
    3 (gen_fest_0004)
    4 (gen_fest_0005)
    5 (gen_fest_0006)
    6 (gen_fest_0007)
    7 (gen_fest_0008)
    8 (gen_fest_0009)
    9 (gen_fest_0010)
    

    6965_test.fileids:

    gen_fest_0001
    gen_fest_0002
    gen_fest_0003
    gen_fest_0004
    gen_fest_0005
    gen_fest_0006
    gen_fest_0007
    gen_fest_0008
    gen_fest_0009
    gen_fest_0010
    

    Thanks in advance!

     
  • clbin

    clbin - 2011-06-02

    Hello nshmyrev,
    I have carefully checked my folders but found nothing unusual.
    How can I send you my acoustic model folder?
    Thanks

     
  • clbin

    clbin - 2011-06-02

    Hello,
    I've uploaded my model to a file-sharing service. I hope you can take a look. Thank you

    [url]http://sharesend.com/8079m[/url]
    
     
  • Nickolay V. Shmyrev

    You do not have enough data to train the acoustic model. See the tutorial,
    which describes how much data is enough:

    http://cmusphinx.sourceforge.net/wiki/tutorialam

     
  • clbin

    clbin - 2011-06-09

    Hello nshmyrev,
    Thank you very much. I prepared more audio files for training the model and
    deployed it to Android again. The app no longer dies, but it does not
    recognize anything.
    On Linux, with:

    -lm ../../../4751/etc/4751.lm -dict ../../../4751/etc/4751.dic -hmm ../../../4751/model_parameters/4751.cd_cont_1000
    

    it recognizes correctly.

     
  • Nickolay V. Shmyrev

    For Android you need to train your model using 8 kHz audio. The tutorial
    covers this process in detail.

     
  • clbin

    clbin - 2011-06-10

    Hello, I converted the wav files to 8 kHz, 16-bit, and modified feat.params:

    -alpha 0.97
    -doublebw no
    -nfilt 31
    -ncep 13
    -lowerf 133.33334
    -upperf 3500.00
    -nfft 512
    -wlen 0.0256
    -transform legacy
    -feat __CFG_FEATURE__
    -svspec __CFG_SVSPEC__
    -agc __CFG_AGC__
    -cmn __CFG_CMN__
    -varnorm __CFG_VARNORM__
    -samprate 8000.0
    -dither yes
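    Before retraining, it is worth verifying that every training wav really is 8 kHz, 16-bit, mono; a single mismatched file corrupts the extracted features. A quick check with Python's standard wave module (the expected values match the feat.params above; the path argument is up to you):

```python
import wave

def check_wav(path, want_rate=8000, want_width=2, want_channels=1):
    """Return a list of format problems for a training wav (empty = OK)."""
    problems = []
    with wave.open(path, "rb") as w:
        if w.getframerate() != want_rate:
            problems.append(f"sample rate {w.getframerate()} != {want_rate}")
        if w.getsampwidth() != want_width:
            problems.append(f"sample width {w.getsampwidth() * 8} bits != 16")
        if w.getnchannels() != want_channels:
            problems.append(f"{w.getnchannels()} channels != mono")
    return problems
```

    Running it over every file listed in the fileids before training catches resampling mistakes early.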
    

    I regenerated the model, but the Android program still does not recognize
    anything. My recording settings:

                this.rec = new AudioRecord(MediaRecorder.AudioSource.DEFAULT, 8000,
                        AudioFormat.CHANNEL_IN_MONO,
                        AudioFormat.ENCODING_PCM_16BIT, 8192);
            c.setFloat("-samprate", 8000.0);
    

    thanks.

     
  • Nickolay V. Shmyrev

    Try to dump the recorded audio on Android before you feed it to the
    recognizer. Then try to recognize that audio on Linux using your model.
    Also share the audio so I can take a look.
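    One way to follow this advice, assuming the app dumps headerless 16-bit mono PCM at the 8000 Hz configured in the AudioRecord code above: pull the .raw file off the device and wrap it in a wav container so desktop tools (and pocketsphinx) can play and inspect it. A standard-library sketch:

```python
import wave

def raw_to_wav(raw_path, wav_path, rate=8000):
    """Wrap headerless 16-bit mono PCM in a wav container."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)     # mono, as recorded by AudioRecord
        w.setsampwidth(2)     # 16-bit samples
        w.setframerate(rate)  # must match the recording rate
        w.writeframes(pcm)
```

    Listening to the result immediately reveals problems like silence, clipping, or a wrong sample rate (speech sounding too fast or too slow).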

     
  • clbin

    clbin - 2011-06-13

    Hi Nickolay!
    I tried to execute this Linux command:

    pocketsphinx_continuous -lm ../../../4751/etc/4751.lm -dict ../../../4751/etc/4751.dic -hmm ../../../4751/model_parameters/4751.cd_cont_1000 -infile ../../../4751/raw/000000000.raw
    

    I got:

    ERROR: "cmd_ln.c", line 602: Unknown argument name '-infile'
    ERROR: "cmd_ln.c", line 713: Failed to parse arguments list
    

    Does the "-infile" parameter not exist?
    But I have seen people on the forum use -infile successfully.

    thanks

     
  • clbin

    clbin - 2011-06-13

    Hello, Nickolay!
    I have uploaded my model and the audio here; the audio files are in the raw
    folder.

    [url]http://sharesend.com/q4kva[/url]
    

    Thank you in advance,Nickolay!

     
  • Nickolay V. Shmyrev

    Does the "-infile" parameter not exist?

    It's present only in newer versions; maybe you downloaded an older one.

    I have uploaded my model and the audio here; the audio files are in the raw
    folder.

    And what should I do with it?

     
  • clbin

    clbin - 2011-06-13

    Hello

    Maybe you downloaded older one
    

    I am using pocketsphinx-0.7; isn't that the latest version?

    Try to share this audio so I can also take a look.
    

    How should I do that? I do not quite understand. Thank you for your patience.

     
  • Nickolay V. Shmyrev

    Hello

    The situation as I see it now is:

    1. You trained the model, but it doesn't recognize the raw files. That means
       you trained the model incorrectly.
    2. Your pocketsphinx has no -infile option. That means you are using an old
       pocketsphinx; it may be installed in parallel with the new one, and
       somehow the old one is being used.

    In this situation you should do the following:

    1. Find out whether there is an old pocketsphinx on your system that is being
       used instead of the new one.
    2. Train the model again from a clean folder with the data you have, to make
       sure you did everything correctly. Then upload the new folder again.
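    For the first point, one way is to list every copy of the binary on your PATH (the shell command `which -a pocketsphinx_continuous` does the same). A sketch:

```python
import os

def find_on_path(name, path_env=None):
    """List every directory on PATH containing an executable `name`.
    More than one hit suggests parallel installs, e.g. an old
    pocketsphinx_continuous shadowing a new one."""
    hits = []
    for d in (path_env or os.environ.get("PATH", "")).split(os.pathsep):
        full = os.path.join(d, name)
        if os.path.isfile(full) and os.access(full, os.X_OK):
            hits.append(full)
    return hits
```

    If find_on_path("pocketsphinx_continuous") returns more than one path, the first entry is the one the shell runs; remove or reorder as needed.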
     
  • Nickolay V. Shmyrev

    Hm, I also see that you are using TTS to generate the training data while
    trying to recognize real speech. I don't think that will work: the model
    will be overtrained to recognize TTS speech rather than your own.

     
  • clbin

    clbin - 2011-06-13

    I may not be translating your words correctly, but let me make sure I
    understand: a model trained on TTS audio will only recognize TTS audio, so
    if I want to recognize my own speech, I have to record my own audio and use
    that as the training data?
    Thanks, Nickolay!

     
  • Nickolay V. Shmyrev

    if I want to recognize my own speech, I have to record my own audio and use
    that as the training data?

    Yes

     
  • clbin

    clbin - 2011-06-13

    OK, I'll go try it, and hopefully I'll have good news for you soon. I
    believe you want me to succeed so you can be rid of me soon, ha ha. Just a
    joke.

    Thanks

     
  • Nickolay V. Shmyrev

    My experience tells me you'll have more questions

     
  • clbin

    clbin - 2011-06-13

    Hi Nickolay, I have good news: I succeeded! This is very cool, and I am very
    excited, even though it is already late at night.
    Thank you very much for helping me this month. I hope we can become friends
    across borders, whatever views of China exist internationally. I am very
    grateful. Maybe this is not the place to say it, but really, thank you. I
    will probably run into more problems and talk to you again.

     
