How to create Phone 'acht'?

2016-03-19
2016-04-28
  • Jens Kallup

    Jens Kallup - 2016-03-19

    Hello Community,

    I need help. Please see the error output:

    ERROR: "dict.c", line 195: Line 2: Phone 'acht' is mising in the acoustic model; word 'ACHT' ignored
    ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary

    Thank you
    Jens Kallup

    root@debian:/opt/SpinxBase/bin/jkallup/model-de# pocketsphinx_continuous -hmm . -lm vocdeu.lm -dict vocdeu.dict -infile jkallup.acht.wav
    INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from ./feat.params
    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -allphone
    -allphone_ci no no
    -alpha 0.97 9.700000e-01
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -debug 0
    -dict vocdeu.dict
    -dictcase no no
    -dither no yes
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm .
    -input_endian little little
    -jsgf
    -keyphrase
    -kws
    -kws_delay 10 10
    -kws_plp 1e-1 1.000000e-01
    -kws_threshold 1 1.000000e+00
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lifter 0 0
    -lm vocdeu.lm
    -lmctl
    -lmname
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf 30000 30000
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-10 1.000000e-10
    -pl_pip 1.0 1.000000e+00
    -pl_weight 3.0 3.000000e+00
    -pl_window 5 5
    -rawlogdir
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -uw 1.0 1.000000e+00
    -vad_postspeech 50 50
    -vad_prespeech 20 20
    -vad_startspeech 10 10
    -vad_threshold 2.0 2.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.560000e-02

    INFO: fe_interface.c(325): Using -1 as the seed.
    INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: mdef.c(518): Reading model definition: ./mdef
    INFO: bin_mdef.c(181): Allocating 19470 * 8 bytes (152 KiB) for CD tree
    INFO: tmat.c(206): Reading HMM transition probability matrices: ./transition_matrices
    INFO: acmod.c(117): Attempting to use PTM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ./means
    INFO: ms_gauden.c(292): 3123 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 8x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ./variances
    INFO: ms_gauden.c(292): 3123 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 8x39
    INFO: ms_gauden.c(354): 1768 variance values floored
    INFO: ptm_mgau.c(801): Number of codebooks exceeds 256: 3123
    INFO: acmod.c(119): Attempting to use semi-continuous computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ./means
    INFO: ms_gauden.c(292): 3123 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 8x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ./variances
    INFO: ms_gauden.c(292): 3123 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 8x39
    INFO: ms_gauden.c(354): 1768 variance values floored
    INFO: acmod.c(121): Falling back to general multi-stream GMM computation
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ./means
    INFO: ms_gauden.c(292): 3123 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 8x39
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ./variances
    INFO: ms_gauden.c(292): 3123 codebook, 1 feature, size:
    INFO: ms_gauden.c(294): 8x39
    INFO: ms_gauden.c(354): 1768 variance values floored
    INFO: ms_senone.c(149): Reading senone mixture weights: ./mixture_weights
    INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
    INFO: ms_senone.c(207): Not transposing mixture weights in memory
    WARN: "ms_senone.c", line 254: Weight normalization failed for 1 mixture weights components
    INFO: ms_senone.c(268): Read mixture weights for 3123 senones: 1 features x 8 codewords
    INFO: ms_senone.c(320): Mapping senones to individual codebooks
    INFO: ms_mgau.c(141): The value of topn: 4
    INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
    INFO: dict.c(320): Allocating 4101 * 32 bytes (128 KiB) for word entries
    INFO: dict.c(333): Reading main dictionary: vocdeu.dict
    ERROR: "dict.c", line 195: Line 2: Phone 'acht' is mising in the acoustic model; word 'ACHT' ignored
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(336): 1 words read
    INFO: dict.c(358): Reading filler dictionary: ./noisedict
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(361): 3 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(406): Allocating 41^3 * 2 bytes (134 KiB) for word-initial triphones
    INFO: dict2pid.c(132): Allocated 40672 bytes (39 KiB) for word-final triphones
    INFO: dict2pid.c(196): Allocated 40672 bytes (39 KiB) for single-phone word triphones
    INFO: ngram_model_trie.c(347): Trying to read LM in trie binary format
    INFO: ngram_model_trie.c(358): Header doesn't match
    INFO: ngram_model_trie.c(176): Trying to read LM in arpa format
    INFO: ngram_model_trie.c(192): LM of order 3
    INFO: ngram_model_trie.c(194): #1-grams: 5
    INFO: ngram_model_trie.c(194): #2-grams: 3
    INFO: ngram_model_trie.c(194): #3-grams: 3
    INFO: lm_trie.c(473): Training quantizer
    INFO: lm_trie.c(481): Building LM trie
    INFO: ngram_search_fwdtree.c(99): 0 unique initial diphones
    INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 5 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 5 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 128
    ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary
    INFO: ngram_search_fwdtree.c(339): after: 0 root, 0 non-root channels, 4 single-phone words
    INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: continuous.c(307): pocketsphinx_continuous COMPILED ON: Mar 13 2016, AT: 22:44:14

    INFO: cmn_prior.c(131): cmn_prior_update: from < 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 5.18 -0.38 -0.10 -0.08 -0.01 0.01 0.02 -0.05 -0.06 -0.02 -0.05 -0.04 -0.01 >
    INFO: ngram_search_fwdtree.c(1553): 372 words recognized (3/fr)
    INFO: ngram_search_fwdtree.c(1555): 621 senones evaluated (5/fr)
    INFO: ngram_search_fwdtree.c(1559): 459 channels searched (3/fr), 0 1st, 459 last
    INFO: ngram_search_fwdtree.c(1562): 459 words for which last channels evaluated (3/fr)
    INFO: ngram_search_fwdtree.c(1564): 0 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1567): fwdtree 0.02 CPU 0.016 xRT
    INFO: ngram_search_fwdtree.c(1570): fwdtree 0.02 wall 0.015 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
    INFO: ngram_search_fwdflat.c(948): 372 words recognized (3/fr)
    INFO: ngram_search_fwdflat.c(950): 384 senones evaluated (3/fr)
    INFO: ngram_search_fwdflat.c(952): 378 channels searched (2/fr)
    INFO: ngram_search_fwdflat.c(954): 378 words searched (2/fr)
    INFO: ngram_search_fwdflat.c(957): 76 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.00 CPU 0.000 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.00 wall 0.001 xRT
    INFO: ngram_search.c(1253): lattice start node .0 end node .68
    INFO: ngram_search.c(1279): Eliminated 0 nodes before end node
    INFO: ngram_search.c(1384): Lattice has 6 nodes, 7 links
    INFO: ps_lattice.c(1380): Bestpath score: -305
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(:68:127) = -30935
    INFO: ps_lattice.c(1441): Joint P(O,S) = -36828 P(S|O) = -5893
    INFO: ngram_search.c(875): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(878): bestpath 0.00 wall 0.000 xRT

    INFO: cmn_prior.c(131): cmn_prior_update: from < 5.18 -0.38 -0.10 -0.08 -0.01 0.01 0.02 -0.05 -0.06 -0.02 -0.05 -0.04 -0.01 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 6.72 -0.11 -0.35 -0.23 -0.06 0.01 0.00 -0.04 -0.03 -0.07 -0.06 -0.06 -0.05 >
    INFO: ngram_search_fwdtree.c(1553): 227 words recognized (2/fr)
    INFO: ngram_search_fwdtree.c(1555): 525 senones evaluated (5/fr)
    INFO: ngram_search_fwdtree.c(1559): 254 channels searched (2/fr), 0 1st, 254 last
    INFO: ngram_search_fwdtree.c(1562): 254 words for which last channels evaluated (2/fr)
    INFO: ngram_search_fwdtree.c(1564): 0 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1567): fwdtree 0.02 CPU 0.015 xRT
    INFO: ngram_search_fwdtree.c(1570): fwdtree 0.02 wall 0.015 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 3 words
    INFO: ngram_search_fwdflat.c(948): 241 words recognized (2/fr)
    INFO: ngram_search_fwdflat.c(950): 570 senones evaluated (5/fr)
    INFO: ngram_search_fwdflat.c(952): 394 channels searched (3/fr)
    INFO: ngram_search_fwdflat.c(954): 394 words searched (3/fr)
    INFO: ngram_search_fwdflat.c(957): 112 word transitions (1/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.00 CPU 0.004 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.00 wall 0.001 xRT
    INFO: ngram_search.c(1253): lattice start node .0 end node .66
    INFO: ngram_search.c(1279): Eliminated 0 nodes before end node
    INFO: ngram_search.c(1384): Lattice has 25 nodes, 19 links
    INFO: ps_lattice.c(1380): Bestpath score: -1199
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(:66:104) = -92083
    INFO: ps_lattice.c(1441): Joint P(O,S) = -97923 P(S|O) = -5840
    INFO: ngram_search.c(875): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(878): bestpath 0.00 wall 0.000 xRT
    A
    INFO: cmn_prior.c(131): cmn_prior_update: from < 6.72 -0.11 -0.35 -0.23 -0.06 0.01 0.00 -0.04 -0.03 -0.07 -0.06 -0.06 -0.05 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 6.72 -0.11 -0.35 -0.23 -0.06 0.01 0.00 -0.04 -0.03 -0.07 -0.06 -0.06 -0.05 >
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
    INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.04 CPU 0.015 xRT
    INFO: ngram_search_fwdtree.c(435): TOTAL fwdtree 0.04 wall 0.016 xRT
    INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.00 CPU 0.002 xRT
    INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.00 wall 0.001 xRT
    INFO: ngram_search.c(303): TOTAL bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(306): TOTAL bestpath 0.00 wall 0.000 xRT
    root@debian:/opt/SpinxBase/bin/jkallup/model-de#

     
    • Nickolay V. Shmyrev

      You need to use the proper phoneset for your model. You can find the available phones in the model's mdef file.
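
The check described above can be sketched in Python. This is only a minimal illustration, assuming the mdef is in text format (a binary mdef would have to be converted to text first) and that context-independent phones are the rows whose left-context, right-context and position columns are all `-`. The function names, the sample mdef excerpt and the phone names `a`, `x`, `t` are hypothetical and not taken from the German model in this thread. The error message suggests that line 2 of vocdeu.dict uses `acht` as a single phone, which is what the sample dictionary imitates.

```python
# Hypothetical excerpt of a text-format mdef; the real file lists the model's own phones.
SAMPLE_MDEF = """\
# hypothetical excerpt of a text-format mdef
0.3
4 n_base
#base lft  rt p attrib tmat state ids
  SIL   -   - - filler    0    0    1    2    N
    a   -   - -    n/a    1    3    4    5    N
    x   -   - -    n/a    2    6    7    8    N
    t   -   - -    n/a    3    9   10   11    N
    a   x   t i    n/a    1   12   13   14    N
"""

# Hypothetical dictionary: line 2 uses the whole word as if it were one phone.
SAMPLE_DICT = "AT a t\nACHT acht\n"


def ci_phones_from_mdef(mdef_text):
    """Collect context-independent phones from a text-format mdef."""
    phones = set()
    for line in mdef_text.splitlines():
        fields = line.split()
        # CI rows look like: base - - - attrib tmat state-ids ... N
        if (len(fields) >= 4
                and not fields[0].startswith("#")
                and fields[1:4] == ["-", "-", "-"]):
            phones.add(fields[0])
    return phones


def unknown_phones(dict_text, phones):
    """Return (line_no, word, missing_phones) for each entry using unknown phones."""
    problems = []
    for line_no, line in enumerate(dict_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, pronunciation = fields[0], fields[1:]
        missing = [p for p in pronunciation if p not in phones]
        if missing:
            problems.append((line_no, word, missing))
    return problems


if __name__ == "__main__":
    model_phones = ci_phones_from_mdef(SAMPLE_MDEF)
    for line_no, word, missing in unknown_phones(SAMPLE_DICT, model_phones):
        print(f"line {line_no}: word {word} uses unknown phones {missing}")
```

To run it against real files, read the model's mdef and the dictionary into strings and pass them to the two functions. Any entry it reports has to be rewritten as a sequence of phones that the model actually defines.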

       
