
I have a question about .wav and train.transcription

Forum: Help
Creator: Suwit
Created: 2016-10-20
Updated: 2016-11-04
  • Suwit

    Suwit - 2016-10-20

Can I use the same transcription, but add my friend's voice, to build the acoustic model?

    Example..

    .lm
    hello world


    .dic
    hello H EH L OW
    world W ER L D


    train.transcription
    hello (001)
    world (002)
    hello world (003)
    hello (001)
    world (002)
    hello world (003)


    train.fileids
    myvoice/001
    myvoice/002
myvoice/003
    friendvoice/001
    friendvoice/002
    friendvoice/003


     
  • Arseniy Gorin

    Arseniy Gorin - 2016-10-20

You should provide unique names for the transcription records.
Probably the easiest way is to add prefixes to the file names:

    myvoice/myvoice-001
    "hello (myvoice-001)"

     
  • Suwit

    Suwit - 2016-10-20

    Thank you Arseniy Gorin

     
  • Suwit

    Suwit - 2016-10-25

I don't understand the errors I get when I train the acoustic model.
These are my etc and wav files:
    https://www.dropbox.com/s/bj2nnhu66u3e54w/th.rar?dl=0

    I3asta@I3asta-PC ~/cmusphinx/th
    $ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from segments starting at (part 1 of 1)
    Extracting features from segments starting at (part 1 of 1)
    Feature extraction is done
    MODULE: 00 verify training files
    Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
    Found 158 words using 36 phones
    Phase 2: Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    Phase 4: Checking number of lines in the transcript file should match lines in fileids file
    Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
    Estimated Total Hours Training: 0.594975
    This is a small amount of data, no comment at this time
    Phase 6: Checking that all the words in the transcript are in the dictionary
    Words in dictionary: 155
    Words in filler dictionary: 3
    Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 11 Force-aligning transcripts
    Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    MODULE: 12 Force-aligning data for VTLN
    Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -162.93761175773
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -160.175077384204
    Convergence Ratio = 2.76253437352617
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 8 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -153.612100897145
    Convergence Ratio = 6.56297648705882
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 6 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -151.408540948568
    Convergence Ratio = 2.20355994857684
    Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 24 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -150.737858343772
    Convergence Ratio = 0.670682604796127
    Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 74 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -150.139403201473
    Convergence Ratio = 0.598455142298604
    Baum welch starting for 1 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 110 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -149.536280058695
    Convergence Ratio = 0.60312314277806
    Baum welch starting for 1 Gaussian(s), iteration: 8 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 152 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 8
    Current Overall Likelihood Per Frame = -149.294163836795
    Convergence Ratio = 0.242116221899551
    Baum welch starting for 1 Gaussian(s), iteration: 9 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 204 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 9
    WARNING: WARNING: NEGATIVE CONVERGENCE RATIO AT ITER 9! CHECK BW AND NORM LOGFILES
    Current Overall Likelihood Per Frame = -149.458003119633
    Training completed after 9 iterations
    MODULE: 30 Training Context Dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Initialization
    Phase 3: Forward-Backward
    Baum welch starting for iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 246 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -149.585194467099
    Baum welch starting for iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 236 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -146.312275930276
    Convergence Ratio = 3.27291853682331
    Baum welch starting for iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 228 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -145.047022110794
    Convergence Ratio = 1.26525381948198
    Baum welch starting for iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 240 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -144.776889041995
    Convergence Ratio = 0.270133068798543
    Baum welch starting for iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 242 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -144.627634306443
    Convergence Ratio = 0.14925473555175
    Baum welch starting for iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 246 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -144.515302800087
    Convergence Ratio = 0.112331506356185
    Baum welch starting for iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 250 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -144.43547465202
    Training completed after 7 iterations
    MODULE: 40 Build Trees
    Phase 1: Cleaning up old log files...
    Phase 2: Make Questions
    Phase 3: Tree building
    Processing each phone with each state
    AA 0
    AA 1
    AA 2
    AE 0
    AE 1
    AE 2
    AH 0
    AH 1
    AH 2
    AO 0
    AO 1
    AO 2
    AW 0
    AW 1
    AW 2
    AY 0
    AY 1
    AY 2
    B 0
    B 1
    B 2
    CH 0
    CH 1
    CH 2
    D 0
    D 1
    D 2
    ER 0
    ER 1
    ER 2
    EY 0
    EY 1
    EY 2
    H 0
    H 1
    H 2
    IH 0
    IH 1
    IH 2
    IY 0
    IY 1
    IY 2
    J 0
    J 1
    J 2
    K 0
    K 1
    K 2
    L 0
    L 1
    L 2
    M 0
    M 1
    M 2
    N 0
    N 1
    N 2
    NG 0
    NG 1
    NG 2
    NH 0
    NH 1
    NH 2
    OW 0
    OW 1
    OW 2
    OY 0
    OY 1
    OY 2
    P 0
    P 1
    P 2
    PL 0
    PL 1
    PL 2
    R 0
    R 1
    R 2
    S 0
    S 1
    S 2
    SH 0
    SH 1
    SH 2
    Skipping SIL
    T 0
    T 1
    T 2
    UA 0
    UA 1
    UA 2
    UE 0
    UE 1
    UE 2
    UH 0
    UH 1
    UH 2
    UW 0
    UW 1
    UW 2
    W 0
    W 1
    W 2
    Y 0
    Y 1
    Y 2
    MODULE: 45 Prune Trees
    Phase 1: Tree Pruning
    Phase 2: State Tying
    MODULE: 50 Training Context dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Copy CI to CD initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 246 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -149.585194467099
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 236 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -148.575945604116
    Convergence Ratio = 1.00924886298256
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 232 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -148.380488434365
    Convergence Ratio = 0.195457169751364
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 234 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -148.259936016746
    Convergence Ratio = 0.12055241761928
    Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 234 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -148.241540055249
    Split Gaussians, increase by 1
    Current Overall Likelihood Per Frame = -148.241540055249
    Convergence Ratio = 0.0183959614973901
    Baum welch starting for 2 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 232 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -148.530429813022
    Baum welch starting for 2 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 252 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -147.314837139976
    Convergence Ratio = 1.21559267304568
    Baum welch starting for 2 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 282 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -146.904560834429
    Convergence Ratio = 0.410276305547427
    Baum welch starting for 2 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 282 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -146.487933402472
    Convergence Ratio = 0.416627431956812
    Baum welch starting for 2 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 288 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -146.343189383615
    Convergence Ratio = 0.144744018856812
    Baum welch starting for 2 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 264 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -146.111521276654
    Convergence Ratio = 0.231668106961081
    Baum welch starting for 2 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 238 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -145.919291876175
    Convergence Ratio = 0.192229400479278
    Baum welch starting for 2 Gaussian(s), iteration: 8 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 236 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 8
    Current Overall Likelihood Per Frame = -145.84483937846
    Split Gaussians, increase by 2
    Current Overall Likelihood Per Frame = -145.84483937846
    Convergence Ratio = 0.0744524977151855
    Baum welch starting for 4 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 232 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -146.213872524476
    Baum welch starting for 4 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 186 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -144.902801305623
    Convergence Ratio = 1.31107121885273
    Baum welch starting for 4 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 136 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -143.756106945591
    Convergence Ratio = 1.1466943600324
    Baum welch starting for 4 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 142 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -143.141857103
    Convergence Ratio = 0.61424984259051
    Baum welch starting for 4 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 154 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -142.929139555383
    Convergence Ratio = 0.212717547616563
    Baum welch starting for 4 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 158 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -142.829640892717
    Split Gaussians, increase by 4
    Current Overall Likelihood Per Frame = -142.829640892717
    Convergence Ratio = 0.09949866266615
    Baum welch starting for 8 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 158 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = -143.139663037156
    Baum welch starting for 8 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 168 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = -141.988936976619
    Convergence Ratio = 1.15072606053687
    Baum welch starting for 8 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 176 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = -141.134116702025
    Convergence Ratio = 0.854820274594317
    Baum welch starting for 8 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 154 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = -140.21922385513
    Convergence Ratio = 0.91489284689456
    Baum welch starting for 8 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 130 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = -139.628290235819
    Convergence Ratio = 0.590933619310874
    Baum welch starting for 8 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 122 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = -139.427454645023
    Convergence Ratio = 0.200835590796117
    Baum welch starting for 8 Gaussian(s), iteration: 7 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 118 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 7
    Current Overall Likelihood Per Frame = -139.308395085265
    Convergence Ratio = 0.119059559757687
    Baum welch starting for 8 Gaussian(s), iteration: 8 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    ERROR: This step had 116 ERROR messages and 0 WARNING messages. Please check the log file for details.
    Normalization for iteration: 8
    Current Overall Likelihood Per Frame = -139.233087306203
    Training for 8 Gaussian(s) completed after 8 iterations
    MODULE: 60 Lattice Generation
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 61 Lattice Pruning
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 62 Lattice Format Conversion
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 65 MMIE Training
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 90 deleted interpolation
    Skipped for continuous models
    MODULE: DECODE Decoding using models previously trained
    Decoding 1 segments starting at 0 (part 1 of 1)
    0% ERROR: FATAL: "batch.c", line 822: PocketSphinx decoder init failed

    ERROR: This step had 3 ERROR messages and 0 WARNING messages. Please check the log file for details.
    ERROR: Failed to start pocketsphinx_batch
    Aligning results to find error rate
    Can't open /home/I3asta/cmusphinx/th/result/th-1-1.match
    word_align.pl failed with error code 65280 at /usr/local/lib/sphinxtrain/scripts/decode/slave.pl line 173.

     
    • Nickolay V. Shmyrev

Most of the files you have in your database are of very bad quality - the bandwidth is just 2 kHz; you must have used a bad microphone or resampled the audio somehow. You need to record with better quality.

To deal with such a small bandwidth you can set the following in the config (etc/sphinx_train.cfg):

      $CFG_NUM_FILT = 13; # For wideband speech it's 25, for telephone 8khz reasonable value is 15
      $CFG_LO_FILT = 130; # For telephone 8kHz speech value is 200
      $CFG_HI_FILT = 1800; # For telephone 8kHz speech value is 3500
      

      It is better to record properly though.

      To learn more about sound bandwidth you can check

      http://cmusphinx.sourceforge.net/wiki/faq/#qwhat_is_sample_rate_and_how_does_it_affect_accuracy
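
      One quick way to check the bandwidth yourself, assuming the sox command-line tool is installed, is to render a spectrogram and see where the speech energy stops (the file name below is only an example):

      $ sox myvoice/001.wav -n spectrogram -o 001_spectrogram.png

      For a clean 16 kHz recording the energy should extend well above 2 kHz rather than cutting off there.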

       
  • Suwit

    Suwit - 2016-10-26

    I changed the config as Nickolay V. Shmyrev told me, and the acoustic model now builds successfully, but when I test speech recognition nothing happens. What should I do? Buy a new microphone and re-record my voice?


    $ pocketsphinx_continuous -hmm model_parameters/th.cd_cont_200 -lm th.lm -dict th.dic -inmic yes
    INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from model_parameters/th.cd_cont_200/feat.params
    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -allphone
    -allphone_ci no no
    -alpha 0.97 9.700000e-01
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -ceplen 13 13
    -cmn live batch
    -cmninit 40,3,-1 40,3,-1
    -compallsen no no
    -debug 0
    -dict th.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm model_parameters/th.cd_cont_200
    -input_endian little little
    -jsgf
    -keyphrase
    -kws
    -kws_delay 10 10
    -kws_plp 1e-1 1.000000e-01
    -kws_threshold 1 1.000000e+00
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lifter 0 22
    -lm th.lm
    -lmctl
    -lmname
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.300000e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf 30000 30000
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 13
    -nwpen 1.0 1.000000e+00
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-10 1.000000e-10
    -pl_pip 1.0 1.000000e+00
    -pl_weight 3.0 3.000000e+00
    -pl_window 5 5
    -rawlogdir
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 1.800000e+03
    -uw 1.0 1.000000e+00
    -vad_postspeech 50 50
    -vad_prespeech 20 20
    -vad_startspeech 10 10
    -vad_threshold 2.0 2.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
    INFO: mdef.c(518): Reading model definition: model_parameters/th.cd_cont_200/mdef
    INFO: bin_mdef.c(181): Allocating 4756 * 8 bytes (37 KiB) for CD tree
    INFO: tmat.c(149): Reading HMM transition probability matrices: model_parameters/th.cd_cont_200/transition_matrices
    INFO: acmod.c(113): Attempting to use PTM computation module
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: model_parameters/th.cd_cont_200/means
    INFO: ms_gauden.c(242): 308 codebook, 1 feature, size:
    INFO: ms_gauden.c(244): 8x39
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: model_parameters/th.cd_cont_200/variances
    INFO: ms_gauden.c(242): 308 codebook, 1 feature, size:
    INFO: ms_gauden.c(244): 8x39
    INFO: ms_gauden.c(304): 2 variance values floored
    INFO: ptm_mgau.c(804): Number of codebooks exceeds 256: 308
    INFO: acmod.c(115): Attempting to use semi-continuous computation module
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: model_parameters/th.cd_cont_200/means
    INFO: ms_gauden.c(242): 308 codebook, 1 feature, size:
    INFO: ms_gauden.c(244): 8x39
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: model_parameters/th.cd_cont_200/variances
    INFO: ms_gauden.c(242): 308 codebook, 1 feature, size:
    INFO: ms_gauden.c(244): 8x39
    INFO: ms_gauden.c(304): 2 variance values floored
    INFO: acmod.c(117): Falling back to general multi-stream GMM computation
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: model_parameters/th.cd_cont_200/means
    INFO: ms_gauden.c(242): 308 codebook, 1 feature, size:
    INFO: ms_gauden.c(244): 8x39
    INFO: ms_gauden.c(127): Reading mixture gaussian parameter: model_parameters/th.cd_cont_200/variances
    INFO: ms_gauden.c(242): 308 codebook, 1 feature, size:
    INFO: ms_gauden.c(244): 8x39
    INFO: ms_gauden.c(304): 2 variance values floored
    INFO: ms_senone.c(149): Reading senone mixture weights: model_parameters/th.cd_cont_200/mixture_weights
    INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
    INFO: ms_senone.c(207): Not transposing mixture weights in memory
    INFO: ms_senone.c(268): Read mixture weights for 308 senones: 1 features x 8 codewords
    INFO: ms_senone.c(320): Mapping senones to individual codebooks
    INFO: ms_mgau.c(144): The value of topn: 4
    INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
    INFO: dict.c(320): Allocating 4254 * 32 bytes (132 KiB) for word entries
    INFO: dict.c(333): Reading main dictionary: th.dic
    INFO: dict.c(213): Dictionary size 155, allocated 1 KiB for strings, 1 KiB for phones
    INFO: dict.c(336): 155 words read
    INFO: dict.c(358): Reading filler dictionary: model_parameters/th.cd_cont_200/noisedict
    INFO: dict.c(213): Dictionary size 158, allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(361): 3 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(406): Allocating 36^3 * 2 bytes (91 KiB) for word-initial triphones
    INFO: dict2pid.c(132): Allocated 31392 bytes (30 KiB) for word-final triphones
    INFO: dict2pid.c(196): Allocated 31392 bytes (30 KiB) for single-phone word triphones
    INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
    INFO: ngram_model_trie.c(365): Header doesn't match
    INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
    INFO: ngram_model_trie.c(193): LM of order 3
    INFO: ngram_model_trie.c(195): #1-grams: 157
    INFO: ngram_model_trie.c(195): #2-grams: 352
    INFO: ngram_model_trie.c(195): #3-grams: 407
    INFO: lm_trie.c(474): Training quantizer
    INFO: lm_trie.c(482): Building LM trie
    INFO: ngram_search_fwdtree.c(74): Initializing search tree
    INFO: ngram_search_fwdtree.c(101): 97 unique initial diphones
    INFO: ngram_search_fwdtree.c(186): Creating search channels
    INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 128
    INFO: ngram_search_fwdtree.c(333): Created 0 root, 0 non-root channels, 3 single-phone words
    INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
    INFO: continuous.c(307): pocketsphinx_continuous COMPILED ON: Oct 13 2016, AT: 13:31:48

    Allocating 32 buffers of 2500 samples each
    INFO: continuous.c(252): Ready....
    INFO: continuous.c(261): Listening...
    INFO: cmn_live.c(120): Update from < 40.00 3.00 -1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
    INFO: cmn_live.c(138): Update to < 15.67 2.14 -7.09 11.89 0.39 -6.04 12.70 4.77 -0.76 1.02 6.76 1.39 0.18 >
    INFO: ngram_search_fwdtree.c(1550): 847 words recognized (3/fr)
    INFO: ngram_search_fwdtree.c(1552): 978 senones evaluated (3/fr)
    INFO: ngram_search_fwdtree.c(1556): 929 channels searched (2/fr), 0 1st, 929 last
    INFO: ngram_search_fwdtree.c(1559): 929 words for which last channels evaluated (2/fr)
    INFO: ngram_search_fwdtree.c(1561): 0 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.01 CPU 0.005 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 3.45 wall 1.051 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
    INFO: ngram_search_fwdflat.c(948): 946 words recognized (3/fr)
    INFO: ngram_search_fwdflat.c(950): 981 senones evaluated (3/fr)
    INFO: ngram_search_fwdflat.c(952): 975 channels searched (2/fr)
    INFO: ngram_search_fwdflat.c(954): 975 words searched (2/fr)
    INFO: ngram_search_fwdflat.c(957): 76 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.00 CPU 0.000 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.00 wall 0.000 xRT
    INFO: ngram_search.c(1250): lattice start node .0 end node .276
    INFO: ngram_search.c(1276): Eliminated 0 nodes before end node
    INFO: ngram_search.c(1381): Lattice has 14 nodes, 17 links
    INFO: ps_lattice.c(1380): Bestpath score: -1425
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(:276:326) = -102633
    INFO: ps_lattice.c(1441): Joint P(O,S) = -110917 P(S|O) = -8284
    INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(875): bestpath 0.00 wall 0.000 xRT

    INFO: continuous.c(275): Ready....
    INFO: continuous.c(261): Listening...
    INFO: cmn_live.c(120): Update from < 15.67 2.14 -7.09 11.89 0.39 -6.04 12.70 4.77 -0.76 1.02 6.76 1.39 0.18 >
    INFO: cmn_live.c(138): Update to < 12.62 3.14 -0.00 18.44 0.27 3.56 13.94 5.02 6.26 2.67 7.69 1.80 1.01 >
    INFO: ngram_search_fwdtree.c(1550): 331 words recognized (3/fr)
    INFO: ngram_search_fwdtree.c(1552): 363 senones evaluated (3/fr)
    INFO: ngram_search_fwdtree.c(1556): 357 channels searched (2/fr), 0 1st, 357 last
    INFO: ngram_search_fwdtree.c(1559): 357 words for which last channels evaluated (2/fr)
    INFO: ngram_search_fwdtree.c(1561): 0 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.00 CPU 0.000 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 1.81 wall 1.486 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
    INFO: ngram_search_fwdflat.c(948): 331 words recognized (3/fr)
    INFO: ngram_search_fwdflat.c(950): 363 senones evaluated (3/fr)
    INFO: ngram_search_fwdflat.c(952): 357 channels searched (2/fr)
    INFO: ngram_search_fwdflat.c(954): 357 words searched (2/fr)
    INFO: ngram_search_fwdflat.c(957): 76 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(960): fwdflat 0.00 CPU 0.000 xRT
    INFO: ngram_search_fwdflat.c(963): fwdflat 0.00 wall 0.000 xRT
    INFO: ngram_search.c(1250): lattice start node .0 end node .53
    INFO: ngram_search.c(1276): Eliminated 0 nodes before end node
    INFO: ngram_search.c(1381): Lattice has 11 nodes, 7 links
    INFO: ps_lattice.c(1380): Bestpath score: -398
    INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(:53:120) = -41611
    INFO: ps_lattice.c(1441): Joint P(O,S) = -47328 P(S|O) = -5717
    INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
    INFO: ngram_search.c(875): bestpath 0.00 wall 0.000 xRT

    INFO: continuous.c(275): Ready....
    INFO: continuous.c(261): Listening...

     
  • Suwit

    Suwit - 2016-10-26

    I bought a new microphone (SAMSON GO MIC). I'm not sure that my new audio files have good quality. Can you help me check the audio files, so I know whether I can use them or not?

    https://www.dropbox.com/s/z9kacgvwr3tbd8e/wav.rar?dl=0

    Thank you very much.

     
    • Nickolay V. Shmyrev

      Still not perfect; there must be something wrong with your recorder. It seems you record compressed audio and then convert it to wav. You need to use a raw audio recorder, like Audacity.
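
      If you want to inspect the basic recording parameters yourself (assuming sox is installed), soxi prints the sample rate, bit depth and channels; note that it only reports the current wav header, so it cannot tell whether the audio was lossy-compressed earlier:

      $ soxi recording.wav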

       
  • Suwit

    Suwit - 2016-10-26

    I use Audacity for recording now. Can you check it again for me, please?

    https://www.dropbox.com/s/dnarbd1xp6gb0tl/testAudio.rar?dl=0

     
    • Nickolay V. Shmyrev

      This is much better, exactly what's needed. Put the microphone a bit further from your mouth to avoid random bursts from breathing and it will be perfect.

       
  • Suwit

    Suwit - 2016-10-27

    Thank you very much, Nickolay V. Shmyrev. I have a new question. I recorded new audio files and used them without errors, but in the decoding stage I found a new error about the .DMP file.


    (th1-1.log)

    INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
    ERROR: "ngram_model_trie.c", line 356: File /home/I3asta/cmusphinx/th/etc/th.lm.DMP not found
    INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
    ERROR: "ngram_model_trie.c", line 179: File /home/I3asta/cmusphinx/th/etc/th.lm.DMP not found
    INFO: ngram_model_trie.c(445): Trying to read LM in dmp format
    ERROR: "ngram_model_trie.c", line 447: Dump file /home/I3asta/cmusphinx/th/etc/th.lm.DMP not found
    FATAL: "batch.c", line 822: PocketSphinx decoder init failed
    Thu Oct 27 21:27:07 2016

     
    • Nickolay V. Shmyrev

      Your file is called th.lm, not th.lm.DMP; you need to change the appropriate line in etc/sphinx_train.cfg:

      $DEC_CFG_LANGUAGEMODEL  = "$CFG_BASE_DIR/etc/${CFG_DB_NAME}.lm";
      
       
  • Suwit

    Suwit - 2016-10-27

    I think I don't have enough audio files to train the acoustic model.

    Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
    Estimated Total Hours Training: 0.770966666666667
    This is a small amount of data, no comment at this time


    Training for 8 Gaussian(s) completed after 7 iterations
    MODULE: 60 Lattice Generation
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 61 Lattice Pruning
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 62 Lattice Format Conversion
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 65 MMIE Training
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 90 deleted interpolation
    Skipped for continuous models
    MODULE: DECODE Decoding using models previously trained
    Decoding 150 segments starting at 0 (part 1 of 1)
    0%
    Aligning results to find error rate
    SENTENCE ERROR: 100.0% (150/150) WORD ERROR RATE: 100.0% (150/150)

     
    • Nickolay V. Shmyrev

      I wrote above that you also need a better language model.

       
  • Suwit

    Suwit - 2016-10-27

    I don't understand. Can you explain in more detail, please?

     
    • Nickolay V. Shmyrev

      You need to collect at least ten megabytes of text data in order to train a good language model. Your language model th.lm is not properly trained.

      It is better to use a more modern toolkit for language model training, such as SRILM.
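
      For reference, a minimal SRILM invocation for a small corpus could look like the following (file names follow this thread; Witten-Bell discounting is just one reasonable choice when there is very little text):

      $ ngram-count -text th.txt -order 3 -wbdiscount -lm th.lm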

       
  • Suwit

    Suwit - 2016-10-28

    How can I collect at least ten megabytes of text data? I have 150 words and
    I only need to use about 20 sentences. Do I need to list the words in all possible orders?

    I built a new th.lm file with SRILM; it is 32 kilobytes. The results are still 100% sentence and word error.

    https://www.dropbox.com/s/bj2nnhu66u3e54w/th.rar?dl=0

    For example (I only need three sentences to command something):

    th.dic

    ~~~~
    a A
    b B
    c C
    ~~~~

    th.txt ---> th.lm

    <s> a </s>
    <s> a b </s>
    <s> a b c </s> I need to use it.
    <s> a c b </s>
    <s> b </s>
    <s> b a </s>
    <s> b a c </s> I need to use it.
    <s> b c a </s>
    <s> c </s>
    <s> c a </s>
    <s> c a b </s> I need to use it.
    <s> c b a </s>
    
     

    Last edit: Nickolay V. Shmyrev 2016-10-28
  • Suwit

    Suwit - 2016-10-29

    I still don't understand. Is my problem that th.txt has too few sentences, and that is why it shows "sentence and word error 100%", or is there another cause?
    Can I generate sentences from the .dic, or must I write new sentences myself?
    Sorry, Nickolay V. Shmyrev, for repeating the question.

     

    Last edit: Suwit 2016-10-30
    • Nickolay V. Shmyrev

      The problem is that your th.txt and th.lm are in windows-874 encoding while your dictionary is in UTF-8. We recommend using UTF-8 encoding everywhere.
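
      A sketch of the conversion with iconv, assuming your iconv build accepts the CP874 (windows-874) encoding name; alternatively, rebuild th.lm from the converted th.txt:

      $ iconv -f CP874 -t UTF-8 th.txt > th.txt.utf8 && mv th.txt.utf8 th.txt
      $ iconv -f CP874 -t UTF-8 th.lm > th.lm.utf8 && mv th.lm.utf8 th.lm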

       
      • Nickolay V. Shmyrev

        If you convert the encoding to UTF-8, the word error rate would be 6%.

         
  • Suwit

    Suwit - 2016-10-30

    Thank you so much, Nickolay V. Shmyrev, for your advice.

     
  • Suwit

    Suwit - 2016-10-31

    I can't use the acoustic model on Android when I build and run on a Samsung Note 2.
    The program runs for a few moments and then shuts down. No error showed up, so I ran it in debug mode.
    I found the following error:

    10/31 14:28:06: Launching app
    No apk changes detected since last installation, skipping installation of C:\Users\I3asta\Desktop\pocketsphinx-android-demo-master\app\build\outputs\apk\app-debug.apk
    $ adb shell am force-stop edu.cmu.sphinx.pocketsphinx
    $ adb shell am start -n "edu.cmu.sphinx.pocketsphinx/edu.cmu.pocketsphinx.demo.PocketSphinxActivity" -a android.intent.action.MAIN -c android.intent.category.LAUNCHER -D
    Waiting for application to come online: edu.cmu.sphinx.pocketsphinx.test | edu.cmu.sphinx.pocketsphinx
    Connecting to edu.cmu.sphinx.pocketsphinx
    I/System.out: Sending WAIT chunk
    W/ActivityThread: Application edu.cmu.sphinx.pocketsphinx is waiting for the debugger on port 8100...
    I/dalvikvm: Debugger is active
    I/System.out: Debugger has connected
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    Connected to the target VM, address: 'localhost:8600', transport: 'socket'
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    I/System.out: waiting for debugger to settle...
    I/System.out: debugger has settled (1387)
    I/InstantRun: Instant Run Runtime started. Android package is edu.cmu.sphinx.pocketsphinx, real application class is null.
    W/InstantRun: No instant run dex files added to classpath
    I/dalvikvm: Could not find method android.app.Activity.onRequestPermissionsResult, referenced from method edu.cmu.pocketsphinx.demo.PocketSphinxActivity.onRequestPermissionsResult
    W/dalvikvm: VFY: unable to resolve virtual method 149: Landroid/app/Activity;.onRequestPermissionsResult (I[Ljava/lang/String;[I)V
    D/dalvikvm: VFY: replacing opcode 0x6f at 0x001f
    E/MoreInfoHPW_ViewGroup: Parent view is not a TextView
    W/ContextImpl: Failed to ensure directory: /storage/extSdCard/Android/data/edu.cmu.sphinx.pocketsphinx/files
    I/Assets: Skipping asset th.lm: checksums are equal
    D/TextLayoutCache: Enable myanmar Zawgyi converter
    I/Assets: Skipping asset th.cd_cont_200/means: checksums are equal
    I/Assets: Skipping asset th.filler: checksums are equal
    I/Assets: Skipping asset th.txt: checksums are equal
    I/Assets: Skipping asset th_train.fileids: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/feat.params: checksums are equal
    D/TextLayoutCache: Enable myanmar Zawgyi converter
    I/Assets: Skipping asset th_test.fileids: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/transition_matrices: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/variances: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/noisedict: checksums are equal
    I/Assets: Skipping asset feat.params: checksums are equal
    I/Assets: Skipping asset th.phone: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/mdef: checksums are equal
    I/Assets: Skipping asset th.dic: checksums are equal
    I/Assets: Skipping asset th_train.transcription: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/assets.lst: checksums are equal
    I/Assets: Skipping asset th.cd_cont_200/mixture_weights: checksums are equal
    I/Assets: Skipping asset sphinx_train.cfg: checksums are equal
    I/Assets: Skipping asset th_test.transcription: checksums are equal
    D/libEGL: loaded /system/lib/egl/libEGL_mali.so
    D/libEGL: loaded /system/lib/egl/libGLESv1_CM_mali.so
    D/libEGL: loaded /system/lib/egl/libGLESv2_mali.so

          [ 10-31 14:28:15.715 13283:13283 E/         ]
          Device driver API match
          Device driver API version: 29
          User space API version: 29
    
          [ 10-31 14:28:15.715 13283:13283 E/         ]
          mali: REVISION=Linux-r3p2-01rel3 BUILD_DATE=Tue Jul 22 19:59:34 KST 2014
    

    D/dalvikvm: Trying to load lib /data/app-lib/edu.cmu.sphinx.pocketsphinx-31/libpocketsphinx_jni.so 0x424f8c78
    D/dalvikvm: Added shared lib /data/app-lib/edu.cmu.sphinx.pocketsphinx-31/libpocketsphinx_jni.so 0x424f8c78
    D/dalvikvm: No JNI_OnLoad found in /data/app-lib/edu.cmu.sphinx.pocketsphinx-31/libpocketsphinx_jni.so 0x424f8c78, skipping init
    I/cmusphinx: INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/th.cd_cont_200/feat.params
    D/OpenGLRenderer: Enabling debug mode 0
    E/cmusphinx: FATAL: "cmn.c", line 126: Unknown CMN type 'batch'
    Disconnected from the target VM, address: 'localhost:8600', transport: 'socket'

     
    • Nickolay V. Shmyrev

      Pocketsphinx-android-demo was slightly outdated. As a temporary fix you can replace "batch" with "current" in the model's feat.params file.

      I also updated pocketsphinx-android-demo from recent sources; you can clone a fresh repo and try again, it should work with your current model.
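
      For example, assuming the model layout used earlier in this thread, the temporary fix could be applied with sed (the file should contain "-cmn current" afterwards); remember that the copy already synced into the Android app's assets has to be updated as well:

      $ sed -i 's/-cmn batch/-cmn current/' model_parameters/th.cd_cont_200/feat.params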

       
  • Suwit

    Suwit - 2016-11-04

    Finally, I tried importing .wav files with sphinx4, but the result is " เธ?เธตเน? เธฅเธนเธ?เธชเธฒเธง เธ?เธญเธ? เธ—เน?เธฒเธ? ". I'm sure the .dic and .lm are UTF-8.



    Building STT 1.0-SNAPSHOT

    --- exec-maven-plugin:1.2.1:exec (default-cli) @ STT ---
    13:19:46.788 INFO unitManager CI Unit: AA
    13:19:46.788 INFO unitManager CI Unit: AE
    13:19:46.788 INFO unitManager CI Unit: AH
    13:19:46.788 INFO unitManager CI Unit: AO
    13:19:46.788 INFO unitManager CI Unit: AW
    13:19:46.788 INFO unitManager CI Unit: AY
    13:19:46.788 INFO unitManager CI Unit: B
    13:19:46.788 INFO unitManager CI Unit: CH
    13:19:46.788 INFO unitManager CI Unit: D
    13:19:46.788 INFO unitManager CI Unit: ER
    13:19:46.788 INFO unitManager CI Unit: EY
    13:19:46.788 INFO unitManager CI Unit: H
    13:19:46.788 INFO unitManager CI Unit: IH
    13:19:46.788 INFO unitManager CI Unit: IY
    13:19:46.788 INFO unitManager CI Unit: J
    13:19:46.788 INFO unitManager CI Unit: K
    13:19:46.803 INFO unitManager CI Unit: L
    13:19:46.803 INFO unitManager CI Unit: M
    13:19:46.803 INFO unitManager CI Unit: N
    13:19:46.803 INFO unitManager CI Unit: NG
    13:19:46.803 INFO unitManager CI Unit: NH
    13:19:46.803 INFO unitManager CI Unit: OW
    13:19:46.803 INFO unitManager CI Unit: OY
    13:19:46.803 INFO unitManager CI Unit: P
    13:19:46.803 INFO unitManager CI Unit: PL
    13:19:46.803 INFO unitManager CI Unit: R
    13:19:46.803 INFO unitManager CI Unit: S
    13:19:46.803 INFO unitManager CI Unit: SH
    13:19:46.803 INFO unitManager CI Unit: T
    13:19:46.803 INFO unitManager CI Unit: UA
    13:19:46.803 INFO unitManager CI Unit: UE
    13:19:46.803 INFO unitManager CI Unit: UH
    13:19:46.803 INFO unitManager CI Unit: UW
    13:19:46.803 INFO unitManager CI Unit: W
    13:19:46.803 INFO unitManager CI Unit: Y
    13:19:46.944 INFO autoCepstrum Cepstrum component auto-configured as follows: autoCepstrum {MelFrequencyFilterBank, Denoise, DiscreteCosineTransform2, Lifter}
    13:19:46.944 INFO dictionary Loading dictionary from: file:/C:/Users/I3asta/Documents/NetBeansProjects/STT/target/classes/th.dic
    13:19:46.944 INFO dictionary Loading filler dictionary from: file:/C:/Users/I3asta/Documents/NetBeansProjects/STT/target/classes/th.cd_cont_200/noisedict
    13:19:46.944 INFO acousticModelLoader Loading tied-state acoustic model from: file:/C:/Users/I3asta/Documents/NetBeansProjects/STT/target/classes/th.cd_cont_200
    13:19:46.944 INFO acousticModelLoader Pool means Entries: 2464
    13:19:46.944 INFO acousticModelLoader Pool variances Entries: 2464
    13:19:46.944 INFO acousticModelLoader Pool transition_matrices Entries: 36
    13:19:46.944 INFO acousticModelLoader Pool senones Entries: 308
    13:19:46.944 INFO acousticModelLoader Gaussian weights: mixture_weights. Entries: 308
    13:19:46.960 INFO acousticModelLoader Pool senones Entries: 308
    13:19:46.960 INFO acousticModelLoader Context Independent Unit Entries: 36
    13:19:46.960 INFO acousticModelLoader HMM Manager: 4017 hmms
    13:19:46.960 INFO acousticModel CompositeSenoneSequences: 0
    13:19:46.960 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธ?เธถเน?เธ?'
    13:19:46.960 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธขเน?เธญเธข'
    13:19:46.960 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธชเธดเธ?'
    13:19:46.960 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธขเน?เธญเธข'
    13:19:46.975 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธ?เธถเน?เธ?'
    13:19:46.975 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธ?เธถเน?เธ?'
    13:19:46.975 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธขเน?เธญเธข'
    13:19:46.975 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธชเธดเธ?'
    13:19:46.975 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธชเธดเธ?'
    13:19:47.209 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธชเธดเธ?'
    13:19:47.209 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธ?เธถเน?เธ?'
    13:19:47.209 INFO dictionary The dictionary is missing a phonetic transcription for the word 'เธขเน?เธญเธข'
    13:19:47.240 INFO lexTreeLinguist Max CI Units 37
    13:19:47.240 INFO lexTreeLinguist Unit table size 50653
    13:19:47.240 INFO speedTracker # ----------------------------- Timers----------------------------------------
    13:19:47.240 INFO speedTracker # Name Count CurTime MinTime MaxTime AvgTime TotTime
    13:19:47.240 INFO speedTracker Load AM 1 0.2030s 0.2030s 0.2030s 0.2030s 0.2030s
    13:19:47.240 INFO speedTracker Load Dictionary 1 0.0000s 0.0000s 0.0000s 0.0000s 0.0000s
    13:19:47.240 INFO speedTracker Compile 1 0.2490s 0.2490s 0.2490s 0.2490s 0.2490s
    13:19:47.553 INFO liveCMN 35.37 17.23 7.54 -9.55 -4.68 -5.06 4.82 7.70 -0.17 2.43 5.90 -6.93 5.58
    13:19:47.631 INFO speedTracker This Time Audio: 2.73s Proc: 0.34s Speed: 0.13 X real time
    13:19:47.631 INFO speedTracker Total Time Audio: 2.73s Proc: 0.34s 0.13 X real time
    13:19:47.631 INFO memoryTracker Mem Total: 152.50 Mb Free: 125.29 Mb
    13:19:47.631 INFO memoryTracker Used: This: 27.21 Mb Avg: 27.21 Mb Max: 27.21 Mb
    Hypothesis: เธ?เธตเน? เธฅเธนเธ?เธชเธฒเธง เธ?เธญเธ? เธ—เน?เธฒเธ?
    13:19:47.646 INFO liveCMN 35.25 17.31 7.70 -9.53 -4.79 -5.09 4.88 7.93 -0.10 2.58 5.91 -6.77 5.52
    13:19:47.678 INFO speedTracker This Time Audio: 0.48s Proc: 0.03s Speed: 0.07 X real time
    13:19:47.678 INFO speedTracker Total Time Audio: 3.21s Proc: 0.38s 0.12 X real time
    13:19:47.678 INFO memoryTracker Mem Total: 152.50 Mb Free: 112.32 Mb
    13:19:47.678 INFO memoryTracker Used: This: 40.18 Mb Avg: 33.70 Mb Max: 40.18 Mb
    Hypothesis: เธ?เน?เธญเธ?
    13:19:47.693 INFO liveCMN 34.44 17.23 5.80 -9.92 -3.23 -3.11 5.64 7.10 -0.86 2.40 6.30 -7.92 4.73
    13:19:47.709 INFO speedTracker This Time Audio: 0.36s Proc: 0.03s Speed: 0.09 X real time
    13:19:47.709 INFO speedTracker Total Time Audio: 3.57s Proc: 0.41s 0.11 X real time
    13:19:47.709 INFO memoryTracker Mem Total: 152.50 Mb Free: 103.24 Mb
    13:19:47.709 INFO memoryTracker Used: This: 49.26 Mb Avg: 38.88 Mb Max: 49.26 Mb
    Hypothesis:
    13:19:47.725 INFO speedTracker This Time Audio: 0.42s Proc: 0.02s Speed: 0.04 X real time
    13:19:47.725 INFO speedTracker Total Time Audio: 3.99s Proc: 0.42s 0.11 X real time
    13:19:47.725 INFO memoryTracker Mem Total: 152.50 Mb Free: 90.26 Mb
    13:19:47.725 INFO memoryTracker Used: This: 62.24 Mb Avg: 44.72 Mb Max: 62.24 Mb
    Hypothesis: เธ?เธฐ
    13:19:47.725 INFO speedTracker # ----------------------------- Timers----------------------------------------
    13:19:47.725 INFO speedTracker # Name Count CurTime MinTime MaxTime AvgTime TotTime
    13:19:47.725 INFO speedTracker Load AM 1 0.2030s 0.2030s 0.2030s 0.2030s 0.2030s
    13:19:47.725 INFO speedTracker Frontend 484 0.0000s 0.0000s 0.0470s 0.0002s 0.1110s
    13:19:47.725 INFO speedTracker Load Dictionary 1 0.0000s 0.0000s 0.0000s 0.0000s 0.0000s
    13:19:47.725 INFO speedTracker Compile 1 0.2490s 0.2490s 0.2490s 0.2490s 0.2490s
    13:19:47.725 INFO speedTracker Score 956 0.0000s 0.0000s 0.0470s 0.0002s 0.1580s
    13:19:47.725 INFO speedTracker Prune 3336 0.0000s 0.0000s 0.0000s 0.0000s 0.0000s
    13:19:47.725 INFO speedTracker Grow 3346 0.0000s 0.0000s 0.0160s 0.0001s 0.2500s
    13:19:47.725 INFO speedTracker Total Time Audio: 3.99s Proc: 0.42s 0.11 X real time
    13:19:47.725 INFO memoryTracker Mem Total: 152.50 Mb Free: 90.26 Mb
    13:19:47.725 INFO memoryTracker Used: This: 62.24 Mb Avg: 48.23 Mb Max: 62.24 Mb
    ------------------------------------------------------------------------
    BUILD SUCCESS
    ------------------------------------------------------------------------
    Total time: 2.375s
    Finished at: Fri Nov 04 13:19:48 ICT 2016
    Final Memory: 5M/120M
    ------------------------------------------------------------------------

     
    • Nickolay V. Shmyrev

      You also need to set the JVM file encoding, probably with the option -Dfile.encoding=UTF-8.
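
      A sketch of how that option could be passed, depending on how the sphinx4 program is launched (the class name below is only a placeholder):

      $ java -Dfile.encoding=UTF-8 -cp target/classes stt.Main

      If you run it through Maven or NetBeans instead, the same -Dfile.encoding=UTF-8 option has to reach the forked java process (for example via the exec-maven-plugin configuration), not just the Maven JVM itself.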

       
