Hello.
I'm currently trying to build a Portuguese acoustic model (AM) from scratch.
First of all, I have only about 9 hours of data from several speakers, and we're currently recording audio files of more sentences to fix this (but I think it's enough to train a rough model for a prototype and proof of concept).
When I run sphinxtrain run, I get:
Sphinxtrain path: /opt/sphinxtrain/lib/sphinxtrain
Sphinxtrain binaries path: /opt/sphinxtrain/libexec/sphinxtrain
Running the training
MODULE: 000 Computing feature from audio files
Feature extraction is done
MODULE: 00 verify training files
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
Found 65789 words using 40 phones
Phase 2: Checking to make sure there are not duplicate entries in the dictionary
Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
Phase 4: Checking number of lines in the transcript file should match lines in fileids file
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: 9.55443611111111
This is a small amount of data, no comment at this time
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 65786
Words in filler dictionary: 3
Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
MODULE: 0000 train grapheme-to-phoneme model
Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
MODULE: 01 Train LDA transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 02 Train MLLT transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 05 Vector Quantization
Skipped for continuous models
MODULE: 10 Training Context Independent models for forced alignment and VTLN
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 11 Force-aligning transcripts
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
MODULE: 12 Force-aligning data for VTLN
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 20 Training Context Independent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...models...
Phase 2: Flat initialize
Phase 3: Forward-Backward
Baum-Welch iteration 1 Average log-likelihood -162.413314892964
Baum-Welch iteration 2 Average log-likelihood -162.14782954067
Baum-Welch iteration 3 Average log-likelihood -158.192225151025
Baum-Welch iteration 4 Average log-likelihood -156.009983427828
Baum-Welch iteration 5 Average log-likelihood -155.391815954608
Baum-Welch iteration 6 Average log-likelihood -155.218206850727
Training completed after 7 iterations
MODULE: 30 Training Context Dependent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...
Phase 2: Initialization
Phase 3: Forward-Backward
Baum-Welch iteration 1 Average log-likelihood -155.106981310597
Baum-Welch iteration 2 Average log-likelihood -150.436046511628
Baum-Welch iteration 3 Average log-likelihood -148.66950408888
Baum-Welch iteration 4 Average log-likelihood -148.432459926147
Baum-Welch iteration 5 Average log-likelihood -148.332407983026
Training completed after 6 iterations
MODULE: 40 Build Trees
Phase 1: Cleaning up old log files...
Phase 2: Make Questions
Phase 3: Tree building
Processing each phone with each state
Skipping SIL
MODULE: 45 Prune Trees
Phase 1: Tree Pruning
Phase 2: State Tying
MODULE: 50 Training Context dependent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...
Phase 2: Copy CI to CD initialize
Phase 3: Forward-Backward
Baum-Welch gaussians 1 iteration 1 Average log-likelihood -155.106981310597
Baum-Welch gaussians 1 iteration 2 Average log-likelihood -150.015913170953
Baum-Welch gaussians 1 iteration 3 Average log-likelihood -149.35850117709
Baum-Welch gaussians 1 iteration 4 Average log-likelihood -149.160405174825
Baum-Welch gaussians 1 iteration 5 Average log-likelihood -149.160405174825
Baum-Welch gaussians 2 iteration 1 Average log-likelihood -149.498732012774
Baum-Welch gaussians 2 iteration 2 Average log-likelihood -148.561899362035
Baum-Welch gaussians 2 iteration 3 Average log-likelihood -147.579596109707
Baum-Welch gaussians 2 iteration 4 Average log-likelihood -146.815115859493
Baum-Welch gaussians 2 iteration 5 Average log-likelihood -146.552032900907
Baum-Welch gaussians 2 iteration 6 Average log-likelihood -146.428790561236
Baum-Welch gaussians 2 iteration 7 Average log-likelihood -146.347837980004
Baum-Welch gaussians 4 iteration 1 Average log-likelihood -146.754470295422
Baum-Welch gaussians 4 iteration 2 Average log-likelihood -145.874788646531
Baum-Welch gaussians 4 iteration 3 Average log-likelihood -145.110629787154
Baum-Welch gaussians 4 iteration 4 Average log-likelihood -144.212734457721
Baum-Welch gaussians 4 iteration 5 Average log-likelihood -143.929663028648
Baum-Welch gaussians 4 iteration 6 Average log-likelihood -143.802243234106
Baum-Welch gaussians 4 iteration 7 Average log-likelihood -143.725025697579
Baum-Welch gaussians 8 iteration 1 Average log-likelihood -143.725025697579
Baum-Welch gaussians 8 iteration 2 Average log-likelihood -143.169670485996
Baum-Welch gaussians 8 iteration 3 Average log-likelihood -142.336333100238
Baum-Welch gaussians 8 iteration 4 Average log-likelihood -141.007867215333
Baum-Welch gaussians 8 iteration 5 Average log-likelihood -140.677951210751
Baum-Welch gaussians 8 iteration 6 Average log-likelihood -140.553768706398
Baum-Welch gaussians 8 iteration 7 Average log-likelihood -140.479450478235
Baum-Welch gaussians 16 iteration 1 Average log-likelihood -140.479450478235
Baum-Welch gaussians 16 iteration 2 Average log-likelihood -140.890972591668
Baum-Welch gaussians 16 iteration 3 Average log-likelihood -139.708283772179
Baum-Welch gaussians 16 iteration 4 Average log-likelihood -136.201636336699
Baum-Welch gaussians 16 iteration 5 Average log-likelihood -135.761273926732
Baum-Welch gaussians 16 iteration 6 Average log-likelihood -135.6346988708
Baum-Welch gaussians 16 iteration 7 Average log-likelihood -135.6346988708
Training for 16 Gaussian(s) completed after 7 iterations
MODULE: 60 Lattice Generation
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 61 Lattice Pruning
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 62 Lattice Format Conversion
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 65 MMIE Training
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 90 deleted interpolation
Skipped for continuous models
MODULE: DECODE Decoding using models previously trained
Aligning results to find error rate
word_align.pl failed with error code 65280 at /opt/sphinxtrain/lib/sphinxtrain/scripts/decode/slave.pl line 173.
My LM was built from a Wikipedia corpus (this is a prototype, so grammatical errors are acceptable at this point).
I've tried changing the values of these parameters:
$CFG_FINAL_NUM_DENSITIES from 2 to 16
$CFG_N_TIED_STATES from 4 to 40000
I must admit I'm not entirely sure what they do, but I changed all of them to check the results; no luck, though.
I've tried both parallel and single processing; the outcome doesn't change at all.
Digging through the log files, I found a lot of messages like these:
aprendizado.1.7-8.bw.log:ERROR: "backward.c", line 421: Failed to align audio to transcript: final state of the search is not reached
aprendizado.1.7-8.bw.log:ERROR: "baum_welch.c", line 324: wav/courserasinais2_25 ignored
aprendizado.1.7-8.bw.log:ERROR: "backward.c", line 421: Failed to align audio to transcript: final state of the search is not reached
aprendizado.1.7-8.bw.log:ERROR: "baum_welch.c", line 324: wav/courserasinais2_26 ignored
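To see exactly which utterances Baum-Welch is dropping, the ignored IDs can be pulled out of the bw logs. Here is a minimal sketch of such a scan (the logdir path is an assumption; adjust it to your training directory):

```python
# Collect the utterance IDs that Baum-Welch reported as ignored,
# by scanning every *.bw.log under the training log directory.
import glob
import re

ignored = set()
for path in glob.glob("logdir/**/*.bw.log", recursive=True):
    with open(path, errors="replace") as f:
        for line in f:
            # Matches lines like: ERROR: "baum_welch.c", line 324: wav/xyz ignored
            m = re.search(r"line \d+: (\S+) ignored", line)
            if m:
                ignored.add(m.group(1))

print(len(ignored), "utterances ignored")
for utt in sorted(ignored):
    print(utt)
```

Listening to a few of the flagged files and checking their transcripts is usually the quickest way to see what they have in common.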
I read in another thread on this forum that audio files need a bit of silence at the very beginning and the very end.
After adding some silence (200 ms), I reduced the incidence of this kind of error, but now adding more silence or removing it makes no difference in the outcome, and changing the silence length leads me to an error in iterations with more Gaussians.
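For reference, here is a minimal sketch of how such padding can be scripted, assuming 16-bit PCM WAV input (the file paths are placeholders):

```python
# Pad a PCM WAV file with silence at both the beginning and the end.
# For PCM audio, silence is simply zero-valued samples.
import wave

def pad_with_silence(src, dst, ms=200):
    with wave.open(src, "rb") as w:
        params = w.getparams()
        frames = w.readframes(w.getnframes())
    n_pad = int(params.framerate * ms / 1000)  # frames of silence per side
    silence = b"\x00" * (n_pad * params.sampwidth * params.nchannels)
    with wave.open(dst, "wb") as w:
        w.setparams(params)
        w.writeframes(silence + frames + silence)

# Usage (placeholder paths; the output directory must already exist):
pad_with_silence("wav/courserasinais2_25.wav", "wav_padded/courserasinais2_25.wav")
```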
Here are my conf and log files, and some audio samples: https://www.dropbox.com/s/kzug2p53o0c3cbb/portuguese.zip?dl=0
Can you point me to some approach to figure out what's wrong? I've been struggling with this for 5 days in a row, reading a lot, but I've exhausted all my options.
You need to fix a few issues in the training first:
1) Some of your files are 8 kHz. You should exclude them from training; all audio used for training should have the same sample rate. A quick scan like the sketch below can find the offenders.
2) 40000 tied states is far too many for your amount of data. The tutorial recommends 2000.
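A minimal sketch of such a scan with Python's stdlib wave module (the 16 kHz target rate and the wav/ layout are assumptions from the post; wave.open will raise on non-PCM files):

```python
# List training WAVs whose sample rate differs from the target rate.
import glob
import wave

TARGET_RATE = 16000  # assumption: the rest of the corpus is 16 kHz

for path in sorted(glob.glob("wav/**/*.wav", recursive=True)):
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
    if rate != TARGET_RATE:
        print(path, rate, "Hz")
```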
Oops, my mistake about the sample rate.
I had already done that, but as I told you, I've been reverting things to test, and I had reverted some audio files. Now I've fixed it and I'm running the training again.
About senones: I have a big vocabulary and dictionary, so I tried a high senone value, but I also tried a lower value with no result either :/
What is a good value for a large vocabulary and a small number of hours of audio? (I'm recording more audio to increase the amount, but it will take some time.)
You need to fix the issues pointed out first and provide the logs. Ideally you want to provide the whole training folder, not just the log dir.
I'm going to check whether I have permission to share my whole dir (I'm using some audio and corpora from Coursera, so I'm not sure about the EULA).
I've found the same errors after fixing the sample rate; the log folder is added as an attachment.
Sphinxtrain path: /opt/sphinxtrain/lib/sphinxtrain
Sphinxtrain binaries path: /opt/sphinxtrain/libexec/sphinxtrain
Running the training
MODULE: 000 Computing feature from audio files
Feature extraction is done
MODULE: 00 verify training files
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
Found 65789 words using 40 phones
Phase 2: Checking to make sure there are not duplicate entries in the dictionary
Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
Phase 4: Checking number of lines in the transcript file should match lines in fileids file
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: 9.56337222222222
This is a small amount of data, no comment at this time
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 65786
Words in filler dictionary: 3
Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
MODULE: 0000 train grapheme-to-phoneme model
Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
MODULE: 01 Train LDA transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 02 Train MLLT transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 05 Vector Quantization
Skipped for continuous models
MODULE: 10 Training Context Independent models for forced alignment and VTLN
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 11 Force-aligning transcripts
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
MODULE: 12 Force-aligning data for VTLN
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 20 Training Context Independent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...models...
Phase 2: Flat initialize
Phase 3: Forward-Backward
Baum-Welch iteration 1 Average log-likelihood -162.39917090131
Baum-Welch iteration 2 Average log-likelihood -162.134177313769
Baum-Welch iteration 3 Average log-likelihood -158.188818461649
Baum-Welch iteration 4 Average log-likelihood -156.013399272537
Baum-Welch iteration 5 Average log-likelihood -155.390498100719
Baum-Welch iteration 6 Average log-likelihood -155.219130062652
Training completed after 7 iterations
MODULE: 30 Training Context Dependent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...
Phase 2: Initialization
Phase 3: Forward-Backward
Baum-Welch iteration 1 Average log-likelihood -155.104490977145
Baum-Welch iteration 2 Average log-likelihood -150.4321072878
Baum-Welch iteration 3 Average log-likelihood -148.661829688466
Baum-Welch iteration 4 Average log-likelihood -148.425873324106
Baum-Welch iteration 5 Average log-likelihood -148.325007312306
Training completed after 6 iterations
MODULE: 40 Build Trees
Phase 1: Cleaning up old log files...
Phase 2: Make Questions
Phase 3: Tree building
Processing each phone with each state
Skipping SIL
MODULE: 45 Prune Trees
Phase 1: Tree Pruning
Phase 2: State Tying
MODULE: 50 Training Context dependent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...
Phase 2: Copy CI to CD initialize
Phase 3: Forward-Backward
Baum-Welch gaussians 1 iteration 1 Average log-likelihood -155.104490977145
Baum-Welch gaussians 1 iteration 2 Average log-likelihood -150.166472055873
Baum-Welch gaussians 1 iteration 3 Average log-likelihood -149.527066817521
Baum-Welch gaussians 1 iteration 4 Average log-likelihood -149.33248666777
Baum-Welch gaussians 1 iteration 5 Average log-likelihood -149.249526416857
Baum-Welch gaussians 2 iteration 1 Average log-likelihood -149.665685981939
Baum-Welch gaussians 2 iteration 2 Average log-likelihood -148.745463437401
Baum-Welch gaussians 2 iteration 3 Average log-likelihood -147.778809957695
Baum-Welch gaussians 2 iteration 4 Average log-likelihood -147.037401151327
Baum-Welch gaussians 2 iteration 5 Average log-likelihood -146.769784193625
Baum-Welch gaussians 2 iteration 6 Average log-likelihood -146.643775842846
Baum-Welch gaussians 2 iteration 7 Average log-likelihood -146.568047337278
Baum-Welch gaussians 4 iteration 1 Average log-likelihood -146.979894350542
Baum-Welch gaussians 4 iteration 2 Average log-likelihood -146.119120358493
Baum-Welch gaussians 4 iteration 3 Average log-likelihood -145.374026658987
Baum-Welch gaussians 4 iteration 4 Average log-likelihood -144.514868881076
Baum-Welch gaussians 4 iteration 5 Average log-likelihood -144.24001889989
Baum-Welch gaussians 4 iteration 6 Average log-likelihood -144.111642041253
Baum-Welch gaussians 4 iteration 7 Average log-likelihood -144.034846390345
Baum-Welch gaussians 8 iteration 1 Average log-likelihood -144.442647745659
Baum-Welch gaussians 8 iteration 2 Average log-likelihood -143.502220501764
Baum-Welch gaussians 8 iteration 3 Average log-likelihood -142.699651769135
Baum-Welch gaussians 8 iteration 4 Average log-likelihood -141.44292318433
Baum-Welch gaussians 8 iteration 5 Average log-likelihood -141.123430509348
Baum-Welch gaussians 8 iteration 6 Average log-likelihood -141.000609460655
Baum-Welch gaussians 8 iteration 7 Average log-likelihood -141.000609460655
Baum-Welch gaussians 16 iteration 1 Average log-likelihood -140.926907768827
Baum-Welch gaussians 16 iteration 2 Average log-likelihood -140.202714446483
Baum-Welch gaussians 16 iteration 3 Average log-likelihood -139.147380962402
Baum-Welch gaussians 16 iteration 4 Average log-likelihood -136.92606993813
Baum-Welch gaussians 16 iteration 5 Average log-likelihood -136.504641066069
Baum-Welch gaussians 16 iteration 6 Average log-likelihood -136.504641066069
Baum-Welch gaussians 16 iteration 7 Average log-likelihood -136.309599738404
Training for 16 Gaussian(s) completed after 7 iterations
MODULE: 60 Lattice Generation
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 61 Lattice Pruning
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 62 Lattice Format Conversion
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 65 MMIE Training
Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
MODULE: 90 deleted interpolation
Skipped for continuous models
MODULE: DECODE Decoding using models previously trained
Aligning results to find error rate
word_align.pl failed with error code 65280 at /opt/sphinxtrain/lib/sphinxtrain/scripts/decode/slave.pl line 173.
This is the logdir with 2000 senones.
Nickolay, this is the logdir with only part of the data (constituicao*), as you told me to do.
You need to provide the whole training folder, not just the logdir.
Here is the training dir without wav:
https://dl.dropboxusercontent.com/u/60013693/aprendizado_withoutwav.zip
and wav dir:
https://dl.dropboxusercontent.com/u/60013693/wav.zip
In your test fileids you have 26 lines; in the transcription you have 27. constituicao/wav/dt008c is missing.
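For what it's worth, a mismatch like this can be caught with a quick cross-check of the two files. A minimal sketch, assuming transcription lines that end with the utterance ID in parentheses (the etc/ file names are guesses; use your own):

```python
# Cross-check the test fileids against the transcription file:
# every utterance ID should appear in both, and the counts should match.
# Note: the IDs must use the same form (e.g. full relative path) in both files.
import re

with open("etc/aprendizado_test.fileids") as f:
    fileids = {line.strip() for line in f if line.strip()}

trans_ids = set()
with open("etc/aprendizado_test.transcription") as f:
    for line in f:
        m = re.search(r"\(([^)]+)\)\s*$", line)  # trailing "(utterance_id)"
        if m:
            trans_ids.add(m.group(1))

print(len(fileids), "fileids,", len(trans_ids), "transcriptions")
for utt in sorted(trans_ids - fileids):
    print("in transcription but not in fileids:", utt)
for utt in sorted(fileids - trans_ids):
    print("in fileids but not in transcription:", utt)
```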
Thank you so much for pointing out what was wrong with my setup.
I used to think Sphinx would complain about this kind of misalignment on both sets, train and test, but it only does so on the training database (and this is why I didn't pay attention to it; in my mind, if something was wrong, Sphinx would complain and throw me an error)!
But again, thank you so much!