Hello everybody.
I'm using Ubuntu 10.04 with Python 2.7.3 and Perl 5.10.1 installed. I downloaded
the snapshot versions a few days ago.
I compiled and installed all three packages in this order:
sphinxbase
pocketsphinx
SphinxTrain
To install them, I ran these commands in the terminal:
sh autogen.sh
make
sudo make install
Then I configured the an4 directory according to the training tutorial and ran
sphinxtrain. Here are the results:
ehsan@ehsan-laptop:~/an4$ sphinxtrain run
Running the training
MODULE: 000 Computing feature from audio files
Extracting features from segments starting at (part 1 of 1)
Extracting features from segments starting at (part 1 of 1)
Feature extraction is done
MODULE: 00 verify training files
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist
file.
Found 133 words using 34 phones
Phase 2: Checking to make sure there are not duplicate entries in the
dictionary
Phase 3: Check general format for the fileids file; utterance length (must be
positive); files exist
Phase 4: Checking number of lines in the transcript file should match lines in
fileids file
Phase 5: Determine amount of training data, see if n_tied_states seems
reasonable.
Estimated Total Hours Training: 0.704672222222222
This is a small amount of data, no comment at this time
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 130
Words in filler dictionary: 3
Phase 7: Checking that all the phones in the transcript are in the phonelist,
and all phones in the phonelist appear at least once
MODULE: 0000 train grapheme-to-phoneme model
Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
MODULE: 01 Train LDA transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 02 Train MLLT transformation
Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
MODULE: 05 Vector Quantization
Skipped for continuous models
MODULE: 10 Training Context Independent models for forced alignment and VTLN
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 11 Force-aligning transcripts
Skipped: $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
MODULE: 12 Force-aligning data for VTLN
Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
MODULE: 20 Training Context Independent models
Phase 1: Cleaning up directories:
accumulator...logs...qmanager...models...
Phase 2: Flat initialize
...
And it stops here.
I've read the log files, but I can't figure out what the problem is.
Here is the link to the logdir and the HTML report file:
http://www.4shared.com/zip/2qjUO5A_/ehsabd_an4_logdir.html
I appreciate your help.
Not sure if this is the cause, but it seems you haven't run ./configure after
autogen.sh, as the INSTALL files say.
Sorry, there was a bug in sphinxtrain, which was fixed just now.
Please update and reinstall sphinxtrain.
Please run setup once again, or just fix the value of the CFG_BIN_DIR
configuration variable in sphinx_train.cfg.
I am also stuck on this. Can anyone tell me what value CFG_BIN_DIR should have, as described by Nickolay?
Thanks
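The thread does not state the exact value, since it depends on where SphinxTrain was installed. As a rough way to check the current setting, here is a minimal Python sketch. It assumes that sphinx_train.cfg assigns the variable on one line in the form $CFG_BIN_DIR = "..."; and that the directory should contain the SphinxTrain binaries such as bw. The helper names read_cfg_value and check_bin_dir are made up for illustration and are not part of SphinxTrain.

```python
import os
import re


def read_cfg_value(cfg_text, name):
    """Return the quoted string assigned to $<name> in Perl-style config text, or None."""
    pattern = r'^\s*\$' + re.escape(name) + r'\s*=\s*["\']([^"\']*)["\']\s*;'
    match = re.search(pattern, cfg_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def check_bin_dir(cfg_path, probe="bw"):
    """Describe whether CFG_BIN_DIR in cfg_path points at a directory holding `probe`.

    Assumption: `probe` (default "bw") is one of the installed SphinxTrain binaries.
    """
    with open(cfg_path) as handle:
        value = read_cfg_value(handle.read(), "CFG_BIN_DIR")
    if value is None:
        return "CFG_BIN_DIR is not set"
    if "$" in value:
        # The value is built from another Perl variable; this sketch does not expand it.
        return "CFG_BIN_DIR refers to another variable: " + value
    if not os.path.isfile(os.path.join(value, probe)):
        return "no '%s' binary found in %s" % (probe, value)
    return "ok: " + value
```

Calling check_bin_dir("etc/sphinx_train.cfg") from the training directory would report whether the configured path looks usable. Running setup again, as suggested above, remains the simpler fix.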
If you run the latest sphinxtrain built from GitHub, there should be no problem.
If you still have trouble, then instead of reviving a 3-year-old thread, please provide more information about your particular case. Most likely it is not related to this problem.
You need to share the log folder. You can pack it into an archive and upload it here.
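Packing the log folder can be done with any archiver. As one option, here is a minimal Python sketch using the standard library's shutil.make_archive. The function name pack_log_folder and the paths in the usage note are only examples.

```python
import shutil


def pack_log_folder(log_dir, archive_base):
    """Pack the contents of log_dir into <archive_base>.tar.gz and return the archive path.

    archive_base should live outside log_dir so the archive does not include itself.
    """
    return shutil.make_archive(archive_base, "gztar", root_dir=log_dir)
```

For example, pack_log_folder("logdir", "/tmp/my_db_logdir") would write /tmp/my_db_logdir.tar.gz, which can then be attached to a post.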
Hi Nickolay, I have a similar problem. I'm using the latest version of sphinxtrain (5prealpha) and I'm trying to train on the VoxForge data in French. I managed to create the configuration file using sphinxtrain -t my_db setup in the my_db folder. You will find the file attached. When I run sphinxtrain run, it says it will take 26.32... to do the training, but it stops later in Phase 7 (see the attached log file).
The structure of my directories is the following:
voxforge
|
| etc
| | french.dic
| | feat.params
| | french.filler
| | french_train.fileids
| | french_test.fileids
| | the other files from etc
| selected
| | user1
| | | wavs from user1
.
.
.
Also attached are the fileids and transcriptions from the test files.
Hope you can help me.
Last edit: Pedropablo 2016-04-22
These are the folders that were created before it exited:
* bwaccumdir (empty)
* feat (all features computed)
* french.html
* logdir (000.comp_feat/french.test and french.train until 4-4)
* qmanager (000.comp_feat.err 000.comp_feat.out) no error
Last edit: Pedropablo 2016-04-22
You need to fix the warnings first.
If I would like to use different pronunciations for a word, what notation do I need to use in my dictionary?
Last edit: Pedropablo 2016-04-22
I managed to make it work by including a counter for the repeated words. Thanks a lot. Shouldn't these be errors instead of warnings? Just wondering.
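The post does not show the exact change. The sketch below assumes the convention used in the CMUSphinx dictionaries, where the second and later pronunciations of a word are written as word(2), word(3), and so on. It also assumes one entry per line in the form "word phone phone ..." with no counters already present. The function name number_duplicate_words is made up for illustration.

```python
def number_duplicate_words(lines):
    """Return dictionary lines with repeated words renamed word(2), word(3), ...

    Exact repeats of the same word and pronunciation are dropped.
    Lines without a pronunciation are kept unchanged.
    """
    counts = {}          # word -> number of distinct pronunciations seen so far
    seen_entries = set() # (word, phones) pairs already emitted
    result = []
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) < 2:
            result.append(line.rstrip("\n"))
            continue
        word = parts[0]
        phones = " ".join(parts[1].split())
        if (word, phones) in seen_entries:
            continue
        seen_entries.add((word, phones))
        counts[word] = counts.get(word, 0) + 1
        entry = word if counts[word] == 1 else "%s(%d)" % (word, counts[word])
        result.append(entry + " " + phones)
    return result
```

Running it over the lines of the dictionary file and writing the result back out would give each repeated word a distinct key, which is what the duplicate-entry check in the verification module looks for.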