I am totally new to Sphinx and hope someone can help me out.
I downloaded the SphinxTrain trainer and the an4_sphere database, copied both into the same folder, and ran the following:
cd SphinxTrain
./configure
make
perl scripts_pl/setup_tutorial.pl an4
cd an4
perl scripts_pl/make_feats.pl -ctl etc/an4_train.fileids
Up to this point there was no problem.
After this, rather than running all the scripts together, I ran them one by one and got the following error messages during training at module 30.
I got this after running
perl scripts_pl/30.cd_hmmuntied/slave_convg.pl
These are the details reported in an4.html.
What is the problem? What should I do?
*********
MODULE: 30 Training Context Dependent models (2009-01-01 13:20)
Phase 1: Cleaning up directories:
accumulator... logs... qmanager... completed
Phase 2: Initialization
mk_mdef_gen Log File
WARNING: This step had 0 ERROR messages and 1 WARNING messages. Please check the log file for details.
completed
init_mixw Log File
completed
Phase 3: Forward-Backward
Baum welch starting for iteration: 1 (1 of 1)
bw Log File
completed
Normalization for iteration: 1
norm Log File
completed
Current Overall Likelihood Per Frame = 0.353931891107765
Baum welch starting for iteration: 2 (1 of 1)
bw Log File
This step had 2 ERROR messages and 1 WARNING messages. Please check the log file for details.
completed
Normalization for iteration: 2
norm Log File
WARNING: This step had 0 ERROR messages and 3 WARNING messages. Please check the log file for details.
completed
Current Overall Likelihood Per Frame = 3.2847686038117
Convergence Ratio = 8.28079296141062
Baum welch starting for iteration: 3 (1 of 1)
bw Log File
This step had 2 ERROR messages and 1 WARNING messages. Please check the log file for details.
completed
Normalization for iteration: 3
norm Log File
WARNING: This step had 0 ERROR messages and 3 WARNING messages. Please check the log file for details.
completed
Current Overall Likelihood Per Frame = 4.21795544474383
Convergence Ratio = 0.284095153567056
Baum welch starting for iteration: 4 (1 of 1)
bw Log File
This step had 2 ERROR messages and 1 WARNING messages. Please check the log file for details.
completed
Normalization for iteration: 4
norm Log File
WARNING: This step had 0 ERROR messages and 3 WARNING messages. Please check the log file for details.
completed
Current Overall Likelihood Per Frame = 4.34042914120474
Training completed after 4 iterations
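The "Convergence Ratio" values in the log above are consistent with the relative change in likelihood per frame between two consecutive iterations. The following is a minimal sketch of that arithmetic, not SphinxTrain's own code; the function name `convergence_ratio` is hypothetical and the formula is inferred from the logged numbers.

```python
def convergence_ratio(previous, current):
    """Relative change in likelihood per frame between two iterations.

    Assumption: this formula is inferred from the numbers in the log,
    not taken from the SphinxTrain sources.
    """
    return (current - previous) / abs(previous)


# Likelihood-per-frame values copied from the module 30 log above.
likelihoods = [
    0.353931891107765,
    3.2847686038117,
    4.21795544474383,
    4.34042914120474,
]

# Pair each iteration with the one before it.
ratios = [convergence_ratio(p, c) for p, c in zip(likelihoods, likelihoods[1:])]

for iteration, ratio in enumerate(ratios, start=2):
    print(f"iteration {iteration}: convergence ratio = {ratio:.6f}")
```

The first two ratios reproduce the values logged for iterations 2 and 3. The log prints no ratio for iteration 4, where training stops.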
Hi,
I recorded my voice as .wav files and made the suggested changes in the config file, but I get an error during feature extraction.
I am posting the output below.
-cfg not specified, using the default ./etc/sphinx_train.cfg
-param not specified, using the default ./etc/feat.params
bin/wave2feat \
-verbose yes \
-alpha 0.97 \
-dither yes \
-doublebw no \
-nfilt 40 \
-ncep 13 \
-lowerf 133.33334 \
-upperf 6855.4976 \
-nfft 512 \
-wlen 0.0256 \
-c etc/ncst_train.fileids \
-mswav yes \
-di /home/ahmad/sphinx/ncst/wav \
-ei wav \
-do /home/ahmad/sphinx/ncst/feat \
-eo mfc
[Switch]       [Default]        [Value]
-help          no               no
-example       no               no
-i
-o
-c                              etc/ncst_train.fileids
-nskip
-runlen
-di                             /home/ahmad/sphinx/ncst/wav
-ei                             wav
-do                             /home/ahmad/sphinx/ncst/feat
-eo                             mfc
-nist          no               no
-raw           no               no
-mswav         no               yes
-input_endian  little           little
-nchans        1                1
-whichchan     1                1
-logspec       no               no
-feat          sphinx           sphinx
-mach_endian   little           little
-alpha         0.97             9.700000e-01
-srate         16000.0          1.600000e+04
-frate         100              100
-wlen          0.025625         2.560000e-02
-nfft          512              512
-nfilt         40               40
-lowerf        133.33334        1.333333e+02
-upperf        6855.4976        6.855498e+03
-ncep          13               13
-doublebw      no               no
-warp_type     inverse_linear   inverse_linear
-warp_params
-blocksize     200000           200000
-dither        yes              yes
-seed          -1               -1
-verbose       no               yes
INFO: fe_interface.c(100): You are using the internal mechanism to generate the seed.
INFO: fe_sigproc.c(752): Current FE Parameters:
INFO: fe_sigproc.c(753): Sampling Rate: 16000.000000
INFO: fe_sigproc.c(754): Frame Size: 410
INFO: fe_sigproc.c(755): Frame Shift: 160
INFO: fe_sigproc.c(756): FFT Size: 512
INFO: fe_sigproc.c(757): Lower Frequency: 133.333
INFO: fe_sigproc.c(758): Upper Frequency: 6855.5
INFO: fe_sigproc.c(759): Number of filters: 40
INFO: fe_sigproc.c(760): Number of Overflow Samps: 0
INFO: fe_sigproc.c(761): Start Utt Status: 0
INFO: fe_sigproc.c(763): Will add dither to audio
INFO: fe_sigproc.c(764): Dither seeded with -1
INFO: fe_sigproc.c(771): Will not use double bandwidth in mel filter
INFO: wave2feat.c(139): /home/ahmad/sphinx/ncst/wav/are.wav
LENGTH: 4
INFO: wave2feat.c(786): Reading MS Wav file /home/ahmad/sphinx/ncst/wav/are.wav:
INFO: wave2feat.c(787): 16 bit PCM data, 2 channels 84708 samples
INFO: wave2feat.c(788): Sampled at 16000
ERROR: "wave2feat.c", line 883: unknown input file format
ERROR: "wave2feat.c", line 201: error reading speech data
FATAL_ERROR: "wave2feat.c", line 90: error converting files...exiting
Where did I go wrong?
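One detail visible in the log above: the file is reported as "16 bit PCM data, 2 channels", while the parameter dump shows -nchans 1. Whether that mismatch is what triggers "unknown input file format" is an assumption and is not confirmed anywhere in this thread. A minimal sketch for checking recordings before feature extraction, using only Python's standard `wave` module, could look like this. The helper names `describe_wav` and `mismatches` are hypothetical, and the expected values are taken from the parameter dump.

```python
import wave


def describe_wav(path):
    """Read channel count, sample rate and sample width from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bits_per_sample": 8 * wav.getsampwidth(),
        }


def mismatches(info, expected_channels=1, expected_rate=16000, expected_bits=16):
    """List the ways a file differs from the settings shown in the log.

    Assumption: the defaults mirror -nchans 1, -srate 16000 and 16-bit PCM.
    """
    problems = []
    if info["channels"] != expected_channels:
        problems.append(
            f"expected {expected_channels} channel(s), found {info['channels']}"
        )
    if info["sample_rate"] != expected_rate:
        problems.append(
            f"expected {expected_rate} Hz, found {info['sample_rate']} Hz"
        )
    if info["bits_per_sample"] != expected_bits:
        problems.append(
            f"expected {expected_bits}-bit samples, found {info['bits_per_sample']}-bit"
        )
    return problems
```

Calling `mismatches(describe_wav(path))` on each recording returns an empty list when the header matches those settings; a stereo file like the one in the log would be flagged for its channel count.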
> What is the problem? what should i do?
Continue with training? There is no problem in the messages you provided.
OK.
I continued, and at MODULE: 50 (training context-dependent models) I also got some errors. Is that not a problem either? I have copied part of the messages below.
One more thing I wanted to know:
Is it possible to place .wav files recorded by me in the an4/wav folder (which currently holds .sph files), run this training, and get a speech recognition system that recognizes my voice?
Current Overall Likelihood Per Frame = 2.67914672700468
Convergence Ratio = 0.0314306119339485
Baum welch starting for 2 Gaussian(s), iteration: 1 (1 of 1)
bw Log File
WARNING: This step had 0 ERROR messages and 1 WARNING messages. Please check the log file for details.
completed
Normalization for iteration: 1
norm Log File
This step had 34 ERROR messages and 0 WARNING messages. Please check the log file for details.
completed
Current Overall Likelihood Per Frame = 2.19669113299328
Baum welch starting for 2 Gaussian(s), iteration: 2 (1 of 1)
bw Log File
completed
Normalization for iteration: 2
> This step had 34 ERROR messages and 0 WARNING messages. Please check the log file for details. completed
These are errors in data processing, not in training. This has been discussed many times on this forum, so you can just search for it. They mean your data is not consistent with the transcription you provided.
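One way to look for that kind of inconsistency is to compare the utterance IDs in the control file with those in the transcription file. The sketch below is only an illustration under stated assumptions: each .fileids line is a path without extension whose last component is the utterance ID, and each transcription line ends with the ID in parentheses. The function names are hypothetical, and the check does not cover other mismatches such as words missing from the dictionary.

```python
import re

# Assumption: a transcription line ends with "(utterance_id)".
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def fileid_utterances(lines):
    """Utterance IDs from .fileids lines: the last path component of each."""
    return [line.strip().split("/")[-1] for line in lines if line.strip()]


def transcript_utterances(lines):
    """Utterance IDs from transcription lines that end with "(id)"."""
    ids = []
    for line in lines:
        match = UTT_ID.search(line)
        if match:
            ids.append(match.group(1))
    return ids


def compare(fileid_lines, transcript_lines):
    """Report IDs present in one file but not the other, and order mismatches."""
    in_ctl = fileid_utterances(fileid_lines)
    in_trans = transcript_utterances(transcript_lines)
    return {
        "missing_transcript": sorted(set(in_ctl) - set(in_trans)),
        "missing_audio": sorted(set(in_trans) - set(in_ctl)),
        "order_differs": in_ctl != in_trans,
    }
```

Passing the lines of the two files, for example `compare(open(ctl_path), open(trans_path))` with your own paths, gives a quick list of utterances to inspect by hand.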
> Is it possible to place .wav files recorded by me in the an4/wav folder (which currently holds .sph files), run this training, and get a speech recognition system that recognizes my voice?
Yes, you just need to change the file format in sphinx_train.cfg from sph (NIST) to MS WAV:
$CFG_WAVFILE_EXTENSION = 'wav';
$CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw