I am trying to train my own model with sphinxtrain using the tutorial (http://cmusphinx.sourceforge.net/wiki/tutorialam). The intent is to start with something that works (an example) and then modify it step by step, to get familiar with sphinxtrain. I downloaded the an4 database from http://www.speech.cs.cmu.edu/databases/an4/ (Raw audio (.raw) format, little endian byte order (64 M)).
I unpacked it, moved to the folder where the database is located (/home/osota/an4), and then ran:
/opt/sphinxtrain/bin/./sphinxtrain -t an4 setup
It gives me this result:
Sphinxtrain path: /opt/sphinxtrain/lib/sphinxtrain
Sphinxtrain binaries path: /opt/sphinxtrain/libexec/sphinxtrain
Setting up the database an4
which, from my understanding, shows that there are no installation/configuration issues.
I see that sphinx_train.cfg is created in etc/.
I opened etc/sphinx_train.cfg and changed the following lines:
$CFG_WAVFILE_EXTENSION = 'wav';
$CFG_WAVFILE_TYPE = 'mswav'; # one of nist, mswav, raw
into this:
$CFG_WAVFILE_EXTENSION = 'raw';
$CFG_WAVFILE_TYPE = 'raw'; # one of nist, mswav, raw
I did that because I noticed that the audio files have the extension .raw.
Then I moved into /home/osota/an4 (the training database) and started the training run. The output stops at Phase 7.
I put back $CFG_WAVFILE_EXTENSION = 'wav'; and $CFG_WAVFILE_TYPE = 'mswav';, ran again, and encountered the same error.
I searched this forum and the troubleshooting section for a solution, then went to /home/osota/an4/logdir, where I see one entry: 000.comp_feat
I opened it and saw:
" ============================================================================
" Netrw Directory Listing (netrw v145)
" /home/osota/an4/logdir/000.comp_feat
" Sorted by name
" Sort sequence: [\/]$,\<core\%(\.\d\+\)\=\>,\.h$,\.c$,\.cpp$,\~\=\*$,*,\.o$,\.obj$,\.info$,\.swp$,\.bak$,\~$
" Quick Help: <F1>:help -:go up dir D:delete R:rename s:sort-by x:exec
" ============================================================================
../
an4.test-1-1.log
an4.train-1-1.log
.swp
~
Then I saw that the documentation mentions a similar issue in the troubleshooting section.
I'd like to understand what's going on, because I've never trained a model before.
"Did you skip this step?"
What step? The creation of the mfc files? Isn't sphinxtrain supposed to do this step? Should I create those mfc files with sphinx_fe?
"The training process expects a feature file to be there, and it isn't."
This makes me guess that the creation of the mfc files failed at step "MODULE: 000".
Another guess is that I have some installation/configuration problem. My Python version is 2.7.3, and I'm on Debian Linux.
What did I do wrong?
Last edit: Nickolay V. Shmyrev 2015-03-18
Hi Nickolay, I downloaded "NIST's Sphere audio (.sph) format (64 M)", repeated the procedure (/opt/sphinxtrain/bin/./sphinxtrain -t an4 setup) in the new folder,
and set
$CFG_WAVFILE_EXTENSION = 'sph';
$CFG_WAVFILE_TYPE = 'nist'; # one of nist, mswav, raw
When I run it, I encounter exactly the same errors. Any idea what it might be?
I noticed that the feature extraction step takes 0 seconds to progress. From my understanding, this means that no features are being extracted (and no mfc files are being created), but on the other hand I also see "Feature extraction is done", which implies that there were no errors.
If I go to the logdir I see one folder, "000.comp_feat". If I open it I see two files: "an4.test-1-1.log" and "an4.train-1-1.log".
If I open an4.train-1-1.log I see:
Fri Mar 20 14:54:25 2015
Fri Mar 20 14:54:25 2015
(which is the time at the moment of setup or run)
As an additional note, if I look into the newly created folder feat/, I can see that the directory structure is created (the same directory structure as in wav/), but the folders under feat/ contain no files.
EDIT 999: I tried again with the latest sphinxtrain from github, same errors.
Any suggestions?
Last edit: Orest 2015-03-20
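For anyone hitting the same symptom (feat/ has the directory tree but no files), here is a minimal sketch of a check that lists which utterances have audio but no feature file. It is not part of sphinxtrain. The function names are made up for illustration, and the folder names (wav/, feat/) and extensions (sph, mfc) are assumptions taken from this thread, so adjust them to your own sphinx_train.cfg.

```python
import os


def list_stems(root, extension):
    """Return relative paths (without extension) of files under root
    whose names end in '.<extension>'."""
    stems = set()
    suffix = "." + extension
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                stems.add(rel[: -len(suffix)])
    return stems


def missing_features(wav_root, feat_root, wav_ext="sph", feat_ext="mfc"):
    """Return the sorted utterances that have an audio file but no feature file."""
    return sorted(list_stems(wav_root, wav_ext) - list_stems(feat_root, feat_ext))


if __name__ == "__main__":
    # Assumed layout: run from the database folder, e.g. /home/osota/an4
    missing = missing_features("wav", "feat")
    print("%d utterances have no feature file" % len(missing))
    for stem in missing[:10]:
        print("  " + stem)
```

If every utterance shows up as missing, feature extraction never produced output at all, which points at the extraction step itself rather than at individual audio files.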
This log means you do not have the sphinx_fe binary installed in your path, or it fails to run. You need to make sure you installed sphinxbase properly, in particular the LD_LIBRARY_PATH part.
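A quick way to check both points from the reply above is sketched below. It assumes Python 3.3 or later (shutil.which does not exist in Python 2.7), and the helper names find_tool and library_dirs are hypothetical, not part of sphinxbase or sphinxtrain. It only reports what the environment looks like. It does not prove that sphinx_fe actually runs.

```python
import os
import shutil


def find_tool(name="sphinx_fe", search_path=None):
    """Return the full path of an executable found on the search path, or None.
    With search_path=None, the PATH environment variable is used."""
    return shutil.which(name, path=search_path)


def library_dirs(env=None):
    """Return the non-empty entries of LD_LIBRARY_PATH as a list."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]


if __name__ == "__main__":
    print("sphinx_fe:", find_tool() or "not found on PATH")
    print("LD_LIBRARY_PATH entries:", library_dirs())
```

If sphinx_fe is found but the library directory where sphinxbase was installed is not among the LD_LIBRARY_PATH entries, running sphinx_fe by hand in a terminal should show the loader error directly.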
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: -0.000202564102564107 WARNING: Not enough data for the training
What should I do about this warning, Mr. Nickolay? My training data can't reach 1 hour.
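A negative estimate is implausible for any real dataset, so it may be worth checking independently how much audio is actually on disk. The sketch below is not how sphinxtrain computes its estimate. It simply sums the sizes of headerless PCM files and converts bytes to hours, assuming 16 kHz, 16-bit mono audio with a .raw extension (all of these are assumptions, so change the parameters to match your recordings). The function name raw_audio_hours is made up for illustration.

```python
import os


def raw_audio_hours(root, extension="raw", sample_rate=16000, bytes_per_sample=2):
    """Estimate the hours of mono PCM audio under root from file sizes alone.
    Only valid for headerless files; headers would inflate the estimate slightly."""
    total_bytes = 0
    suffix = "." + extension
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    seconds = total_bytes / float(sample_rate * bytes_per_sample)
    return seconds / 3600.0


if __name__ == "__main__":
    # Assumed layout: audio lives under wav/ in the current folder
    print("Approximate hours of audio: %.3f" % raw_audio_hours("wav"))
```

If this reports a sensible positive number while the training log still shows a negative one, the audio itself is probably fine and the problem is more likely in the feature files that the estimate is based on.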
You need to download NIST's Sphere audio (.sph) format (64 M). Extension must be sph, file type nist.
OH WOW. You are a savior good sir. Had the same problem and was confused for the longest time.
kindly plz contact me at ijazulhassan13@gmail.com, I need your help
I have the following error. Kindly point me in the right direction; everything else is working except these errors!
Last edit: ijazulhassan 2020-07-18
my speech db is below
Last edit: ijazulhassan 2020-07-20