Hi,
I have a problem reading the feature files.
When I run the bw executable for CD-untied training, I get this error:
utt> 0 0BF1SET0Header size field: 990183424(3b050000); filesize: 5375(000014ff)
ERROR: "corpus.c", line 1513: MFCC read failed. Retrying after sleep...
The problem is that I don't know why it can't read the feature file.
This is the command line I used to generate the feature files. I used the TI46 corpus (sampling rate = 12500 Hz):
"bin/wave2feat -c etc/eset.fileids -raw -di wav -ei wav -do feat -eo feat -alpha 0.97 -srate 12500 -frate 100 -wlen 0.0256 -nfft 512 -nfilt 40 -lowerf 130 -upperf 6800 -ncep 13"
Can anyone here tell me what my mistake is?
Thanks.
The key lies in the message:
Header size field: 990183424(3b050000); filesize: 5375(000014ff)
A feature file with 13 cepstra per frame should have a file length of 4*13*n + 4 bytes, where n is the number of frames in the file. (That's 4 bytes per float times 13 floats per frame times the number of frames, plus 4 bytes of header.) The size of your file (5375 bytes) just isn't possible for a feature file.
So either you are specifying the wrong file, or it somehow got truncated.
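The arithmetic above can be checked directly against the numbers in the error message. The sketch below assumes that the 4-byte header stores the count of 32-bit floats that follow it, and that the value was printed in the opposite byte order from the one the file was written in; both are assumptions, and the helper names are made up for illustration.

```python
import struct

NCEP = 13            # cepstra per frame (from -ncep 13)
BYTES_PER_FLOAT = 4
HEADER_BYTES = 4

def swap32(value):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]

def expected_size(n_floats):
    """File size implied by a header that counts n_floats floats (assumed layout)."""
    return HEADER_BYTES + BYTES_PER_FLOAT * n_floats

header_as_read = 990183424   # 0x3b050000, taken from the error message
actual_size = 5375           # taken from the error message

n_floats = swap32(header_as_read)               # 0x0000053b
frames, remainder = divmod(n_floats, NCEP)      # whole frames, leftover floats
extra = actual_size - expected_size(n_floats)   # bytes that should not be there

print(n_floats, frames, remainder, expected_size(n_floats), extra)
```

Under those assumptions the byte-swapped header describes a whole number of 13-float frames, and the file on disk is a few bytes longer than the header implies. That suggests something changed the file's bytes after or while it was written, rather than a plain truncation.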
I have checked the .wav files, the control file, and the command line I used. They are all fine, just as they were when I used them before.
Still, every time I run the wave2feat command, the feature files come out with the wrong size.
Could this be happening because something is wrong with my OS?
Can I see your wave2feat command line for comparison?
I didn't use wave2feat to generate my cepstral files; I used Sphinx2 (which supports only 16kHz sampling rate).
Are you running the training programs under Cygwin? If so, try running them in a regular XP command window. Maybe you are getting tripped up by Cygwin's attempt to compensate for Windows text file idiosyncrasies.
The TI46 corpus uses a sampling rate of 12500 Hz (12.5 kHz). I changed only that argument and left the rest at their default values, but the same problem still occurs.
I tried to run the program from a DOS prompt, but I got an error because I don't know the correct command to run it there.
Before this, all I had to do was switch the input format between -raw and -nist on the command line, and the feature files worked fine with the training executables.
This time, though, I really can't see what is causing the problem. I suspect my Cygwin installation is broken, but that is just a guess.
Instead of reinstalling it, is there any way to refresh, rebuild, or repair Cygwin?
I'm not familiar with the TI46 data corpus, but according to http://wave.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S9,
it is in "nist" format. If that is so, you should be using -nist, not -raw. Specifying the wrong input format will certainly cause things to go wrong. (Don't expect a nice message like "The audio file is not in the specified format.")
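Since the tool will not report a format mismatch, one quick way to tell the two cases apart is to look at the first bytes of an audio file. A minimal sketch, relying on the fact that NIST SPHERE files begin with the ASCII tag "NIST_1A"; the function names and the path in the usage note are hypothetical.

```python
def looks_like_nist_sphere(first_bytes: bytes) -> bool:
    """True if the data starts with the NIST SPHERE magic string."""
    return first_bytes.startswith(b"NIST_1A")

def sniff_audio_file(path: str) -> str:
    """Suggest which wave2feat input flag matches the file at `path`."""
    # Open in binary mode so no newline translation touches the bytes.
    with open(path, "rb") as f:
        head = f.read(16)
    return "-nist" if looks_like_nist_sphere(head) else "-raw (or another format)"

# Self-check on in-memory data instead of a real corpus file:
sphere_head = b"NIST_1A\n   1024\n"
headerless_head = bytes([0x12, 0x00, 0xF3, 0xFF, 0x07, 0x00])
print(looks_like_nist_sphere(sphere_head), looks_like_nist_sphere(headerless_head))
```

For example, `sniff_audio_file("wav/some_utterance.wav")` would return "-nist" for a SPHERE file. A file that fails the check is not necessarily headerless raw audio, so treat the second answer only as a hint.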
Yes, I know that TI46 is in NIST format, and I have used -nist on the command line, but the program can't read those files either. It is the same as when I use -raw: I always get a file size that doesn't fit the formula.
As you mentioned before, the key is the file size. If I can get a file size that is 4 bytes of header plus a multiple of 52 (4*13), then the 'MFCC read failed' error will be resolved.
I have just uninstalled Cygwin and will reinstall it. I just want to find out where the problem started.
After I reinstalled Cygwin, set the default text file type to DOS, and ran the wave2feat command again, all the generated feature files were fine and could be read successfully.
I suspect that when the option is set to Unix, there is some conflict between Unix-style text files and Windows.
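For anyone hitting the same symptom, here is a small sketch of how newline translation can change the size of a binary file. It only simulates a text-mode write in which every LF byte becomes CR LF; which Cygwin setting triggers such a translation in a given setup is not established in this thread, and the function name is made up for illustration.

```python
def text_mode_write(data: bytes) -> bytes:
    """Simulate a text-mode write that expands each LF (0x0A) into CR LF."""
    return data.replace(b"\n", b"\r\n")

# Stand-in for binary feature data: every byte value, repeated 15 times,
# so the payload contains exactly 15 bytes equal to 0x0A.
binary = bytes(range(256)) * 15
translated = text_mode_write(binary)

lf_count = binary.count(b"\n")
growth = len(translated) - len(binary)
print(lf_count, growth)
```

The file grows by one byte for each 0x0A byte it happened to contain, so the resulting size no longer matches 4*13*n + 4. Opening files in binary mode (for example "rb" and "wb" in Python) avoids this kind of translation.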