I am trying to train my own acoustic model, and I found that wave2feat.exe accepts three kinds of format: nist, raw and mswav. But when I used the Windows 2000 Sound Recorder program, the generated wav file did not meet wave2feat's requirements. The error message was: MS WAV file not in 16-bit PCM format.
Then I tried on Windows XP, and it worked! So I think there are several variants of the wav file format. What is the exact format used by wave2feat?
And what is the "nist" format?
Sample rate is another point of confusion: the default sample rate of files recorded by Windows' Sound Recorder is not 16000. Do I have to convert them to a 16000 sample rate?
Anonymous
-
2006-05-11
I no longer work with Sphinx, so I can't answer your question exactly, but I can say that "MS WAV" is actually a very general format, permitting considerable variation in sample rate, number of channels, bits per sample, and signal representation, to name a few.
- "16-bit PCM format" means each sample is represented by a 16-bit twos complement number.
- The Windoze sound recorder may record in 8-bit format, and it may record in 2 channels (stereo). wave2feat may require 1 channel.
- Yes, the audio files for training an acoustic model should be sampled at 16 kHz. Your sound recorder app probably records at 44.1 kHz, but you may be able to change that setting.
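One way to see which of these properties a given recording has is to read its header. The following is only a minimal sketch using Python's standard `wave` module; the function names and the default targets (mono, 16-bit, 16 kHz) are assumptions taken from the points above, not something wave2feat itself provides.

```python
import wave


def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate) for a PCM WAV file."""
    # wave.open raises wave.Error if the file is not uncompressed PCM.
    with wave.open(path, "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth() * 8, wf.getframerate()


def meets_requirements(path, channels=1, bits=16, rate=16000):
    """True if the file matches the assumed targets (mono, 16-bit, 16 kHz)."""
    return describe_wav(path) == (channels, bits, rate)
```

Calling `describe_wav` on a file from the sound recorder shows at a glance whether the channel count, sample width, or sample rate is the part that differs.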
I hope that helps, even though I have not been able to provide all the specific information you requested. Good luck.
cheers,
jerry
Thank you very much. Great help.
I am working with Sphinx3 and have some sound files in mp3 format for training. I think converting them to raw format files would be better, but I didn't find any tools to do this. So I think I have to convert them to wave files with one channel, a 16000 sample rate and 16-bit PCM to make wave2feat happy.
^_^
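On raw versus wav: a raw file is just the sample data with no header, so one can be produced from a 16-bit PCM wav file by copying the samples out. The sketch below does that with Python's standard library and averages the channels if the input is stereo. The function name is hypothetical, the output byte order (little-endian, as in the wav data) is an assumption about what the reader of the raw file expects, and no resampling is done, so the sample rate is only reported back.

```python
import array
import sys
import wave


def wav_to_raw_mono(wav_path, raw_path):
    """Write a 16-bit PCM WAV file as headerless mono PCM; return its sample rate."""
    with wave.open(wav_path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        channels = wf.getnchannels()
        rate = wf.getframerate()
        samples = array.array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if channels > 1:
        # Average the samples of each frame to downmix to one channel.
        mono = array.array("h", (
            sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples), channels)
        ))
    else:
        mono = samples
    if sys.byteorder == "big":
        mono.byteswap()  # write little-endian output (an assumption)
    with open(raw_path, "wb") as f:
        f.write(mono.tobytes())
    return rate
```

If the returned rate is not 16000, the audio still needs to be resampled with a separate tool.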
You may find the sound utility sox ( http://sox.sourceforge.net ) to be of some help in converting from a sound file in common formats to other formats.
Also, I've used lame ( http://lame.sourceforget.net ) to do encoding/decoding of MP3 files. The command line arguments are somewhat daunting, but it's possible to google around for some cookbook-type recipes that you can tinker with.
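As a starting point for such a recipe, here is a rough sketch that decodes an MP3 with lame and then converts the result to mono, 16 kHz, 16-bit with sox. It assumes both programs are installed and on the PATH, and it uses the option names found in recent releases (`--decode` for lame; `-r`, `-c`, `-b` for sox output format); older builds may spell these differently. The function names and the intermediate file argument are hypothetical.

```python
import subprocess


def lame_decode_cmd(mp3_path, wav_path):
    """Command line that asks lame to decode an MP3 file to WAV."""
    return ["lame", "--decode", mp3_path, wav_path]


def sox_convert_cmd(in_path, out_path, rate=16000, channels=1, bits=16):
    """Command line that asks sox to rewrite a file at the given rate/channels/bits."""
    # Options placed before the output file name apply to the output.
    return ["sox", in_path,
            "-r", str(rate), "-c", str(channels), "-b", str(bits),
            out_path]


def mp3_to_training_wav(mp3_path, tmp_wav, out_wav):
    """Decode with lame, then convert with sox; raises if either tool fails."""
    subprocess.run(lame_decode_cmd(mp3_path, tmp_wav), check=True)
    subprocess.run(sox_convert_cmd(tmp_wav, out_wav), check=True)
```

Building the command lists separately makes it easy to print them and try them by hand before running a whole directory of files through them.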
Thank you, Eric. I will try it.
I forgot to add that NIST format is one designed by the speech group at the U.S. National Institute of Standards and Technology. See references to SPHERE at http://www.nist.gov/speech/tools/index.htm , and also http://ftp.cwi.nl/audio/NIST-SPHERE .
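For a feel of what a SPHERE file looks like: as far as I understand the layout, it starts with a plain-text header (a `NIST_1A` line, a line giving the header size in bytes, then `name -type value` lines ending with `end_head`), followed by the sample data. The sketch below reads that header under those assumptions; the function name is hypothetical and the field names in the usage note are only examples of what such a header typically carries.

```python
def parse_sphere_header(data):
    """Parse the text header of a NIST SPHERE file given as bytes.

    Returns (header_size, fields). Assumes the layout described above.
    """
    first, size_line = data[:16].decode("ascii").split("\n")[:2]
    if first.strip() != "NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    header_size = int(size_line.strip())
    text = data[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, ftype, value = parts
        if ftype == "-i":      # integer field
            fields[name] = int(value)
        elif ftype == "-r":    # real-valued field
            fields[name] = float(value)
        else:                  # string field, e.g. -s3
            fields[name] = value
    return header_size, fields
```

Reading the first few kilobytes of a file and passing them in would give, for example, `fields["sample_rate"]` and `fields["channel_count"]` if the header defines them.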
Sorry, that's http://lame.sourceforge.net . Somehow my fingers want to continue typing after forge and type forget instead.