From: Vassil P. <vas...@gm...> - 2015-07-06 10:14:49
Hi Neil,

yes, I guess there might be tools, or combinations thereof, that could produce even better results. This is one of the raisons d'être for the "original-mp3" archive. It should contain enough metadata to allow re-extraction of the aligned utterances, possibly using different tools (it also contains 15-20% additional audio that was discarded in order to make LibriSpeech more balanced). It seems to me, however, that the audio quality of the current corpus is OK.

Thanks for mentioning these audio analysis tools, I wasn't aware of some of them. I've found WaveSurfer to be pretty useful too.

Vassil

On Sun, Jul 5, 2015 at 8:36 PM, Neil Nelson <nn...@in...> wrote:
> Vassil,
>
> I have always used LAME (Ubuntu Software Center is your friend) to
> convert between wav and mp3. It is well regarded. I suggest SoX for
> down-sampling. Spek will give a spectral analysis picture for the entire
> file. Audacity will give an ongoing spectral analysis but it is not
> frequency labeled. Sonic Visualiser may have something. Upgrading to
> Ubuntu 14.04 can be tricky in spots but something to consider since the
> versions of GCC and all the software tend to be limited to the OS rev.
>
> Neil
>
> On 07/05/2015 02:05 AM, Vassil Panayotov wrote:
> > BTW, when preparing LibriSpeech, I've noticed that the quality of MP3
> > conversion can vary substantially, depending on the particular tool used.
> > For example the output of mpg123 (or maybe it was mpg321) was very noisy
> > and the ASR WER was 10-15% absolute higher than when alternative MP3
> > decoders were used. When converting to 16kHz .wav ffmpeg cuts off the
> > frequencies higher than 7kHz. So eventually I settled for mplayer. It
> > preserves the frequency content in the 7-8kHz range and as far as I could
> > tell the audio sounded a bit "closer" to the original recording, although
> > I'm not sure if there is any measurable difference in ASR performance b/w
> > ffmpeg and mplayer produced .wav-s. The versions of the tools I've tried
> > were those shipped with Ubuntu 10.04 and 12.04, so the issues may be fixed
> > in the more recent releases.
> >
> > Vassil
>
> --
> RSA public key for this email address at http://pgp.mit.edu/
>
> _______________________________________________
> Kaldi-users mailing list
> Kal...@li...
> https://lists.sourceforge.net/lists/listinfo/kaldi-users
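
The 7-8kHz observation in the quoted message could be checked numerically rather than by ear. Below is a minimal Python sketch, assuming the decoders were asked for 16kHz, 16-bit mono .wav output. The function names (read_mono_16bit, band_energy_fraction) are made up for illustration and are not part of any tool mentioned in this thread. It uses a naive DFT over a short frame, so it is only meant for spot checks, not for processing a whole corpus.

```python
import math
import struct
import wave


def read_mono_16bit(path, max_samples=1024):
    """Read up to max_samples from a 16-bit mono WAV; return (samples, rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono WAV")
        rate = wav.getframerate()
        raw = wav.readframes(max_samples)
    count = len(raw) // 2
    # WAV PCM data is little-endian signed 16-bit.
    return list(struct.unpack("<%dh" % count, raw)), rate


def band_energy_fraction(samples, rate, lo_hz, hi_hz):
    """Fraction of spectral energy in [lo_hz, hi_hz), via a naive O(n^2) DFT."""
    n = len(samples)
    total = 0.0
    band = 0.0
    for k in range(1, n // 2):  # skip DC, stay below Nyquist
        freq = k * rate / n
        re = sum(s * math.cos(2 * math.pi * k * i / n)
                 for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n)
                 for i, s in enumerate(samples))
        power = re * re + im * im
        total += power
        if lo_hz <= freq < hi_hz:
            band += power
    return band / total if total else 0.0
```

To compare two decoders, one would run both functions on the same utterance as decoded by each tool (file names here are hypothetical, e.g. utt_ffmpeg.wav and utt_mplayer.wav) and look at band_energy_fraction(samples, rate, 7000, 8000). A value near zero for one file but not the other would be consistent with a low-pass cut-off around 7kHz. Since read_mono_16bit only takes the first frame of the file, the frame should be picked from a stretch that actually contains speech.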