I have obtained single-channel files from stereo data with 8-bit sample encoding. However, the number of samples in the original stereo files was odd, and so I get the "cannot read multiple data chunks" error during MFCC feature extraction. [From my reading about the .wav format, the data chunk must always have an even number of bytes, so I think I should just add a sample or delete one of the samples, but I could not find a sox command to do that.]
The riff_chunk_size is 1 byte more than riff_chunk_read + data_chunk_size.
ERROR (compute-mfcc-feats:Read():wave-reader.cc:224) Expected 1468838 bytes in RIFF chunk, but after first data block there will be 36 + 1468801 bytes (we do not support reading multiple data chunks).
WARNING (compute-mfcc-feats:Read():feat/wave-reader.h:149) Exception caught in WaveHolder object (reading).
WARNING (compute-mfcc-feats:LoadCurrent():util/kaldi-table-inl.h:233) TableReader: failed to load object from /home/rpawar7/kaldi-trunk/SID/Audio/train/data/short2/tduva_A.wav
Input File : 'tduva_A.wav'
Channels : 1
Sample Rate : 8000
Precision : 8-bit
Duration : 00:03:03.60 = 1468801 samples ~ 13770 CDDA sectors
File Size : 1.47M
Bit Rate : 64.0k
Sample Encoding: 8-bit Unsigned Integer PCM
Is there any way I can sort this out using sox?
Last edit: Rahul Shivaji Pawar 2015-07-21
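The numbers in the error message above can be checked by hand. The sketch below assumes the RIFF convention that a chunk with an odd payload is followed by one pad byte, and that the writer counted that pad byte in the RIFF size. The helper name `padded_size` is made up for illustration; the three figures come from the log and the sox report.

```python
def padded_size(n_bytes: int) -> int:
    """Bytes a chunk payload occupies on disk, assuming odd payloads get one pad byte."""
    return n_bytes + (n_bytes % 2)

data_bytes = 1468801          # 8-bit mono, so one byte per sample (sox: 1468801 samples)
bytes_before_data = 36        # the "36" quoted in the Kaldi error message
declared_riff_size = 1468838  # "Expected 1468838 bytes in RIFF chunk"

without_pad = bytes_before_data + data_bytes
with_pad = bytes_before_data + padded_size(data_bytes)
print(without_pad, with_pad, declared_riff_size)  # prints: 1468837 1468838 1468838
```

Under that assumption the declared size matches exactly once the pad byte is counted, which would explain the one-byte mismatch.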
Hmmm... And just
sox tduva_A.wav -c 1 -r 8000 -t wav output.wav
won't help? I think I managed to "normalize" a wav once by doing this.
y.
That does not help. The number of data bytes still remains odd because the number of samples is odd.
I tried extracting the MFCC features, and the problem remained the same.
Last edit: Rahul Shivaji Pawar 2015-07-21
OK, but the odd number of samples should not matter to Kaldi.
If the wav has a reasonable size, can you send that wav to me?
jtrmal@gmail.com
y.
An example of a wav file is attached.
Last edit: Rahul Shivaji Pawar 2015-07-21
Rahul, I think it's just because your wav is slightly weird. Not in the
sense of the number of samples, but in that the header specifies 8kHz
8-bit linear encoding. If you recode it to 16-bit samples, it works:
wav-to-duration scp:<(echo 'B sox tjvyo_B.wav -r 8000 -b 16 -t wav -|') ark,t:-
Not sure if you can read that command, but this is how the wav.scp should
look:
ID_B /usr/bin/sox /absolute/path/to/wavs/tjvyo_B.wav -r 8000 -b 16 -t wav -|
Also, you should make sure that the header is not actually damaged -- is it
really 8-bit linear and not 8-bit A/mu-law?
Listen to the sox output to verify that the audio is not scrambled.
y.
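One way to check whether the header really claims 8-bit linear PCM rather than A-law or mu-law is to read the format tag from the fmt chunk. This is only a sketch: it assumes a canonical layout where "fmt " is the first chunk after "WAVE", and the function name `wav_format` is hypothetical. The tag values 1, 6 and 7 are the standard WAVE codes for PCM, A-law and mu-law.

```python
import struct

# Standard WAVE format tags for the three encodings discussed here.
FORMAT_NAMES = {1: "linear PCM", 6: "A-law", 7: "mu-law"}

def wav_format(header: bytes):
    """Return (format_name, channels, sample_rate, bits_per_sample).

    Assumes the first 36 bytes of a canonical WAV file, i.e. the
    'fmt ' chunk immediately follows the 12-byte RIFF/WAVE preamble.
    """
    if header[0:4] != b"RIFF" or header[8:12] != b"WAVE" or header[12:16] != b"fmt ":
        raise ValueError("not a canonical RIFF/WAVE header")
    # Fields start at offset 20, after the 4-byte fmt chunk size.
    tag, channels, rate, _byte_rate, _block_align, bits = struct.unpack_from(
        "<HHIIHH", header, 20
    )
    return FORMAT_NAMES.get(tag, f"unknown ({tag})"), channels, rate, bits

# Synthetic header matching the sox report above: mono, 8000 Hz, 8-bit PCM.
example = (
    b"RIFF" + struct.pack("<I", 36) + b"WAVE" + b"fmt "
    + struct.pack("<IHHIIHH", 16, 1, 1, 8000, 8000, 1, 8)
)
print(wav_format(example))  # prints: ('linear PCM', 1, 8000, 8)
```

For a real file, something like `wav_format(open(path, "rb").read(36))` would apply, with the same caveat about the chunk order.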
Yenda, I think it might actually be a bug in our reading code, due to not
having good documentation to work from (AFAIK there is no "official"
documentation).
It comes from a check that the header+payload size equals the entire chunk
size.
From https://www.daubnet.com/en/file-format-riff
I got the following, that there can be 1-byte padding. I'll change the
code.
Dan
Basic File Format
The price for the flexibility of holding different types of data is a file
structure, that isn't easy to understand. A RIFF is - more or less - a
hierarchical structure. The 'directory entries' are defined by chunks.
Every chunk contains either data or a list of chunks. This document will
occasionally refer to the analogy of a file system, every chunk is either
a file or a subdirectory. All chunks have the same structure:
Name    | Size       | Description
ID      | 4 byte     | four ASCII character identifier, padded with ASCII 32 (space) if less than 4 characters
Size    | 4 byte     | size of Data
Data    | Size bytes | the 'payload'
unused  | 1 byte     | present, if size is odd
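The chunk layout quoted above (ID, size, payload, optional pad byte) can be walked with a few lines of code. This is a minimal sketch, not Kaldi's reader: `iter_chunks` is a hypothetical helper, and `body` is assumed to be the bytes that follow the 12-byte "RIFF", size, "WAVE" preamble.

```python
import struct

def iter_chunks(body: bytes):
    """Yield (chunk_id, payload) pairs, skipping the pad byte after odd-sized payloads."""
    pos = 0
    while pos + 8 <= len(body):
        chunk_id, size = struct.unpack_from("<4sI", body, pos)
        yield chunk_id, body[pos + 8 : pos + 8 + size]
        # Advance past the 8-byte chunk header, the payload, and one pad byte if size is odd.
        pos += 8 + size + (size % 2)

# Synthetic example: a 3-byte data chunk (odd, so padded) followed by a second chunk.
body = (
    b"data" + struct.pack("<I", 3) + b"\x80\x81\x82" + b"\x00"
    + b"LIST" + struct.pack("<I", 4) + b"INFO"
)
for chunk_id, payload in iter_chunks(body):
    print(chunk_id, len(payload))
# prints:
# b'data' 3
# b'LIST' 4
```

Without the `size % 2` term the walker would start reading the second chunk one byte early.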
OK, it's possible I got misled by the 8-bit linear. It's a bit of a rarity.
y.
It looks to me like the issue was that for some reason the riff_chunk_size
specified in the file was one byte larger than it should have been.
According to https://www.daubnet.com/en/file-format-riff, the number of
bytes will always be even (so I guess they padded with one). So maybe we
need to modify the reading-in code to not complain if there is a mismatch
of just one byte.
I wish there was a reasonable format for audio files. The WAV
specification feels like it was created by people who never went to
college, and I'm not aware of any proper documentation for it.
Dan
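A check that tolerates the one-byte mismatch might look like the sketch below. This is not the change that went into Kaldi's wave-reader.cc. The function `riff_size_consistent` and its parameters are hypothetical, and it only accepts the extra byte when the data size is odd, which is the case where RIFF padding applies.

```python
def riff_size_consistent(riff_size: int, bytes_before_data: int, data_size: int) -> bool:
    """Accept a RIFF size that matches exactly, or that is one larger for an odd data size."""
    exact = bytes_before_data + data_size
    if riff_size == exact:
        return True
    # One extra byte is acceptable only as the pad byte after an odd-sized payload.
    return data_size % 2 == 1 and riff_size == exact + 1

# The file from this thread: declared 1468838, with 36 + 1468801 bytes counted.
print(riff_size_consistent(1468838, 36, 1468801))  # prints: True
```

Accepting any size that is off by one would be looser than necessary, so the sketch ties the tolerance to the parity of the data size.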
I am currently running sox with --ignore-length, and this increases the number of samples by 1 (in the header).
I think this should solve the problem.
Last edit: Rahul Shivaji Pawar 2015-07-21
Glad it's working. It seems the forum has almost a 1 hr delay.
y.
BTW, in case anyone is getting these forum emails, please know that this
forum, like everything on SourceForge, is now deprecated.
Go to kaldi-asr.org and click on help! to get the latest info.
Dan