Kaldi / Discussion / Open Discussion: Multiple data chunks

Rahul Shivaji Pawar - 2015-07-21

I have obtained single-channel files from stereo data with 8-bit sample encoding. However, the number of samples in the original stereo files was odd, and hence I get the "we do not support reading multiple data chunks" error during MFCC feature extraction. [From my reading about the .wav format, the data chunk must always have an even number of bytes, so I think I should just add a sample or delete one of the samples, but I could not find a sox command to do that.]

The riff_chunk_size is 1 byte more than riff_chunk_read + data_chunk_size.

ERROR (compute-mfcc-feats:Read():wave-reader.cc:224) Expected 1468838 bytes in RIFF chunk, but after first data block there will be 36 + 1468801 bytes (we do not support reading multiple data chunks).
WARNING (compute-mfcc-feats:Read():feat/wave-reader.h:149) Exception caught in WaveHolder object (reading).
WARNING (compute-mfcc-feats:LoadCurrent():util/kaldi-table-inl.h:233) TableReader: failed to load object from /home/rpawar7/kaldi-trunk/SID/Audio/train/data/short2/tduva_A.wav

Input File : 'tduva_A.wav'
Channels : 1
Sample Rate : 8000
Precision : 8-bit
Duration : 00:03:03.60 = 1468801 samples ~ 13770 CDDA sectors
File Size : 1.47M
Bit Rate : 64.0k
Sample Encoding: 8-bit Unsigned Integer PCM

Is there any way I can sort this out using sox?

Last edit: Rahul Shivaji Pawar 2015-07-21

- Jan "yenda" Trmal - 2015-07-21
  
  Hmmm... And just
  sox tduva_A.wav -c 1 -r 8000 -t wav output.wav
  won't help? I think I managed to "normalize" a wav once by doing this.
  y.
  
  - Rahul Shivaji Pawar - 2015-07-21
    
    That does not help. The number of data bytes still remains odd because the number of samples is odd.
    
    I tried extracting the MFCC features; the problem remained the same.
    
    Last edit: Rahul Shivaji Pawar 2015-07-21
    
    - Jan "yenda" Trmal - 2015-07-21
      
      OK, but an odd number of samples should not matter to Kaldi.
      If the wav has a reasonable size, can you send it to me?
      jtrmal@gmail.com
      y.
      
      - Rahul Shivaji Pawar - 2015-07-21
        
        An example wav file is attached.
        
        Last edit: Rahul Shivaji Pawar 2015-07-21
        
        ttnxf_A.wav
        
        Jan "yenda" Trmal - 2015-07-21
        
        Rahul, I think it's just because your wav is slightly weird. Not in
        the sense of the number of samples, but in that the header specifies 8kHz
        8bit linear encoding. If you recode it to 16bit samples, it works:
        
        wav-to-duration scp:<(echo 'B sox tjvyo_B.wav -r 8000 -b 16 -t wav -|') ark,t:-
        
        Not sure if you can read that command, but this is how the wav.scp should
        look:
        ID_B /usr/bin/sox /absolute/path/to/wavs/tjvyo_B.wav -r 8000 -b 16 -t wav -|
        
        Also, you should make sure that the header is not actually damaged -- is it
        really 8bit linear and not 8bit A/mu-law?
        Listen to the sox output to verify that the audio is not
        scrambled.
        y.
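
For a whole directory of recordings, wav.scp lines of the shape shown above can be generated rather than typed by hand. What follows is a minimal Python sketch, not part of Kaldi: the function name `wav_scp_line` and its default arguments are made up for illustration, and the sox flags are simply the ones from the command above.

```python
import shlex


def wav_scp_line(utt_id: str, wav_path: str, sox: str = "/usr/bin/sox",
                 rate: int = 8000, bits: int = 16) -> str:
    """Build one wav.scp entry that pipes the file through sox.

    Hypothetical helper: the trailing '|' marks the entry as a command whose
    standard output is read as the wav data.
    """
    # Quote the paths so that names containing spaces survive the shell.
    cmd = f"{shlex.quote(sox)} {shlex.quote(wav_path)} -r {rate} -b {bits} -t wav -"
    return f"{utt_id} {cmd}|"


if __name__ == "__main__":
    print(wav_scp_line("ID_B", "/absolute/path/to/wavs/tjvyo_B.wav"))
```

Absolute paths are used because the command is run from wherever the Kaldi binary is started, which may not be the directory holding the recordings.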
        
        Daniel Povey - 2015-07-21
        
        Yenda, I think it might actually be a bug in our reading code, due to not
        having good documentation to work from (AFAIK there is no "official"
        documentation).
        
        It comes from a check that the header+payload size equals the entire chunk
        size.
        From
        https://www.daubnet.com/en/file-format-riff
        I got the following: there can be 1 byte of padding. I'll change the
        code.
        Dan
        
        Basic File Format
        
        The price for the flexibility of holding different types
        of data is a file structure, that isn't easy to understand. A RIFF is -
        more or less - a hierarchical structure. The 'directory entries' are
        defined by chunks. Every chunk contains either data or a list of chunks.
        This document will occasionally refer to the analogy of a file system,
        every chunk is either a file or a subdirectory. All chunks have the same
        structure:
        
        Name    Size        Description
        ID      4 byte      four ASCII character identifier, padded with ASCII 32 (space) if less than 4 characters
        Size    4 byte      size of Data
        Data    Size bytes  the 'payload'
        unused  1 byte      present, if size is odd
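
The padding rule quoted above can be checked by walking the chunks of a file. The following is a minimal Python sketch, not Kaldi's reader: `list_riff_chunks` and `make_wav_8bit_mono` are hypothetical helper names, and the synthetic file assumes a plain 16-byte PCM `fmt ` chunk followed by a single `data` chunk. With 5 one-byte samples the RIFF size comes out as 36 + 5 + 1 = 42, the same shape as 36 + 1468801 + 1 = 1468838 in the error message.

```python
import struct


def list_riff_chunks(blob: bytes):
    """Return (riff_size, [(chunk_id, size, has_pad), ...]) for a RIFF/WAVE blob."""
    riff_id, riff_size, form = struct.unpack_from("<4sI4s", blob, 0)
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12  # sub-chunks start after 'RIFF', the 4-byte size, and 'WAVE'
    end = min(8 + riff_size, len(blob))
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", blob, pos)
        has_pad = size % 2 == 1  # an odd payload is followed by one unused byte
        chunks.append((chunk_id.decode("ascii"), size, has_pad))
        pos += 8 + size + (1 if has_pad else 0)
    return riff_size, chunks


def make_wav_8bit_mono(samples: bytes, rate: int = 8000) -> bytes:
    """Build a small 8-bit unsigned PCM mono WAV, adding the pad byte if needed."""
    # fmt fields: PCM tag, 1 channel, sample rate, byte rate, block align, bits
    fmt = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, 1, rate, rate, 1, 8)
    pad = b"\x00" if len(samples) % 2 else b""
    data = struct.pack("<4sI", b"data", len(samples)) + samples + pad
    body = b"WAVE" + fmt + data
    return struct.pack("<4sI", b"RIFF", len(body)) + body


if __name__ == "__main__":
    riff_size, chunks = list_riff_chunks(make_wav_8bit_mono(b"\x80" * 5))
    print(riff_size, chunks)
```

A reader that compares the RIFF size against the header bytes plus the `data` size alone will be off by one for every file like this, which is the mismatch reported at the top of the thread.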
        
        Jan "yenda" Trmal - 2015-07-21
        
        OK, it's possible I got misled by the 8bit-linear. It's a bit of a rarity.
        y.
        
        Daniel Povey - 2015-07-21
        
        It looks to me like the issue was that, for some reason, the riff_chunk_size
        specified in the file was one byte larger than it should have been.
        According to https://www.daubnet.com/en/file-format-riff, the number of
        bytes will always be even (so I guess they padded with one byte). So maybe we
        need to modify the reading-in code to not complain if there is a mismatch
        of just one byte.
        I wish there were a reasonable format for audio files. The WAV
        specification feels like it was created by people who never went to
        college, and I'm not aware of any proper documentation for it.
        
        Dan
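
The relaxed check described here could look roughly like the sketch below. It is an illustration in Python, not the actual change to wave-reader.cc: the function name is invented, and the decision to accept both a padded and an unpadded odd-sized data chunk is an assumption. The numbers in the usage lines come from the error message at the top of the thread.

```python
def riff_size_is_consistent(riff_chunk_size: int, bytes_before_data: int,
                            data_chunk_size: int) -> bool:
    """Check the RIFF size against header + payload, tolerating one pad byte.

    Hypothetical helper: a mismatch of exactly one byte is accepted only when
    the data payload is odd-sized, since that is when a pad byte is expected.
    """
    expected = bytes_before_data + data_chunk_size
    pad = data_chunk_size % 2  # 1 if the payload is odd, else 0
    return riff_chunk_size - expected in (0, pad)


if __name__ == "__main__":
    # Values from the reported error: 36 + 1468801 is one short of 1468838.
    print(riff_size_is_consistent(1468838, 36, 1468801))
    # An even-sized payload gets no tolerance, so this mismatch is rejected.
    print(riff_size_is_consistent(41, 36, 4))
```

Limiting the tolerance to odd payloads keeps the check strict for files that really do have extra content after the data chunk.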
        

Rahul Shivaji Pawar - 2015-07-21

I am currently running sox with --ignore-length, and this increases the number of samples by 1 (in the header).

I think this should solve the problem.
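
One plausible reading of the extra sample (an assumption, not something confirmed in this thread) is that the pad byte after the odd-sized data chunk ends up being counted as a sample. The Python sketch below produces the same end state directly, for 8-bit mono files only, where one byte is one sample. The function name `absorb_pad_byte` is hypothetical, and the pad byte is overwritten with 0x80, the mid-scale value of 8-bit unsigned PCM, so the added sample is silence.

```python
import struct


def absorb_pad_byte(blob: bytes) -> bytes:
    """Count the pad byte of an odd-sized 'data' chunk as one more sample.

    Sketch for 8-bit mono PCM only. The RIFF size is left untouched because
    a conforming file already includes the pad byte in it.
    """
    out = bytearray(blob)
    pos = 12  # skip 'RIFF', the 4-byte size, and 'WAVE'
    while pos + 8 <= len(out):
        chunk_id, size = struct.unpack_from("<4sI", out, pos)
        padded = size + (size % 2)
        if chunk_id == b"data":
            if size % 2 == 1 and pos + 8 + padded <= len(out):
                out[pos + 8 + size] = 0x80  # turn the pad byte into silence
                struct.pack_into("<I", out, pos + 4, size + 1)
            break
        pos += 8 + padded
    return bytes(out)
```

The file length does not change; only the data chunk's size field and the former pad byte are rewritten, so the data size becomes even.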

Last edit: Rahul Shivaji Pawar 2015-07-21

- Jan "yenda" Trmal - 2015-07-21
  
  Glad it's working. Seems the forum has almost a 1hr delay.
  y.
  
  - Daniel Povey - 2015-07-21
    
    BTW, in case anyone is getting these forum emails, please know that this
    forum, like everything on Sourceforge, is now deprecated.
    Go to kaldi-asr.org and click on help!, to get the latest info.
    Dan
    