#339 trim nr samples different behavior from nr seconds

closed-invalid
nobody
None
5
2020-08-25
2020-08-25
Emile V
No

Hi,

As the title says; I think an example shows best what I mean.

When trimming by seconds, the result has the requested duration, as expected:
~~~
/tmp
▶ sox -n -r 16000 silence.wav trim 0 0.25

/tmp
▶ sox --i silence.wav

Input File : 'silence.wav'
Channels : 1
Sample Rate : 16000
Precision : 32-bit
Duration : 00:00:00.25 = 4000 samples ~ 18.75 CDDA sectors
File Size : 16.1k
Bit Rate : 515k
Sample Encoding: 32-bit Signed Integer PCM

~~~

When I instead give the number of samples taken from the example above, the duration is not what I expected (1333 samples instead of 4000).
I noticed that the result is relative to 48000 Hz: 4000 * 16000/48000 = 1333.333.
The same command works as expected with -r 48000.
Maybe I'm misunderstanding something, but I can't find anything in the manual that describes this behavior.

~~~

/tmp
▶ sox -n -r 16000 silence.wav trim 0s 4000s

/tmp
▶ sox --i silence.wav

Input File : 'silence.wav'
Channels : 1
Sample Rate : 16000
Precision : 32-bit
Duration : 00:00:00.08 = 1333 samples ~ 6.24844 CDDA sectors
File Size : 5.41k
Bit Rate : 520k
Sample Encoding: 32-bit Signed Integer PCM
~~~

Discussion

  • Emile V

    Emile V - 2020-08-25

    Apparently it works as expected when it is called in a different order.

    /tmp   
    ▶ sox -r 16000 -n silence.wav trim 0s 4000s
    
    /tmp   
    ▶ sox --i silence.wav                      
    
    Input File     : 'silence.wav'
    Channels       : 1
    Sample Rate    : 16000
    Precision      : 32-bit
    Duration       : 00:00:00.25 = 4000 samples ~ 18.75 CDDA sectors
    File Size      : 16.1k
    Bit Rate       : 515k
    Sample Encoding: 32-bit Signed Integer PCM
    

    Sorry, I guess I don't really understand things very well.
    Also, I don't really understand why using samples rather than seconds for the trim effect would make a difference in the original post's case.

     

    Last edit: Emile V 2020-08-25
  • Ulrich Klauer

    Ulrich Klauer - 2020-08-25

    Well, when a time is converted to a sample count by SoX, the current rate is automatically taken into consideration. When you specify a number of samples directly, you need to do that yourself.

    So, in your example from the first post, you start with a sampling rate of 48000 Hz, which is the default for the null file. Then you trim to 4000 samples, which is 1/12 of a second, before the audio is automatically converted to the specified output rate of 16000 Hz, where 1/12 of a second is approximately 1333 samples. All correct. In the second post, you already specify a sampling rate of 16000 Hz for the null input file, so no conversion takes place, and the trimming happens at 16000 Hz.

    You could consider using verbose mode (-V) in order to see what is actually happening and in what order. You can always manually specify the necessary converting effects in your preferred order (e.g. rate 16k trim 0 4000s).
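
The arithmetic behind the three results in this thread can be sketched in plain shell, without invoking SoX. The helper names `time_to_samples` and `convert_samples` are hypothetical and not part of SoX, and the integer division is an assumption that happens to reproduce the 1333 samples reported above; SoX's own rounding may differ in other cases.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the order of operations; not part of SoX.

# Sample count for a duration given in milliseconds at a given rate.
time_to_samples() {
    ms=$1 rate=$2
    echo $(( ms * rate / 1000 ))
}

# Sample count after resampling from one rate to another (truncating).
convert_samples() {
    count=$1 from_rate=$2 to_rate=$3
    echo $(( count * to_rate / from_rate ))
}

# "trim 0 0.25" with a 48000 Hz null input: the time is converted at the
# rate trim runs at, then the result is resampled to 16000 Hz.
at_48k=$(time_to_samples 250 48000)      # 12000 samples at 48000 Hz
convert_samples "$at_48k" 48000 16000    # 4000 samples at 16000 Hz

# "trim 0s 4000s" with a 48000 Hz null input: the count is taken as-is
# at 48000 Hz, then resampled to 16000 Hz.
convert_samples 4000 48000 16000         # 1333 samples at 16000 Hz

# "trim 0s 4000s" with a 16000 Hz null input: no rate change at all.
convert_samples 4000 16000 16000         # 4000 samples at 16000 Hz
```

A time-based trim stays at 0.25 s either way because the conversion to samples uses whatever rate the trim effect sees, while a literal sample count is only correct if it was chosen for that same rate.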

     
  • Ulrich Klauer

    Ulrich Klauer - 2020-08-25
    • status: open --> closed-invalid
     
  • Mans Rullgard

    Mans Rullgard - 2020-08-25

    Add the -V flag for verbose output:

    $ sox -V -n -r 16000 silence.wav trim 0 0.25
    sox:      SoX v14.4.2
    sox INFO nulfile: sample rate not specified; using 48000
    
    Input File     : '' (null)
    Channels       : 1
    Sample Rate    : 48000
    Precision      : 32-bit
    
    
    Output File    : 'silence.wav'
    Channels       : 1
    Sample Rate    : 16000
    Precision      : 32-bit
    Sample Encoding: 32-bit Signed Integer PCM
    Endian Type    : little
    Reverse Nibbles: no
    Reverse Bits   : no
    Comment        : 'Processed by SoX'
    
    sox INFO sox: effects chain: input        48000Hz  1 channels
    sox INFO sox: effects chain: trim         48000Hz  1 channels
    sox INFO sox: effects chain: rate         16000Hz  1 channels
    sox INFO sox: effects chain: output       16000Hz  1 channels
    

    Compare with this, your second example:

    $ sox -V -r 16000 -n silence.wav trim 0 4000s
    sox:      SoX v14.4.2
    
    Input File     : '' (null)
    Channels       : 1
    Sample Rate    : 16000
    Precision      : 32-bit
    
    sox INFO sox: Overwriting `silence.wav'
    
    Output File    : 'silence.wav'
    Channels       : 1
    Sample Rate    : 16000
    Precision      : 32-bit
    Sample Encoding: 32-bit Signed Integer PCM
    Endian Type    : little
    Reverse Nibbles: no
    Reverse Bits   : no
    Comment        : 'Processed by SoX'
    
    sox INFO sox: effects chain: input        16000Hz  1 channels
    sox INFO sox: effects chain: trim         16000Hz  1 channels
    sox INFO sox: effects chain: output       16000Hz  1 channels
    

    In the first example, the null input (-n) has the default 48 kHz sample rate. The trim effect is applied to this. Finally, the output from the trim is resampled to the output rate of 16 kHz. In the second example, the input is set to 16 kHz sample rate, and the output then defaults to the same rate.

    The trim effect is operating on different sample rates in the two cases, which explains the difference.

     
  • Emile V

    Emile V - 2020-08-25

    Thanks for the clarification.

     
