
#184 Pipe rewind code fails when autodetecting format from a pipe

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2020-08-21
Created: 2011-07-30
Creator: Rogier
Private: No

Hi,

I am running into the following problem:

When sox is reading from a pipe, and tries to autodetect
the file format, the pipe rewind code may fail.

Example:
The following command feeds a sound file to sox in 2 parts, with
a delay between the parts. The first part is 32 bytes.
    sox --null -t sox - trim 0 1:0 2> /dev/null \
      | ( dd bs=32 count=1; sleep 2; dd bs=4k ) 2> /dev/null \
      | sox -V3 - --null
The output is:
sox: SoX v14.3.2
sox INFO formats: detected file format type `sox'
sox FAIL formats: can't open input `-': can't find sox file format identifier
Obviously, it shouldn't fail. For the record, the following command
(the same as before, but with an extra `dd obs=4k' stage as a workaround):
    sox --null -t sox - trim 0 1:0 2> /dev/null \
      | ( dd bs=32 count=1; sleep 2; dd bs=4k ) 2> /dev/null \
      | dd obs=4k 2> /dev/null \
      | sox -V3 - --null
has the expected output:
sox: SoX v14.3.2
sox INFO formats: detected file format type `sox'

Input File : '-' (sox)
Channels : 1
Sample Rate : 48000
Precision : 32-bit
Sample Encoding: 32-bit Signed Integer PCM
Endian Type : little
Reverse Nibbles: no
Reverse Bits : no
Comment : 'Processed by SoX'

Output File : '' (null)
Channels : 1
Sample Rate : 48000
Precision : 32-bit

sox INFO sox: effects chain: input 48000Hz 1 channels
sox INFO sox: effects chain: output 48000Hz 1 channels

I am running sox v14.3.2 on debian testing.

The cause of the failure is that fread performs (needs to perform) two
read system calls, with the data from the second read overwriting the
data from the first in its internal buffers. The pipe rewind code silently
fails as a result, and sox becomes confused.
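
To make the failure mode concrete, here is a minimal toy model of a buffered
stream. It is not glibc's or sox's code; every name in it (toy_stream,
toy_refill, toy_read, toy_rewind, toy_autodetect_ok) is hypothetical, and it
assumes a stdio-like implementation that refills its buffer from the base
once the buffer has been drained. When the header arrives in two chunks, the
second refill overwrites the first, so rewinding to the buffer base recovers
fewer bytes than were consumed.

```c
#include <stddef.h>
#include <string.h>

#define TOY_BUFSIZ 64

/* Toy stand-in for a FILE reading from a pipe (illustration only). */
struct toy_stream {
    const char *src;       /* bytes not yet delivered by the "pipe" */
    size_t src_left;
    const size_t *chunks;  /* sizes returned by successive simulated reads */
    size_t nchunks;
    size_t next_chunk;
    char buf[TOY_BUFSIZ];
    size_t pos;            /* read offset from the buffer base */
    size_t len;            /* number of valid bytes in buf */
};

/* Simulated read(2): one chunk lands at the buffer base, replacing old data. */
static size_t toy_refill(struct toy_stream *s)
{
    size_t n = 0;
    if (s->next_chunk < s->nchunks)
        n = s->chunks[s->next_chunk++];
    if (n > s->src_left) n = s->src_left;
    if (n > TOY_BUFSIZ) n = TOY_BUFSIZ;
    memcpy(s->buf, s->src, n);
    s->src += n;
    s->src_left -= n;
    s->pos = 0;
    s->len = n;
    return n;
}

/* Simulated fread: keeps refilling until `want` bytes were copied or EOF. */
static size_t toy_read(struct toy_stream *s, char *out, size_t want)
{
    size_t got = 0;
    while (got < want) {
        if (s->pos == s->len && toy_refill(s) == 0)
            break;
        size_t avail = s->len - s->pos;
        size_t take = (want - got < avail) ? want - got : avail;
        memcpy(out + got, s->buf + s->pos, take);
        s->pos += take;
        got += take;
    }
    return got;
}

/* Simulated pipe rewind: jump back to the base, report the distance moved. */
static size_t toy_rewind(struct toy_stream *s)
{
    size_t rewound = s->pos;
    s->pos = 0;
    return rewound;
}

/* Returns 1 if reading `detect` header bytes can be fully undone, else 0. */
static int toy_autodetect_ok(const char *data, size_t len,
                             const size_t *chunks, size_t nchunks,
                             size_t detect)
{
    struct toy_stream s;
    char header[TOY_BUFSIZ];

    memset(&s, 0, sizeof s);
    s.src = data;
    s.src_left = len;
    s.chunks = chunks;
    s.nchunks = nchunks;

    if (detect > sizeof header || toy_read(&s, header, detect) != detect)
        return 0;
    return toy_rewind(&s) == detect;
}
```

In this model, delivering 16 bytes in a single chunk and detecting on the
first 8 rewinds cleanly, while delivering the same bytes as chunks of 4 and
12 does not: the rewind distance comes out as 4 instead of 8.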

I suspect there is no easy fix for this, but the case could at least be
documented. The error message is also puzzling; it could be improved,
for instance by detecting this case and failing explicitly. For example:

----------------------------------------------------------------------
--- formats.c 2011-01-04 06:03:20.000000000 +0100
+++ formats.c.new 2011-07-26 16:52:18.000000000 +0200
@@ -392,16 +392,20 @@

/* Hack to rewind pipes (a small amount).
* Works by resetting the FILE buffer pointer */
-static void UNUSED rewind_pipe(FILE * fp)
+static size_t UNUSED rewind_pipe(FILE * fp)
{
+ size_t rewind = 0;
/* _FSTDIO is for Torek stdio (i.e. most BSD-derived libc's)
* In theory, we no longer need to check _NEWLIB_VERSION or __APPLE__ */
#if defined _FSTDIO || defined _NEWLIB_VERSION || defined __APPLE__
+ rewind = AUTO_DETECT_SIZE;
fp->_p -= AUTO_DETECT_SIZE;
fp->_r += AUTO_DETECT_SIZE;
#elif defined __GLIBC__
+ rewind = fp->_IO_read_ptr - fp->_IO_read_base;
fp->_IO_read_ptr = fp->_IO_read_base;
#elif defined _MSC_VER || defined __MINGW_H || defined _ISO_STDIO_ISO_H
+ rewind = fp->_ptr - fp->_base;
fp->_ptr = fp->_base;
#else
/* To fix this #error, either simply remove the #error line and live without
@@ -411,6 +415,7 @@
#define NO_REWIND_PIPE
(void)fp;
#endif
+ return rewind;
}

static sox_format_t * open_read(
@@ -474,7 +479,11 @@
else if (!(ft->handler.flags & SOX_FILE_NOSTDIO) &&
input_bufsiz >= AUTO_DETECT_SIZE) {
filetype = auto_detect_format(ft, lsx_find_file_extension(path));
- rewind_pipe(ft->fp);
+ if (rewind_pipe(ft->fp) != AUTO_DETECT_SIZE) {
+ lsx_fail("autodetection garbled contents of input pipe `%s' "
+ "- make sure the first %d bytes can be read atomically", path, AUTO_DETECT_SIZE);
+ goto error;
+ }
ft->tell_off = 0;
}
#endif
----------------------------------------------------------------------

Note that the code for the BSD-derived (and other) libc's has a different
bug waiting to happen: it assumes it can rewind AUTO_DETECT_SIZE bytes, which
may not be possible. From a BSD stdio.h I found on the internet, I suspect the
correct code may read something like:
rewind = fp->_p - fp->_bf._base;
fp->_r += fp->_p - fp->_bf._base;
fp->_p = fp->_bf._base;
But I have no way of testing...

Kind regards,

Rogier.

Discussion

  • Rogier

    Rogier - 2011-08-02

    I looked a bit further. On Linux, and probably on BSD as well (the include file I checked was from FreeBSD), libc supports pushing back many more bytes with ungetc() than just one. As far as I can see, it even uses an additional buffer if necessary. I suppose that, for the platforms that support it, using ungetc would therefore be a better-defined and more dependable implementation of pipe-rewind than the hack that is being used now...
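
    A minimal sketch of what an ungetc-based rewind could look like follows. The helper name push_back_bytes is hypothetical and not part of sox. ISO C only guarantees one byte of pushback, so this relies on the libc accepting more, as described above; the caller would have to treat a short return value as a failure.

```c
#include <stdio.h>

/* Push the n bytes in buf back onto fp so that the next read returns them
 * again in their original order. Bytes are pushed last-to-first because
 * ungetc() hands pushed-back bytes out in reverse order of the calls.
 * Returns the number of bytes actually pushed back; anything less than n
 * means the libc refused further pushback and the stream is not rewound. */
static size_t push_back_bytes(FILE *fp, const unsigned char *buf, size_t n)
{
    size_t done = 0;
    while (done < n) {
        if (ungetc(buf[n - 1 - done], fp) == EOF)
            break;
        done++;
    }
    return done;
}
```

    The idea would be to keep the bytes that format autodetection read, then call push_back_bytes(fp, header, header_len) and fail the open if the result differs from header_len.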

     
  • Jan Starý

    Jan Starý - 2020-08-21

    I can confirm that the same problem still exists in current git.
    On macOS 10.15.6, the error reported is the same.
    On OpenBSD 6.7-current, sox segfaults.
    The workaround works on both.

     
