Menu

#1756 pipe out line ending problem on windows

None
closed-out-of-date
nobody
2016-07-20
2016-03-17
Anonymous
No

Using version 5.0.3 on windows, suppose that we have the following python3 program (test.py):

from sys import stdin
data = stdin.buffer.read(None)
open("test.png","wb").write(data)

This program simply reads from the standard input binary stream and writes what it sees into test.png. For example type sample.png | test.py effectively copies sample.png into test.png.

Now suppose that we run the following commands in gnuplot

set term png
set output "| test.py"
plot sin(x)
set output

this will create a plot and pipe it through that python program. However, the resulting png is unreadable. After some investigation, I discovered the problem is that the file that is written out using the normal output methods (set output sample.png) differs from the piped version in that every occurance of "\n" in the good version is replaced by "\r\n" in the bad version. In other words, every newline in the good version is replaced by the windows newline in the bad version.

I don't know if the fault lies with Windows or with gnuplot here. I suspect that gnuplot is translating the newlines itself. This is exactly what should be happening for text output, but is not desireable for binary output. It is possible to replace the bad newlines with good newlines with another process (set output "| fixnewlines.py | test.py" where fixnewlines simply replaces "\r\n" with "\n"), so there is a workaround, but it is a bit of a hack.

Obviously, this example isn't particularly useful. There is no reason to pipe through a program that simply writes the results to disc, but similar scenarios exist where we may be piping through a conversion program, a server, or any number of things. This example is useful for illustrating the problem, however.

There should be a way to turn off the newline behavior in this case - possibly a command like set lineendings unix where the normal windows behavior could be restored with set lineendings windows. There may be such an option but if there is, I am unaware of it.

Related

Bugs: #1756

Discussion

<< < 1 2 3 > >> (Page 2 of 3)
  • Tatsuro MATSUOKA

    Thank you for C codes.

    I have generated four 'test.png' files by

    1. unpatched gnuplot.exe on windows
    2. patched gnuplot.exe on windows
    3. gnuplot.exe on Cygwin_x86
    4. gnuplot on Ubutu 14.04

    All tests are done by the development version.
    I observed four files by a binary editor to see HEX code.

    Case 1
    Many "0D 0A" codes exist. This is what you said.

    Case 2
    Only one "0D 0A" code exist. The "0D 0A" codes found in the Case 1 is replaced by "0D".

    Case 3
    The situation is smilar to the Case 2.

    Case 4
    The situation is smilar to the Case 2.

    File size is slightly different between the Case 2, the Case 3 and Case 4 but I think it is not essential.

    Thus the patch is effective to suppress traslattion of "0D" codes to "0D 0A" codes.

    The problem is that this break fake pipe on wgnuplot.exe.
    To overcome the wgnuplot issue, Bastian contribution is highly desired.
    I will send e-mail to alert this ticket to him.

     

    Last edit: Tatsuro MATSUOKA 2016-03-18
  • Tatsuro MATSUOKA

    test.png for case 1.

     
  • Tatsuro MATSUOKA

    test.png for case 2

     

    Last edit: Tatsuro MATSUOKA 2016-03-18
  • Tatsuro MATSUOKA

    test.png for case 3

     
  • Tatsuro MATSUOKA

    test.png for case 4

     
  • Matthew Halverson

    I must have had some sort of caching issue going on. Trying it again with the patched binaries, it looks ok on my end now.

    Looking up the png specification http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html, it looks like that one remaining "\r\n" is meant to occur

    The first eight bytes of a PNG file always contain the following (decimal) values:
    137 80 78 71 13 10 26 10

    I'm not sure why the size differences, but they seem to be ok. Looking at the png spec and the places where the files differ, the difference is consistant with the file layout, so this is almost certainly not a problem. For some reason, the libraries probably serialize the files a little differently based on platform.

    So this behavior can be considered fixed by that patch. Yes, it does cause a problem with wnguplot.exe, but hopefully that can be fixed.

     

    Last edit: Matthew Halverson 2016-03-18
    • Matthew Halverson

      I played around with the 4 attached resulting png files and the size difference is not a problem. I don't know the image data format well enough to verify the file contents, but the interal CRCs all match up, and the internal compressed data decompresses without problem. That plus the fact that they are all viewable should be enough to dismiss the size differences as irrelevant. It is likely just a subtle difference in how the rendering library works in different environments.

       
  • Ethan Merritt

    Ethan Merritt - 2016-03-18

    Possible alternative

    Modifying the popen() command in term.c may cause problems if some terminal types really want binary mode and other terminal types really do not want it. A possible alternative is to always open the pipe in text mode, then have terminals wanting binary mode instead to change the mode of the pipe already opened. See
    http://stackoverflow.com/questions/16888339/what-is-the-simplest-way-to-write-to-stdout-in-binary-mode

    In the case of the libgd png terminal I suppose this would look something like:

    TERM_PUBLIC void
    PNG_text()
    {
        image_do_crop();
        if (png_state.flags & PNG_USE_INTERLACE)
            gdImageInterlace(png_state.image, 1);
    
    #ifdef WIN32
        /* Make sure the output stream is in binary mode */
        setmode(fileno(gpoutfile), O_BINARY);
    #endif
    
        gdImagePng(png_state.image, gpoutfile);
        gdImageDestroy(png_state.image);
    }
    

    Totally untested, and it may require extra checks for Windows-specific *.h files.

     
    • Matthew Halverson

      That does mean modifiing every terminal that needs binary output, but is probably the safer way to do it. That would protect the automatic line ending translation on windows for text based terminals, and fix the problem with the binary terminals. However, windows is usually forgiving about the difference between "\n" and "\r\n" in text files these days. That hasn't always been the case.

      This may be the right way to do it.

       

      Last edit: Matthew Halverson 2016-03-19
    • Matthew Halverson

      As I said, I am not very experienced with C and unfamiliar with the gnuplot source, but looking at term.c in the term_initialise method starting at line 435, am I right that the terminals already state if they are binary or text-based with this TERM_BINARY flag? Am I right in reading that with the opened_binary flag that this method is calling term_set_output if the previous terminal was text based and the terminal was switched to a binary one?
      Can this flag be used to determine the terminal mode in term_set_output and set the appropriate mode in windows based on this? Would this allow the mode to be set on the pipe in one place instead of in each terminal individually? For example, could the following work to replace lines 377 to 385 (adding the check on the terminal mode for windows when opening the pipe, but leaving it alone otherwise):

      #if defined(PIPES)
          if (*dest == '|') {
              restrict_popen();
          #ifdef _Windows
              if ((f = popen(dest + 1, (term->flags & TERM_BINARY)?"wb":w")) == (FILE *) NULL)
          #else
              if ((f = popen(dest + 1, POPEN_MODE)) == (FILE *) NULL) 
          #endif
              os_error(c_token, "cannot create pipe; output not changed");
              else
              output_pipe_open = TRUE;
          } else {
      #endif /* PIPES */
      

      I am sure that I haven't written good code there, but could something like that work, when cleaned up?
      I'm not sure what lines 477-488 are doing either (in the preprocessor included/excluded code at the end of term_initialise()).
      Does it look like the different modes were being handled intelligently already in terms of files, but maybe that just wasn't being applied to pipes?

      EDIT I have been trying to build under windows with mixed success, but it looks like the above is not working. I don't know if it is just because it doesn't work, or because I am having trouble building.

       

      Last edit: Matthew Halverson 2016-03-19
  • Matthew Halverson

    Actually, even better would this attached patch work? It is made against the 5.0.3 source, which is all I have right now (and is possibly the same in the development version).

    I THINK that it is handling the pipe in a similar method as the files are being handled. Again, I'd like to test it out myself, but my windows machine is not set up to build this, and I don't really know how to set that up.

    I need a more experienced eye on this anyways to make sure that I haven't done somthing stupid that is going to cause crashes or anything.

    EDIT I have been trying to build under windows with mixed success, but it looks like the above is not working. I don't know if it is just because it doesn't work, or because I am having trouble building. It is likey that that my limited C knowledge isn't up to this task.

     

    Last edit: Matthew Halverson 2016-03-19
  • Tatsuro MATSUOKA

    I have build a binary (5.1.0 ChangeLog 2016-03-18) with your term.c.patch:

    http://www.tatsuromatsuoka.com/gnuplot/Eng/winbin/gp510-20160320-win32-mingw_patched2.zip

    I have tested

    set term png
    set output "| testoutput"
    plot sin(x)
    set output
    

    on gnuplot.exe, wgnuplot.exe and wgnuplot_pipes.exe.

    As expected, on gnuplot.exe and wgnuplot_pipes.exe, the avove code gave a correct png image file.
    However, the patch breaks pipe feature on wgnuplot.exe.

    gnuplot> set output "| testoutput"
             Could not execute pipe ' testoutput'. Writing to binary pipes is not supported.
    

    As I said before the fake pipe feature was implemented by Bastian to emulate pipe on wgnuplot using termporary file. Unfortuately I do not know how the fake pipe is implemented in detail.

    To go further, we need Bastian's contribution.

    If you would like to build gnuplot for windows to test your patch by yourself, please see Bastian's instruction.

    https://sourceforge.net/p/gnuplot/support-requests/199/

     

    Last edit: Tatsuro MATSUOKA 2016-03-20
    • Matthew Halverson

      Ignoring the problem with wgnuplot.exe for the moment, we see

      1. With the unpatched version, the png terminal (and all other binary terminals) get mangled when using output pipes because of the translated "\n" character. This is because the mode is w for ALL terminals and on Windows there is a difference between w and wb.
      2. With the original patch, binary terminals work correctly with output pipes. However, text terminals (the latex terminal for instance) end up with "\n" as a line ending instead of "\r\n" as is desired. This is because we changed the mode to wb for ALL terminals.
      3. With my patch, binary terminals work correctly. Text terminals also work correctly (have the "\r\n" line ending). This is because we are using the w mode for text terminals and wb for binary terminals.

      Is this patch maybe the better way to go then just changing the mode universally?

      Based on Ethan Merrit's comment above, it may be desireable to change the mode only when needed. This patch, however, shouldn't require changing every terminal individually as he suggested, as it puts the code for that in one place.

      Of course, now we need to see if wgnuplot.exe can be fixed to work with this.

      Thank you for that link about building on Windows. I'll have to see if I can get it to work that way.

       

      Last edit: Matthew Halverson 2016-03-20
  • Matthew Halverson

    I was able to build under Windows using the link provided on how to do it with MSYS2.

    The two attached patches seem to fix this bug altogether. The term.c.patch is the previous one that I had attached, and winmain.c.patch fixes it in the wgnuplot.exe executable.

    This needs better tested, and needs another eye on it, but it works on my machine.

    This allows text based terminals to keep the line ending handling as they work in mode w and it allows binary terminals to work correctly as they use the wb mode. Thus both types can pipe out data they can work with.

     
    • Bastian Märkisch

      That seems to be a reasonable approach overall. I am pretty sure that we could use "w" for text and "wb" for binary on all supported platforms though. It won't hurt on *nix and Mac, is correct on Windows on also on OS/2.

      As for type handling binary data on "recent" Windows that might need more testing. In particular since its help text still refers to "text files" only. Another approach would be to supply our own version of cat on Windows to call instead of type. Attached is a quick 'n dirty "port" of NetBSD's version.

       
      • Matthew Halverson

        I made the changes only for the Windows version, because I was uncertain if there was a reason that it was the way that it is (OS/2 using wb for everything, and everything else using w for everything). Use what you want from those patches, improve them if necessary, or throw them out altogether if there is a better way. I have little experience with C, but the source of the bug looked clear enough that I could at least make an attempt at it. My code is probably not the cleanest it could be. Using w for text on all OSs and wb for binary on all of them would probably result in clearer code and less of it.

        Having tested type on Windows 7, it works just fine for binary. Yes, the help text refers strictly to text, but on Microsoft's webpage, there is a warning about using it for binary because it will print "strange" characters, suggesting that it is known to echo binary. So, at least right now it works for binary. Of course, being undocumented for that, there is no guarantee that it will continue to do so in the future (but is unlikely to be changed), and I'm not sure if it has always done that or if there are some versions that don't.

        Suppling an executable to handle that instead would probably be the best way. It will be known to work as expected on all supported Windows versions, and its future behavior will be known.

         
  • Bastian Märkisch

    • labels: windows pipe output lineendings line-endings --> Windows, pipe, line-endings
     
  • Tatsuro MATSUOKA

    Off topic:

    Another approach would be to supply our own version of cat on Windows to call instead of type.

    I have googled and found the busybox-w32 that supports many unixy commans.
    https://frippery.org/busybox/

    One has to do is rename busybox.exe <command-name>.exe (e.g. cat.exe)
    It seems to be very useful.</command-name>

     
  • Bastian Märkisch

    • status: open --> pending-fixed
     
  • Bastian Märkisch

    A revision of Matthew's patch to term.c is now in CVS for 5.1. Pipes are opened in binary mode if the terminal has the TERM_BINARY flag set - on all platforms. Applying it to 5.0 will come later.

    I haven't applied the change to winmain.c yet. Relying on a feature which contradicts the documentation might not be very robust - although it seems to work. On Windows 8.1, type some.png > another.png strangely enough gives a perfect copy.

     
    • Ethan Merritt

      Ethan Merritt - 2016-03-21

      Does this introduce a hard requirement that the "set term" command comes before the "set output" command? The documentation already recommends this, but in practice it has not made much difference before now.

       
      • Matthew Halverson

        Testing this, the order does not seem to matter. term_initialise calls term_set_output again if the file mode is no longer correct (see set.term lines 442 - 474). That was probably already happening in windows too, although it made no difference in that case.

         
  • Matthew Halverson

    As stated, relying on the type function may not be best as the behavior we need is undocumented. Unless there is word from Microsoft out there that it will always handle binary data ok, it is probably better to not rely on it. But, as you discovered, it seems to work.

    Providing a binary to handle this behavior is probably best. Do we need a cat clone? We really don't need the full capability of the cat command, we only need the ability to read a file (in binary mode) and write it out to stdout (also in binary mode). We want to do this explicitly in binary mode, as the line ending behavior was handled by writing out the temporary file - we don't want to convert those a second time. Couldn't something like this work,

    #include <stdio.h>
    #include <fcntl.h>
    #include <io.h>
    
    int main(int argc, char* argv[]) {
        if (argc>1) {
            int c;
            FILE *infile;
            _setmode(_fileno(stdout),_O_BINARY);    
            infile = fopen(argv[1],"rb");
            while((c=fgetc(infile))!=EOF) {
                fputc(c,stdout);
            }
            fclose(infile);
        }
    }
    

    As part of the windows build process, compile this as gpfakepipe.exe, and then apply the attached patch to winmain.c (which is similar to the previous patch, except it also changes the call to type to be a call to gpfakepipe). The above program probably needs some work (my amateur nature in C is almost certainly showing there), but I think that will pick up the functionality that we need without having to rely on undocumented behavior in type.

    This seems to work, with one problem: the gpfakepipe.exe MUST be on the path to be found. It doesn't seem to work if it is just in the gnuplot bin folder unless that folder is on the path. This will be a problem with any additional binary. Is there a way to have the wgnuplot.exe call an external program in its own folder (ie run current_executables_folder\gpfakepipe)? Somehow this needs to be overcome. If it can't, it may be that we need to rely on the type command afterall.

     

    Last edit: Matthew Halverson 2016-03-21
  • Tatsuro MATSUOKA

    As I wrote previously, binary pipe works on wgnuplot_pipes.exe but not wgnuplot.exe.
    On wgnuplot_pipes.exe has a cmd screen and realize pipe through the console process.

    If one want use command line process to emulate pipe, why does not one take into account how wgnuplot_pipes realize pipe feature ?

    BTW, Octave uses hidden console process for GUI interface.
    Can we improve wgnuplot_pipes to use hidden console process and replace it wgnuplot.exe?

     

    Last edit: Tatsuro MATSUOKA 2016-03-22
  • Tatsuro MATSUOKA

    Sorry the previous post is just misleading. Please ignore it.

     
<< < 1 2 3 > >> (Page 2 of 3)

Log in to post a comment.

MongoDB Logo MongoDB