Menu

#641 SIGSEGV thrown performing longjmp in jpeg.c

v1.0_(example)
closed-fixed
None
5
2021-05-01
2021-03-04
No

I'm running GraphicksMagick 1.3.36 built using mingw64.

The command producing the SEGV is:

gm identify -define jpeg:max-warnings=0 -verbose truncated_pool.jpg

After modifying the Makefile to turn off optimization when compiling jpeg.c, I get the expected output i.e.

$ ./gm.exe identify -define jpeg:max-warnings=0 -verbose truncated_pool.jpg
C:\Users\chris\git\GraphicsMagick-1.3.36\utilities\gm.exe identify: Premature end of JPEG file (truncated_pool.jpg).
C:\Users\chris\git\GraphicsMagick-1.3.36\utilities\gm.exe identify: Request did not return an image.

The gdb log, jpeg file and gcc version information are attached.

3 Attachments

Discussion

  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-04

    Please re-upload truncated_pool.jpg wrapped in an archive format such as zip or tar because SourceForge automatically re-writes JPEG, GIF, and PNG file uploads.

     
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-04
    • assigned_to: Bob Friesenhahn
     
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-04

    The gdb.txt backtrace shows that exception is zero upon entry into ReadJPEGImage(). This should not be possible. I suspect that the stack has been corrupted:

    #12 0x00007ff6b3843d22 in ReadJPEGImage (image_info=0x7ff600000000, 
        exception=0x0) at coders/jpeg.c:1603
    

    It is unlikely that I will be able to re-create this issue, particularly since it appears that optimization makes a difference. If you are able to debug the issue further on your end, that would help.

     
  • Chris Gravely

    Chris Gravely - 2021-03-04

    Added zipped jpeg file

     
  • Chris Gravely

    Chris Gravely - 2021-03-05

    I've made the following changes:

    1. Rebuilt everything except coders/jpeg.c with CFLAGS = -fopenmp -g -O0 -Wall
    2. Added pragmas around JPEGDecodeMessageHandler to force it to compile -O0 (so stuff doesn't get inlined out in the stack traces).

    gdb_O0.txt was produced compiling coders/jpeg.c with CFLAGS = -fopenmp -g -O0 -Wall
    gdb_Og.txt was produced compiling coders/jpeg.c with CFLAGS = -fopenmp -g -Og -Wall -fno-combine-stack-adjustments -fno-compare-elim -fno-cprop-registers -fno-defer-pop -fno-forward-propagate -fno-guess-branch-probability -fno-inline -fno-ipa-profile -fno-ipa-pure-const -fno-ipa-reference -fno-ipa-reference-addressable -fno-omit-frame-pointer -fno-reorder-blocks -fno-shrink-wrap -fno-split-wide-types -fno-toplevel-reorder -fno-tree-builtin-call-dce -fno-tree-ccp -fno-tree-ch -fno-tree-coalesce-vars -fno-tree-copy-prop -fno-tree-dce -fno-tree-dominator-opts -fno-tree-fre -fno-tree-sink -fno-tree-slsr -fno-tree-ter

    The first test runs correctly; the second throws a SEGV.

    The flags for the second test were chosen so gcc -Q --help=optimizers gives the same results as for the flags used in the first test.

    Note that in the second test, the assert ( exception != ( ExceptionInfo * ) NULL ) in ReadJPEGImage doesn't trigger even though the stack trace claims exception is 0x0. My guess is that gcc is doing some sort of optimization that's causing gdb to print an incorrect call stack.

    I notice too that in the second test, the breakpoint on entry to ReadJPEGImage is on line 1123 and line 1169 is the second breakpoint hit, suggesting extra preamble at the start of the function.

    My suspicion is that gcc optimization and setjmp/longjmp don't play well together, in which case forcing functions that do setjmp/longjmp to build without optimization might be the right thing to do.

     
    • Bob Friesenhahn

      Bob Friesenhahn - 2021-03-06

      On Fri, 5 Mar 2021, Chris Gravely wrote:

      My suspicion is that gcc optimization and setjmp/longjmp don't play well together, in which case forcing functions that do setjmp/longjmp to build without optimization might be the right thing to do.

      It is indeed true that setjmp/longjmp are a terrible thing in that
      they do dangerous things and are platform and CPU specific. The
      setjmp "saves" the current stack and "longjmp" attempts to restore it
      as well as CPU registers so that the program appears to continue where
      it was right after setjmp. When this happens, any stack data which
      might have been modified after the call to setjmp is restored back to
      what it was before. If code then uses that stack data and expects the
      modified value, there can be problems.

      It is entirely possible that this version/build of GCC for Windows
      might have some bug in this area. The bug could be CPU type specific
      if a particular type of CPU is being targeted.

      While compiling GraphicsMagick, are there any GCC warnings, and
      particularly saying that a variable may be "clobbered"?

      Bob

      Bob Friesenhahn
      bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
      GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
      Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

       
      • Chris Gravely

        Chris Gravely - 2021-03-06

        There are no warnings or errors when I recompile coders/jpeg.c and rebuild the executables

         
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-06

    I am able to reproduce this issue using mingw64 and MSYS2. The problem appears to be directly related to '-define jpeg:max-warnings=0' and if I change it to '-define jpeg:max-warnings=1' it does not crash.

    The crash does not occur under Linux and various memory checkers do not find anything wrong.

     
    • Chris Gravely

      Chris Gravely - 2021-03-06

      The problem is directly related to '-define jpeg:max-warnings=0'; there's only one warning issued when reading the jpeg file, so longjmp isn't called when max-warnings is one or greater.

       
      • Bob Friesenhahn

        Bob Friesenhahn - 2021-03-07

        On Sat, 6 Mar 2021, Chris Gravely wrote:

        The problem is directly related to '-define jpeg:max-warnings=0';
        there's only one warning issued when reading the jpeg file, so
        longjmp isn't called when max-warnings is one or greater.

        Yes. At this point there is no reason to believe that the JPEG error
        handling is working reliably at all with this compiler and/or
        libjpeg-turbo version. I would not trust something which only works
        25-33% of the time.

        The PNG code also uses setjmp/longjmp so it might be similarly
        incapacitated.

        Bob

        Bob Friesenhahn
        bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
        GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
        Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

         
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-06

    I found that if I add 'volatile' before Image at line 247 of jpeg.c ('volatile Image'), it no longer crashes, although there are then many warnings.

     
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-06

    Curioser and curioser! I thought that code edits where somehow making a difference. This is not the case. My Microsoft Windows system has 4 cores. What I have discovered is that if I run the program four times, it will usually segfault once.

     
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-06

    By the way, I am also using GCC 10.2.0.

     
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-03-10

    I just tested building under Cygwin with GCC 10.2.0 but apparently with a libjpeg which identifies itself as "IJG JPEG 80" in its version identification. In this case there are no crashes. It seems to me that the JPEG library used might have something to do with the crashes.

     
    • Chris Gravely

      Chris Gravely - 2021-03-14

      I've replaced libjpeg-turbo's jpeg library with an IJG one (I used jpeg-9d as that's the latest). I get crashes similar to my originals.

      Do you know anything about Cygwin's setjmp/longjmp implementations. I know mingw64 uses MSVCRT implementations, but I'm wondering if the Cygwin library providing the POSIX environment has its own implementation. If so, that could explain why the test works when built under Cygwin.

       
      • Bob Friesenhahn

        Bob Friesenhahn - 2021-03-14

        On Sun, 14 Mar 2021, Chris Gravely wrote:

        I've replaced libjpeg-turbo's jpeg library with an IJG one (I used jpeg-9d as that's the latest). I get crashes similar to my originals.

        Do you know anything about Cygwin's setjmp/longjmp implementations.
        I know mingw64 uses MSVCRT implementations, but I'm wondering if the
        Cygwin library providing the POSIX environment has its own
        implementation. If so, that could explain why the test works when
        built under Cygwin.

        I don't have any knowledge about this. The original MSVCRT
        implementation could have no knowledge about modern CPU types
        (providing additional families of instructions such as SSE3, AVX1, and
        AVX2, and additional registers) which were invented after its release.
        If it was updated by Microsoft as part of system updates then it could
        gain such knowledge.

        It would be useful to learn if the MinGW build suffers from similar
        problems with the setjmp-based error reporting other than from the one
        based on the number of warnings. It is not yet clear if the
        setjmp-based error reporting is entirely broken, or just for
        this one code path.

        In my own testing, the failures are semi-random, perhaps failing 25%
        to 30% of the time.

        Bob

        Bob Friesenhahn
        bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
        GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
        Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

         
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-04-18
    • status: open --> closed-fixed
     
  • Bob Friesenhahn

    Bob Friesenhahn - 2021-04-18

    This problem is fixed by Mercurial changeset 16516:1747878113cd. I am not sure exactly why the issue is fixed, but the change dramatically reduces the amount of data pushed to the stack, and it also relocates the jump buffer to the heap rather than the stack.

     
    • Chris Gravely

      Chris Gravely - 2021-04-22

      It looks as though ISO-IEC 9899 doesn't allow assignment of the setjmp result, so it might be better to remove the assignment and its associated logging, as it's possible this might cause problems in the future.

       
      • Bob Friesenhahn

        Bob Friesenhahn - 2021-04-22

        On Thu, 22 Apr 2021, Chris Gravely wrote:

        It looks as though ISO-IEC 9899 doesn't allow assignment of the
        setjmp result, so it might be better to remove the assignment and
        its associated logging, as it's possible this might cause problems
        in the future.

        That is a very interesting finding. It appears that the return value
        needs to be consumed immediately as part of decision logic (e.g. a
        switch statement) or not used at all.

        I am not much of a fan of setjmp/longjmp but the only alternative I am
        aware of requires use of threads.

        Bob

        Bob Friesenhahn
        bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
        GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
        Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

         
        • Chris Gravely

          Chris Gravely - 2021-04-22

          A possible alternative I can think of would be to implement JPEGCoder as a C-callable C++ subsystem, which would allow you to use try...catch and throw a C++ exception instead of using setjmp/longjmp.
          Having said that, in some situations the exception would get thrown through intermediate JPEG code, which lives in a separate library, and that might complicate things.

           
          • Bob Friesenhahn

            Bob Friesenhahn - 2021-04-25

            Changeset 16518:2a0438efd812 eliminates the assignment of setjmp() to
            a variable. Please look/check again to make sure that all seems ok.

            Bob

            Bob Friesenhahn
            bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
            GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
            Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

             
            • Chris Gravely

              Chris Gravely - 2021-05-01

              Sorry for the slow response. The fix looks fine to me

               
          • Bob Friesenhahn

            Bob Friesenhahn - 2021-04-25

            On Thu, 22 Apr 2021, Chris Gravely wrote:

            A possible alternative I can think of would be to implement
            JPEGCoder as a C-callable C++ subsystem, which would allow you to
            use try...catch and throw a C++ exception instead of using
            setjmp/longjmp. Having said that, in some situations the exception
            would get thrown through intermediate JPEG code, which lives in a
            separate library, and that might complicate things.

            Unfortunately, throwing C++ exceptions through C code is not portable.

            In the case of JPEG all of these longjmps are through intermediate
            JPEG C code in libjpeg.

            I thought that I was being clever by passing the reporting line
            number. Now I see that one would need to use some sort of enumeration
            value instead, which is then deciphered using a switch statement.

            Bob

            Bob Friesenhahn
            bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
            GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
            Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

             

Log in to post a comment.

MongoDB Logo MongoDB