#21 removing thumbnail from non-JPEG image causes crash

closed-fixed
Lutz Müller
exif (14)
5
2005-03-30
2004-10-26
Ben Liblit
No

When exif is asked to remove the thumbnail from a file which is not actually a JPEG image, it subsequently crashes. The fact that the file is malformed is actually detected, but that discovery is not properly conveyed to later code which incorrectly assumes that it is looking at valid JPEG data. We are seeing this crash using exif-0.6.9 with libexif-0.6.11.

This bug was uncovered as part of the Cooperative Bug Isolation Project <http://sampler.cs.berkeley.edu/>, which attempts to find bugs by discovering statistically significant differences in the behavior of good and bad runs. This bug, along with several others, was discovered by running exif with random command line flags on random elements of a large corpus of JPEG images. We observed a 7% crash rate across the entire experiment. The bug we describe here accounts for 24% of those failures, in part due to our intentional seeding of the corpus with several bogus files which are not actually JPEG images.

["We" here refers to the members of the Cooperative Bug Isolation research team at Stanford University and the University of California, Berkeley. Though just one of us is submitting this report, it is the product of a joint research effort to which all members made invaluable contributions.]

We are attaching a test file, "image2.jpg", to this report. The command "exif -r image2.jpg" causes a segmentation fault. The (abstract) stack trace looks as follows:

main
save_exif_data_to_file
jpeg_data_set_exif_data
memmove

This directs our attention to the crash site which is a call to memmove in the body of function jpeg_data_set_exif_data in file exif-0.6.9/libjpeg/jpeg-data.c:

jpeg_data_set_exif_data(...)
{
...
section = jpeg_data_get_section (data, JPEG_MARKER_APP1);
if (!section) {
jpeg_data_append_section (data);
memmove (&data->sections[2], &data->sections[1],
sizeof (JPEGSection) * (data->count - 2));
...
}
...
}

However, the stack trace alone doesn't help us much in determining the cause of the bug. Our Cooperative Bug Isolation analysis tools further direct our attention to the following two predicates:

"sizeof (JPEGSection) * (data->count - 2) < 0" in function jpeg_data_set_exif_data.

"!JPEG_IS_MARKER (d[o + i])" in function jpeg_data_load_data in file exif-0.6.9/libjpeg/jpeg-data.c.

When these two predicates are true, the program is much more likely to crash. These predicates, then, describe the circumstances under which the program fails. Once these two predicates are provided, it is easy to figure the cause of the bug.

First, the predicate "sizeof (JPEGSection) * (data->count - 2) < 0" indicates that the 3rd argument of the call to memmove in the body of function jpeg_data_set_exif_data is negative. Furthermore, we can infer from this predicate that data->count is less than 2. It is easy to figure from the code in jpeg_data_set_exif_data around the call to memmove that data->count is keeping track of the number of sections appended from the input image file. Once this fact is known, it is easy to infer that the programmer expects at least 2 sections to have been appended before the call to memmove is executed. One section is appended just before the call to memove by the call to function jpeg_data_append_section. This call ensures that data->count is at least 1, but it does not ensure that data->count is at least 2. Clearly, a call to jpeg_data_append_section is missed along the path from the entry point of main to the call to memmove in the body of jpeg_data_set_exif_data. But this path is very lengthy, and it is tedious to search for this missing call.

Fortunately, our analysis tools also report the predicate "!JPEG_IS_MARKER (d[o + i])" in the body of function jpeg_data_load_data as predictive of failure, which motivates us to construct the following sequence of calls:

main
save_exif_data_to_file
jpeg_data_new_from_file
jpeg_data_load_file
jpeg_data_load_data
jpeg_data_set_exif_data
memmove

The code of jpeg_data_load_data looks as follows:

jpeg_data_load_data(...)
{
...
if (!JPEG_IS_MARKER (d[o + i])) <--- (*)
return;
/* Append this section */
jpeg_data_append_section (data);
...
}

We can now immediately infer that when the predicate "!JPEG_IS_MARKER (d[o + i])" is true, jpeg_data_load_data does not call jpeg_data_append_section and returns silently. As a result of this missed call to jpeg_data_append_section, the value of data->count just before the call to memmove in the body of function jpeg_data_set_exif_data is 1 instead of 2, thereby causing a negative integer to be passed as the 3rd argument to memmove and crashing it.

The fix for this bug seems to be for jpeg_data_load_data to return an error code whenever "!JPEG_IS_MARKER (d[o + i])" is true instead of returning silently, and to check for this error code in the callers of jpeg_data_load_data.

Discussion

  • Ben Liblit
    Ben Liblit
    2004-10-26

    test non-image demonstrating the bug

     
    Attachments
  • Ben Liblit
    Ben Liblit
    2004-10-26

    Logged In: YES
    user_id=2116

    Unfortunately it looks like SourceForge's bug system ate up the leading whitespace in our abstract stack traces. Lack of indentation makes them hard to read. Using underscores instead of spaces, the initial stack trace at the point of failure should look like this:

    __ main
    ____ save_exif_data_to_file
    ______ jpeg_data_set_exif_data
    ________ memmove

    The strong failure predictor in jpeg_data_load_data lets us reconstruct the following sequence of calls before the failure:

    __ main
    ____ save_exif_data_to_file
    ______ jpeg_data_new_from_file
    ________ jpeg_data_load_file
    __________ jpeg_data_load_data
    ______ jpeg_data_set_exif_data
    ________ memmove

    Hopefully that's more readable. Sorry about the confusion!

     
    • labels: --> exif
     
  • Logged In: YES
    user_id=59853

    Thanks for the report.

    Looks like this is still in libexif 0.6.13 CVS / exif 0.6.11
    CVS.

    Testcase added.

     
  • Lutz Müller
    Lutz Müller
    2005-03-16

    • status: open --> closed-fixed
     
    • status: closed-fixed --> open-accepted
     
  • Logged In: YES
    user_id=59853

    I hereby reopen the bug.

    Lutz, have you verified the fix with attached image?

    I have tried to (using libexif-testsuite), and still get get
    the segfault.

     
  • Logged In: YES
    user_id=59853

    "exif -r -o foo.jpg this-file-does-not-exist.jpg" also
    segfaults.

     
  • Logged In: YES
    user_id=59853

    More diagnostics:
    jpeg_data_new_from_file() returns non-NULL even in case of
    errors.

    The reason is that (like just about everywhere in
    libexif/libjpeg/exif/etc.) jpeg_data_load_file() has no
    defined semantics to return an error condition.

    Lutz, am I missing something or is that just a conceptual
    oversight you made when writing all that stuff?

    As this problem penetrates just about all the code, fixing
    it will be a lot of tedious work :-(

     
    • assigned_to: nobody --> lutz
     
  • Lutz Müller
    Lutz Müller
    2005-03-30

    Logged In: YES
    user_id=58652

    I made it possible to supply a log function to JPEGData and
    changed exif to abort if anything dubious gets logged
    (CORRDUPT_DATA, NO_MEMORY).

     
  • Lutz Müller
    Lutz Müller
    2005-03-30

    • status: open-accepted --> closed-fixed