Menu

#513 "gm convert A.jpg A.pdf" results invalid PDF

v1.0_(example)
open
nobody
None
5
2019-05-31
2017-10-13
No

As to this post, when converting image to PDF the resultsing PDF is invalid.

GraphicsMagick v1.3.26 for CygWin.

Discussion

  • Bob Friesenhahn

    Bob Friesenhahn - 2017-10-13

    On Fri, 13 Oct 2017, Dmitry Katsubo wrote:

    As to this post, when converting image to PDF the resultsing PDF is invalid.

    Please include A.jpg and A.pdf inside a zipfile or tarball (so
    SourceForge does not re-write the files) and attach it to this problem
    report so we can see what the problem is and attempt to recreate it.

    Thanks,

    Bob

    Bob Friesenhahn
    bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
    GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

     
  • Dmitry Katsubo

    Dmitry Katsubo - 2017-10-14

    Attaching files.

     
    • Bob Friesenhahn

      Bob Friesenhahn - 2017-10-14

      What causes you to believe that the PDF is invalid? It produces a
      nice purple gradient when I open it with a couple of PDF viewers.

       
  • Dmitry Katsubo

    Dmitry Katsubo - 2017-10-14

    It displays OK, however iText PDF library fails to parse it. The author of this library supposes that resulting PDF is not ISO-32000-1 compliant, in particular as stated in this post, the section

    11 0 obj
    <<
    
    endstream
    endobj
    

    is invalid.

     
  • Harald Hanche-Olsen

    I hit this old bug today. Here is a simple way to reproduce it:

    gm convert -size 100x100 xc:white -draw 'circle 50 50 60 60' test.pdf
    

    The resulting pdf file is indeed malformed. The reason this does not cause a problem for most (all?) pdf previewers, is that the offending object (#11 in this example) is never referenced by any other object, so pdf viewers are unlikely to try to parse it. However, any program that parses entire pdf files, such as qpdf, will at least issue a warning and potentially fail. So I think the issue should be fixed. Fortunately, I think it is quite easy. My analysis is as follows. The problem is the function WritePDFImage in file coders/pdf.c.

    :::c
    /*
        Write Colormap object.
      */
      xref[object++]=TellBlob(image);
      FormatString(buffer,"%lu 0 obj\n",object);
      (void) WriteBlobString(image,buffer);
      (void) WriteBlobString(image,"<<\n");
      if ((image->storage_class != DirectClass) && (image->colors <= 256) &&
          (compression != FaxCompression))
        {
          if (compression == NoCompression)
            (void) WriteBlobString(image,"/Filter [ /ASCII85Decode ]\n");
          FormatString(buffer,"/Length %lu 0 R\n",object+1);
          (void) WriteBlobString(image,buffer);
          (void) WriteBlobString(image,">>\n");
          (void) WriteBlobString(image,"stream\n");
          offset=TellBlob(image);
          if (compression == NoCompression)
            Ascii85Initialize(image);
          for (i=0; i < (long) image->colors; i++)
            {
              if (compression == NoCompression)
                {
                  Ascii85Encode(image,ScaleQuantumToChar(image->colormap[i].red));
                  Ascii85Encode(image,ScaleQuantumToChar(image->colormap[i].green));
                  Ascii85Encode(image,ScaleQuantumToChar(image->colormap[i].blue));
                  continue;
                }
              (void) WriteBlobByte(image,
                                   ScaleQuantumToChar(image->colormap[i].red));
              (void) WriteBlobByte(image,
                                   ScaleQuantumToChar(image->colormap[i].green));
              (void) WriteBlobByte(image,
                                   ScaleQuantumToChar(image->colormap[i].blue));
            }
          if (compression == NoCompression)
            Ascii85Flush(image);
        }
      offset=TellBlob(image)-offset;
      (void) WriteBlobString(image,"\nendstream\n");
      (void) WriteBlobString(image,"endobj\n");
    

    This block will unconditionally start writing a new object, open a new dictionary by writing the initial <<, then conditionally write some stuff into the dictionary, end it with >>, start a stream, and write the stream contents. Then, once more unconditionally, it ends the stream and the object.

    If the condition is true, this is no problem. But if the condition is false, it leads to unbalanced << and endstream tokens in the file.

    What I think should happen, is that the entire object construction should be moved within the body of the if statement.

    Moreover, if you look earlier in the code, you will find that the Colormap object will only be referenced if the innermost else of this nested statements is executed (irrelevant parts replaced by […]:

    :::c
          if (image->colorspace == CMYKColorspace)
            {
              []
            }
          else
            {
              if ((compression == FaxCompression) ||
                  (compression == Group4Compression) ||
                  ((image_info->type != TrueColorType) &&
                   (characteristics.grayscale)))
                {
                  []
                }
              else
                {
                  if ((image->storage_class == DirectClass) || (image->colors > 256) ||
                      (compression == JPEGCompression))
                    [];
                  else
                    FormatString(buffer,"[ /Indexed /DeviceRGB %u %lu 0 R ]\n",
                                 (unsigned int) image->colors-1,object+1);
                }
            }
    

    It is not immediately clear that this is equivalent to the condition in the if statement in the “Write Colormap object” block. Perhaps it would be better to set a flag in the innermost else, and use it to test whether to create the Colormap block?

    Finally, the object after the Colormap block just holds a length, and is referenced from within the Colormap object. Its creation could also be moved inside the conditional, in the interest of not creating unreferenced objects.

     
    • Bob Friesenhahn

      Bob Friesenhahn - 2019-05-31

      On Fri, 31 May 2019, Harald Hanche-Olsen wrote:

      I hit this old bug today. Here is a simple way to reproduce it:

      ~~~
      gm convert -size 100x100 xc:white -draw 'circle 50 50 60 60' test.pdf
      ~~~

      The resulting pdf file is indeed malformed. The reason this does not
      cause a problem for most (all?) pdf previewers, is that the
      offending object (#11 in this example) is never referenced by any
      other object, so pdf viewers are unlikely to try to parse it.
      However, any program that parses entire pdf files, such as qpdf,
      will at least issue a warning and potentially fail. So I think the
      issue should be fixed. Fortunately, I think it is quite easy. My
      analysis is as follows. The problem is the function WritePDFImage
      in file coders/pdf.c.

      Would you be able to submit source code changes to solve this in the
      form of a source code patch? The analysis is very helpful, but a
      tested patch that I can easily apply is even more helpful.

      Bob

      Bob Friesenhahn
      bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
      GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
      Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt

       
      • Harald Hanche-Olsen

        I normally get my copy of gm via macports. Not sure how hard it is to build it outside the macports world, but I suppose I can give it a shot. (Or possibly, I can subvert the macports build system. There is an approved way to do that, but I have never tried it. Hmm.) I cannot promise to do it soon, however. Too many other pressing concerns.

         

Log in to post a comment.

MongoDB Logo MongoDB