
#3 Incorrect extraction of images

open
None
5
2012-09-10
2008-02-03
Kovid Goyal
No

pdf2xml extracts the image from the attached PDF incorrectly: the extracted image has inverted colors and is rotated by 180 degrees.

Discussion

  • Kovid Goyal

    Kovid Goyal - 2008-02-03

    Buggy PDF

     
  • Herve Dejean

    Herve Dejean - 2008-02-04

    Logged In: YES
    user_id=960595
    Originator: NO

    • Inverted colors: this should be fixed quickly.
    • Rotation: the behavior is the same as in pdfimages, whose documentation says: "pdfimages extracts the raw image data from the PDF file, without performing any additional transforms. Any rotation, clipping, color inversion, etc. done by the PDF content stream is ignored."
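
Since the raw image data bypasses whatever the content stream does, a consumer can undo both effects after extraction. The following is only a minimal sketch, assuming the image has already been decoded into rows of 8-bit grayscale samples; the function name `fix_extracted_image` is hypothetical and is not part of pdf2xml or pdfimages.

```python
def fix_extracted_image(rows, max_value=255):
    """Invert sample values and rotate the image by 180 degrees.

    `rows` is a list of rows, each a list of integer samples in 0..max_value.
    """
    # Reversing the row order and the samples in each row is a 180-degree rotation.
    rotated = [list(reversed(row)) for row in reversed(rows)]
    # Inversion maps each sample v to max_value - v.
    return [[max_value - v for v in row] for row in rotated]


if __name__ == "__main__":
    image = [[0, 64], [128, 255]]
    print(fix_extracted_image(image))
```

Both steps are their own inverse, so applying the function twice returns the original samples.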
     
  • Kovid Goyal

    Kovid Goyal - 2008-02-04

    Logged In: YES
    user_id=1109503
    Originator: YES

    Thanks. Would it at least be possible to indicate any rotation in the output XML? Also, I'd like to contribute a few patches (I'm using pdf2xml as a backend to write a tool to convert PDF to reflowable HTML with as much structure preserved as possible, as part of the libprs500 suite of tools), but I couldn't get pdf2xml to build from source (I couldn't get libxpdf.a to compile from the xpdf sources). Any hints would be appreciated.
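
One way the rotation could be reported, sketched under stated assumptions: in PDF an image is painted through a transformation matrix [a b c d e f], and when that matrix is a rotation combined with positive scaling, the angle is atan2(b, a). The element name `IMAGE` and the attribute names `href` and `rotation` below are hypothetical and are not claimed to match pdf2xml's actual output schema.

```python
import math
import xml.etree.ElementTree as ET


def rotation_degrees(ctm):
    """Return the rotation encoded in a PDF matrix [a, b, c, d, e, f].

    Assumes rotation plus positive scaling only (no skew or mirroring).
    """
    a, b = ctm[0], ctm[1]
    # (a, b) is where the unit x-axis lands; its direction gives the rotation.
    return round(math.degrees(math.atan2(b, a))) % 360


def image_element(href, ctm):
    # Hypothetical element and attribute names, for illustration only.
    return ET.Element("IMAGE", {"href": href, "rotation": str(rotation_degrees(ctm))})


if __name__ == "__main__":
    # An image scaled to 200x100 and rotated by 180 degrees.
    elem = image_element("image-1.png", [-200, 0, 0, -100, 300, 400])
    print(ET.tostring(elem, encoding="unicode"))
```

A downstream converter could then read the `rotation` attribute and rotate the extracted image file itself, leaving the raw extraction unchanged.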

     
  • Herve Dejean

    Herve Dejean - 2008-02-05

    Logged In: YES
    user_id=960595
    Originator: NO

    Rotation: it should be possible (we can already do it for text). I'm investigating.
    Compilation: have you succeeded in installing xpdf from http://www.foolabs.com/xpdf? (Can you open a support request for this?)

     
  • Herve Dejean

    Herve Dejean - 2008-02-06

    Logged In: YES
    user_id=960595
    Originator: NO

    Compilation: please get the updated INSTALL file.
    At line 42 I added the instruction for creating libxpdf.a: ar -rc libxpdf.a *.o (run in xpdf-#.XX/xpdf)

     
