#89 "Nicht genügend Daten für ein Bild" ("Insufficient data for an image")

Status: open
Owner: nobody
Labels: Merge (16)
Priority: 5
Updated: 2012-02-01
Created: 2012-02-01
Creator: Dr. Death
Private: No

We use a Toshiba e-Studio 281c to scan documents to a server share, where they are OCR-processed by a Toshiba application called Re-Rite. AFAIK this software is based on libraries by Nuance. I found a thread where my problem was discussed before (http://www.pdfsam.org/bbforum/viewtopic.php?f=2&t=714), referring to a software called Paper-Port, which IIRC is also based on the Nuance libs. The recommended solution ("Use FoxitReader and reprint as PDF") is not satisfactory for me for two reasons:

1. I cannot and do not want to deploy FoxitReader on our systems just for this purpose.
2. Reprinting as PDF (to get rid of the text layer) produces *much* larger files (about 10 times the size).

Furthermore, I do *not* think that the explanation ("...image layer is damaged. Adobe Reader is overchallenged with this problem") applies. The image layer is *not* damaged, as the fact that it can be reprinted with other software shows. Adobe Reader is also able to display the OCRed PDFs correctly. The problem occurs only after such PDFs are *merged* using pdfsam. I think the cause is the way pdfsam deals with additional layers, or at least with an additional OCRed text layer.
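
One way to narrow this down might be to compare the stream lengths declared in the original file and in the merged file. The following is only a rough sketch using the Python standard library: the function name `find_length_mismatches` is made up for illustration, it is not a real PDF parser, and it only looks at streams whose `/Length` is written directly in the dictionary (indirect lengths such as `/Length 12 0 R` are skipped). A mismatch would be a hint, not proof, that stream data was cut short or mislabeled.

```python
import re

# The "stream" keyword that opens stream data (not the tail of "endstream").
STREAM_KW = re.compile(rb"(?<!end)stream\r?\n")
# A direct /Length value; indirect references like "/Length 12 0 R" are ignored.
DIRECT_LENGTH = re.compile(rb"/Length\s+(\d+)\b(?!\s+\d+\s+R)")


def find_length_mismatches(pdf_bytes):
    """Heuristic: list (offset, declared_length) for every stream whose data
    is not followed by 'endstream' at the position its /Length promises."""
    mismatches = []
    pos = 0
    while True:
        m = STREAM_KW.search(pdf_bytes, pos)
        if m is None:
            break
        # Assumption: the nearest preceding direct /Length belongs to this stream.
        lengths = list(DIRECT_LENGTH.finditer(pdf_bytes, pos, m.start()))
        if not lengths:
            pos = m.end()  # no direct /Length found, nothing to check here
            continue
        declared = int(lengths[-1].group(1))
        end = m.end() + declared
        # An end-of-line marker may sit between the data and "endstream".
        tail = pdf_bytes[end:end + 11].lstrip(b"\r\n")
        if tail.startswith(b"endstream"):
            pos = end  # length is plausible, skip over the stream data
        else:
            mismatches.append((m.start(), declared))
            pos = m.end()
    return mismatches


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, find_length_mismatches(fh.read()))
```

Saved as a script, this could be run against both "test_page.pdf" and "result.pdf". If only the merged file reports mismatches, that would at least be consistent with the merge step altering the streams.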

I have attached all the files needed to reproduce the problem. Steps to reproduce:

1. Merge the original OCRed PDF file "test_page.pdf" twice into a single PDF (see screenshot.png).

Discussion

  • Dr. Death
    2012-02-01

    the original OCRed PDF created by Re-Rite (attachment: "test_page.pdf")
  • Dr. Death
    2012-02-01

    the settings used for merging (attachment: "screenshot.png")
  • Dr. Death
    2012-02-01

    the merged result created by pdfsam (attachment: "result.pdf")
  • Dr. Death
    2012-02-01

    the error message Adobe Reader displays after opening the result (attachment: "error.png")
  • Dr. Death
    2012-02-01

    the log messages pdfsam created during processing (attached)
  • Dr. Death
    2012-02-01

    2. Open the merged result ("result.pdf") in Adobe Reader (I used v10.1.1).
    3. Adobe Reader displays the error message shown in "error.png".