From: Federico L. (Nemo) <nem...@gm...> - 2011-07-23 07:45:40
Hello,

you might have heard about <http://arstechnica.com/tech-policy/news/2011/07/swartz-supporter-dumps-18592-jstor-docs-on-the-pirate-bay.ars>.

We're now going to upload those ~19000 PDFs to the Internet Archive, but we need to remove a watermark first. Could you please suggest how to do it? Sadly, I don't know anything about PDF manipulation.

We tried pdfimages, which outputs one .pbm per page plus a .ppm (the footer/watermark). Using ImageMagick to recombine the pages into a PDF compressed with LZM produced a file almost 3 times as big as the original, so I think it's better to edit the original PDF without converting it to other raster formats.

The PDF looks like this: http://p.defau.lt/?8I_tQEf0Q2SZpi9CJx6I8A

Apparently, we need to remove this image:

/GxMWCL: 18 0 R, 187 x 248

which appears like this in other PDFs: http://p.defau.lt/?I1lqfJPL8ociEfOpvTfPaA

How can I do it?

Thank you,
Federico
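For reference, here is a minimal sketch of one way to approach this without rasterizing. In a PDF, an image XObject is only painted when the page content stream invokes it with the `Do` operator, so deleting the `/GxMWCL Do` invocation should stop the watermark from being drawn. This is an assumption-laden illustration, not a tested solution for these files: it assumes the content stream has already been decompressed (for example with a tool such as qpdf), the function name `strip_xobject_draw` and the sample stream are hypothetical, and the image object itself (18 0 R) would remain in the file until unused objects are cleaned up by another tool.

```python
import re


def strip_xobject_draw(content: bytes, name: bytes) -> bytes:
    """Remove every `/<name> Do` paint operator from a decompressed content stream.

    Hypothetical helper: it only edits the stream text and does not
    touch the XObject dictionary or the image object itself.
    """
    # Match the resource name followed by whitespace and the Do operator.
    pattern = re.compile(rb"/" + re.escape(name) + rb"\s+Do\b")
    return pattern.sub(b"", content)


# Hypothetical decompressed page content: the page scan, then the watermark.
# The transformation matrices here are made up for illustration.
sample = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"q 187 0 0 248 10 10 cm /GxMWCL Do Q\n"
)

cleaned = strip_xobject_draw(sample, b"GxMWCL")
print(cleaned.decode("ascii"))
```

The leftover `q ... cm Q` pair only saves and restores the graphics state, so it draws nothing. Since the name differs between files (see the second paste), it would have to be looked up per PDF in the page's /XObject resource dictionary rather than hard-coded.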