From: Michal H. <ms...@gm...> - 2011-09-21 08:06:21
|
On Tue, Sep 20, 2011 at 03:08:18PM -0500, Bollinger, John C wrote: > Hello All, Hi, > > I am attempting to build PDFedit v 0.4.5 on CentOS 5, using the > distro's provided GCC 4.1.2 toolchain. I am encountering a cluster > of related compilation errors that appear the same as, or at least > closely related to, the one discussed in this previous thread: > > http://sourceforge.net/mailarchive/forum.php?thread_name=20100521074048.GG4000%40tiehlicka.suse.cz&forum_name=pdfedit-support > > I encounter exactly the same compiler error that the OP in that thread reported, plus a few more: > > g++ -c -g -O2 -fmessage-length=0 -D_FORTIFY_SOURCE=2 -fno-strict-aliasing -fexceptions -fstack-protector -pipe -posix -ansi -std=c++98 -pedantic -I. -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/xpdf/ -I/usr/include -I/usr/include/freetype2 -I/usr/include -o cpagecontents.o cpagecontents.cc > cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': > cpagecontents.cc:543: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' > /usr/include/boost/noncopyable.hpp: In copy constructor 'pdfobjects::CObjectSimple<pInt>::CObjectSimple(const pdfobjects::CObjectSimple<pInt>&)': > /usr/include/boost/noncopyable.hpp:27: error: 'boost::noncopyable_::noncopyable::noncopyable(const boost::noncopyable_::noncopyable&)' is private > /home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/kernel/cobjectsimple.h:104: error: within this context > cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': > cpagecontents.cc:543: note: synthesized method 
'pdfobjects::CObjectSimple<pInt>::CObjectSimple(const pdfobjects::CObjectSimple<pInt>&)' first required here > cpagecontents.cc:544: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' > /usr/include/boost/noncopyable.hpp: In copy constructor 'pdfobjects::CObjectSimple<pName>::CObjectSimple(const pdfobjects::CObjectSimple<pName>&)': > /usr/include/boost/noncopyable.hpp:27: error: 'boost::noncopyable_::noncopyable::noncopyable(const boost::noncopyable_::noncopyable&)' is private > /home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/kernel/cobjectsimple.h:104: error: within this context > cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': > cpagecontents.cc:545: note: synthesized method 'pdfobjects::CObjectSimple<pName>::CObjectSimple(const pdfobjects::CObjectSimple<pName>&)' first required here > > > I will address the warnings in a separate message. Right now, observe > that g++ complains not just about line 545, where attention focused > during the previous discussion, but also about line 544. In fact, in > my experiments I found that the compiler will make the same complaint > about an inaccessible copy constructor for each of these lines: > > image_dict.addProperty ("W", CInt (image_size.x)); > image_dict.addProperty ("H", CInt (image_size.y)); > image_dict.addProperty ("CS", CName ("RGB")); > image_dict.addProperty ("BPC", CInt (8)); Yes the problem is already fixed in CVS. The issue is that this particular gcc version doesn't like r-value (constructor) for a const reference parameter without a copy constructor. There is no reason to use the copy constructor here because the object is for the single use without any external aliases. 
Strictly speaking, the C++ standard permits this behavior, so we have addressed the issue simply by creating a named local object:

CInt W (image_size.x);
image_dict.addProperty ("W", W);

> It appears, then, that g++ decides it must make copies of the CInt and CName arguments, even though the method prototype specifies that the arguments are references. Perhaps g++ wants to copy the arguments and then pass references to the copies.

Exactly.

> That could be a GCC bug,

It is not a bug, as I said above; the C++ standard permits it. I would rather call it a suboptimality.

> but it draws attention to a PDFEdit bug here: the program is attempting to store references to local CInt and CName objects in a CDict whose lifetime exceeds theirs.

No. Arguments are deep copied (we use the clone method). Have a look at the CDict::addProperty method.

> That is, to the best of my understanding, the lifetime of local objects instantiated in an argument list is limited to the function call to which they are arguments (i.e. each addProperty() call), whereas image_dict lives until the end of the method.
>
> Even if that were not so, it appears that the method also leaks references to these local CInt and CName objects when it subsequently uses the constructed image_dict to initialize a CInlineImage allocated on the heap (thus using longer-lived local objects would not solve the problem).
>
> In any event, the compiler is satisfied if I change the above four lines like so:
>
> image_dict.addProperty ("W", *(new CInt (image_size.x)));
> image_dict.addProperty ("H", *(new CInt (image_size.y)));
> image_dict.addProperty ("CS", *(new CName ("RGB")));
> image_dict.addProperty ("BPC", *(new CInt (8)));
>
> That obviously leaks memory, but it's better than the program accessing who-knows-what through dangling references. It also demonstrates to my satisfaction that the problem is not with the compiler being confused about types, but rather with its approach to handling the type locality issue, possibly in conjunction with some optimization it attempts to apply.
>
> To sum up, there appears to be a cluster of PDFedit bugs here, even when the original code compiles successfully. I don't see any simple fix that would be adequate, but perhaps those of you who are more familiar with

Does the deep copying answer your concern?

> Best,
>
> John Bollinger
>
> Email Disclaimer: www.stjude.org/emaildisclaimer

-- Michal Hocko |
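The reply above rests on two points: C++98/03 allowed a compiler to copy a temporary before binding it to a const reference parameter and required the copy constructor to be accessible either way, and addProperty is described as storing a deep copy rather than the caller's reference. The sketch below illustrates both under stated assumptions. Value, IntValue, Dict and width_after_local_is_gone are hypothetical stand-ins, not PDFedit's CObjectSimple, CInt or CDict, and the clone-based storage is only inferred from the description of CDict::addProperty.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a noncopyable PDF value type (not PDFedit's real class).
class Value {
public:
    Value() {}
    virtual ~Value() {}
    virtual Value* clone() const = 0;  // deep copy on request
    virtual int asInt() const = 0;
private:
    Value(const Value&);               // copying disabled, as with boost::noncopyable
    Value& operator=(const Value&);
};

class IntValue : public Value {
public:
    explicit IntValue(int v) : v_(v) {}
    virtual Value* clone() const { return new IntValue(v_); }
    virtual int asInt() const { return v_; }
    void set(int v) { v_ = v; }
private:
    int v_;
};

// Hypothetical dictionary that owns deep copies of whatever is added to it.
class Dict {
public:
    Dict() {}
    ~Dict() {
        for (std::map<std::string, Value*>::iterator it = props_.begin();
             it != props_.end(); ++it) {
            delete it->second;
        }
    }
    void addProperty(const std::string& name, const Value& value) {
        Value* copy = value.clone();   // the dictionary never keeps the caller's reference
        Value*& slot = props_[name];   // new keys start out as a null pointer
        delete slot;                   // replace any previous value for this name
        slot = copy;
    }
    int getInt(const std::string& name) const {
        std::map<std::string, Value*>::const_iterator it = props_.find(name);
        assert(it != props_.end());
        return it->second->asInt();
    }
private:
    Dict(const Dict&);
    Dict& operator=(const Dict&);
    std::map<std::string, Value*> props_;
};

// Adds a property through a named local, then lets the local go out of scope.
int width_after_local_is_gone() {
    Dict image_dict;
    {
        IntValue w(640);               // named local instead of a temporary
        image_dict.addProperty("W", w);
        w.set(1);                      // changing the local does not affect the stored clone
    }
    return image_dict.getInt("W");
}
```

Because the dictionary keeps its own clone, the value read back is the one that was added, regardless of what later happens to the local object. Passing a named lvalue also avoids the temporary-to-const-reference binding that triggered the GCC 4.1.2 error.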
From: Jeffrey W. <nol...@gm...> - 2011-09-20 22:12:36
|
On Tue, Sep 20, 2011 at 6:02 PM, Bollinger, John C <Joh...@st...> wrote:
> On Tuesday, September 20, 2011 4:35 PM, Jeffrey Walton wrote:
>> On Tue, Sep 20, 2011 at 5:13 PM, Bollinger, John C <Joh...@st...> wrote:
> [...]
>>> The most obvious fix would be to change the CInts to CReals, but I don't know whether that would cause trouble elsewhere in the program.
>>
>> CInt (static_cast<int>(image_size.x)) and CInt (static_cast<int>(image_size.y))? Otherwise, you might need to change Tp to float or double (which seems like a lot more work).
>>
>> Since PDFs are widely abused as vectors (and it is CentOS), you might want to verify image_size.x and image_size.y are within bounds of the [integer] data type if you choose to cast. numeric_limits is your friend.
>
> Changing Tp to double is approximately the effect of switching the types from CInt to CReal. Only the lines I showed need to be modified, and the code then compiles fine without any more warnings in that section. The problem is that I'm not sure how to test adequately whether the resulting program works correctly.
>
> static_cast might do the job, but storing a reference to the result of a cast sounds dubious to me. It might work, but it seems like asking for trouble.

OK. CInt should copy construct its object, so I don't believe it's retaining an external reference.

> Again, though, I'm not sure how to test the result adequately.
>
> Is there any clear guidance on what this ought to be, or is guess and test the best available approach?

Take a look at CInt's data member declaration and see if it's a reference. From http://pdfedit.cvs.sourceforge.net/viewvc/pdfedit/pdfedit/src/, I can't tell where it might be hiding. Also, if you can assign a CInt, I would expect that it does not hold an internal integer reference.

As for the container that holds the name/value pairs, they should be copy constructible. But I'm basing that on STL, and PDFedit might be doing things differently.

Jeff |
From: Bollinger, J. C <John.Bollinger@STJUDE.ORG> - 2011-09-20 22:02:21
|
On Tuesday, September 20, 2011 4:35 PM, Jeffrey Walton wrote:
> On Tue, Sep 20, 2011 at 5:13 PM, Bollinger, John C <Joh...@st...> wrote:
[...]
>> The most obvious fix would be to change the CInts to CReals, but I don't know whether that would cause trouble elsewhere in the program.
>
> CInt (static_cast<int>(image_size.x)) and CInt (static_cast<int>(image_size.y))? Otherwise, you might need to change Tp to float or double (which seems like a lot more work).
>
> Since PDFs are widely abused as vectors (and it is CentOS), you might want to verify image_size.x and image_size.y are within bounds of the [integer] data type if you choose to cast. numeric_limits is your friend.

Thanks, Jeff.

Changing Tp to double is approximately the effect of switching the types from CInt to CReal. Only the lines I showed need to be modified, and the code then compiles fine without any more warnings in that section. The problem is that I'm not sure how to test adequately whether the resulting program works correctly.

static_cast might do the job, but storing a reference to the result of a cast sounds dubious to me. It might work, but it seems like asking for trouble. Again, though, I'm not sure how to test the result adequately.

Is there any clear guidance on what this ought to be, or is guess and test the best available approach?

Thanks again,
John

Email Disclaimer: www.stjude.org/emaildisclaimer |
From: Jeffrey W. <nol...@gm...> - 2011-09-20 21:35:36
|
On Tue, Sep 20, 2011 at 5:13 PM, Bollinger, John C <Joh...@st...> wrote: > Hello All, > > As I wrote a few minutes ago, I am attempting to build PDFedit v 0.4.5 on CentOS 5, using the distro's provided GCC 4.1.2 toolchain. The method CPageContents::addInlineImage() is giving me some trouble, the bulk of which I raised in my previous mail. In addition, however, I see some worrisome warnings in the same method: > > g++ -c -g -O2 -fmessage-length=0 -D_FORTIFY_SOURCE=2 -fno-strict-aliasing -fexceptions -fstack-protector -pipe -posix -ansi -std=c++98 -pedantic -I. -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/xpdf/ -I/usr/include -I/usr/include/freetype2 -I/usr/include -o cpagecontents.o cpagecontents.cc > cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': > cpagecontents.cc:543: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' > [...] > cpagecontents.cc:544: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' > > These arise from the following two lines of code: > > image_dict.addProperty ("W", CInt (image_size.x)); > image_dict.addProperty ("H", CInt (image_size.y)); > > Evidently, image_size.x and image_size.y are of type (const double), but they are passed to a constructor expecting an int&. I'm neither sure what is needed, nor sure what actually happens here (and that in itself is a minor problem). Given that we're dealing with references, however, I can't think of an approach the compiler could take to the issue that would be likely to have a satisfactory result. 
> The most obvious fix would be to change the CInts to CReals, but I don't know whether that would cause trouble elsewhere in the program.

CInt (static_cast<int>(image_size.x)) and CInt (static_cast<int>(image_size.y))? Otherwise, you might need to change Tp to float or double (which seems like a lot more work).

Since PDFs are widely abused as vectors (and it is CentOS), you might want to verify that image_size.x and image_size.y are within the bounds of the [integer] data type if you choose to cast. numeric_limits is your friend.

Jeff |
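The range check suggested above could be sketched as follows. The helper name to_int_checked is hypothetical and not part of PDFedit. The sketch assumes that the full range of int is exactly representable as a double, which holds for the usual 32-bit int.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: converts value to int only if it is not NaN and lies
// within [INT_MIN, INT_MAX]. Returns false and leaves out untouched otherwise.
// Assumes int's limits are exactly representable as double (true for 32-bit int).
bool to_int_checked(double value, int& out) {
    if (value != value) {
        return false;                        // NaN compares unequal to itself
    }
    const double lo = static_cast<double>(std::numeric_limits<int>::min());
    const double hi = static_cast<double>(std::numeric_limits<int>::max());
    if (value < lo || value > hi) {
        return false;                        // also rejects +/- infinity
    }
    out = static_cast<int>(value);           // truncates toward zero
    return true;
}
```

A caller would test the return value before constructing the integer object, for example by building CInt (w) only after to_int_checked (image_size.x, w) has succeeded. The check is deliberately conservative: a value slightly above the maximum is rejected even though truncation would bring it back into range.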
From: Bollinger, J. C <John.Bollinger@STJUDE.ORG> - 2011-09-20 21:13:47
|
Hello All, As I wrote a few minutes ago, I am attempting to build PDFedit v 0.4.5 on CentOS 5, using the distro's provided GCC 4.1.2 toolchain. The method CPageContents::addInlineImage() is giving me some trouble, the bulk of which I raised in my previous mail. In addition, however, I see some worrisome warnings in the same method: g++ -c -g -O2 -fmessage-length=0 -D_FORTIFY_SOURCE=2 -fno-strict-aliasing -fexceptions -fstack-protector -pipe -posix -ansi -std=c++98 -pedantic -I. -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/xpdf/ -I/usr/include -I/usr/include/freetype2 -I/usr/include -o cpagecontents.o cpagecontents.cc cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': cpagecontents.cc:543: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' [...] cpagecontents.cc:544: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' These arise from the following two lines of code: image_dict.addProperty ("W", CInt (image_size.x)); image_dict.addProperty ("H", CInt (image_size.y)); Evidently, image_size.x and image_size.y are of type (const double), but they are passed to a constructor expecting an int&. I'm neither sure what is needed, nor sure what actually happens here (and that in itself is a minor problem). Given that we're dealing with references, however, I can't think of an approach the compiler could take to the issue that would be likely to have a satisfactory result. The most obvious fix would be to change the CInts to CReals, but I don't know whether that would cause trouble elsewhere in the program. 
Best, John Bollinger Email Disclaimer: www.stjude.org/emaildisclaimer |
From: Bollinger, J. C <John.Bollinger@STJUDE.ORG> - 2011-09-20 20:08:37
|
Hello All, I am attempting to build PDFedit v 0.4.5 on CentOS 5, using the distro's provided GCC 4.1.2 toolchain. I am encountering a cluster of related compilation errors that appear the same as, or at least closely related to, the one discussed in this previous thread: http://sourceforge.net/mailarchive/forum.php?thread_name=20100521074048.GG4000%40tiehlicka.suse.cz&forum_name=pdfedit-support I encounter exactly the same compiler error that the OP in that thread reported, plus a few more: g++ -c -g -O2 -fmessage-length=0 -D_FORTIFY_SOURCE=2 -fno-strict-aliasing -fexceptions -fstack-protector -pipe -posix -ansi -std=c++98 -pedantic -I. -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src -I/home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/xpdf/ -I/usr/include -I/usr/include/freetype2 -I/usr/include -o cpagecontents.o cpagecontents.cc cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': cpagecontents.cc:543: warning: passing 'const double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' /usr/include/boost/noncopyable.hpp: In copy constructor 'pdfobjects::CObjectSimple<pInt>::CObjectSimple(const pdfobjects::CObjectSimple<pInt>&)': /usr/include/boost/noncopyable.hpp:27: error: 'boost::noncopyable_::noncopyable::noncopyable(const boost::noncopyable_::noncopyable&)' is private /home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/kernel/cobjectsimple.h:104: error: within this context cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': cpagecontents.cc:543: note: synthesized method 'pdfobjects::CObjectSimple<pInt>::CObjectSimple(const pdfobjects::CObjectSimple<pInt>&)' first required here cpagecontents.cc:544: warning: passing 'const 
double' for argument 1 to 'pdfobjects::CObjectSimple<Tp>::CObjectSimple(const typename pdfobjects::PropertyTraitSimple<Tp>::value&) [with pdfobjects::PropertyType Tp = pInt]' /usr/include/boost/noncopyable.hpp: In copy constructor 'pdfobjects::CObjectSimple<pName>::CObjectSimple(const pdfobjects::CObjectSimple<pName>&)': /usr/include/boost/noncopyable.hpp:27: error: 'boost::noncopyable_::noncopyable::noncopyable(const boost::noncopyable_::noncopyable&)' is private /home/jbolling/rpm/BUILD/pdfedit-0.4.5/src/kernel/cobjectsimple.h:104: error: within this context cpagecontents.cc: In member function 'void pdfobjects::CPageContents::addInlineImage(const std::vector<char, std::allocator<char> >&, const libs::Point&, const libs::Point&)': cpagecontents.cc:545: note: synthesized method 'pdfobjects::CObjectSimple<pName>::CObjectSimple(const pdfobjects::CObjectSimple<pName>&)' first required here I will address the warnings in a separate message. Right now, observe that g++ complains not just about line 545, where attention focused during the previous discussion, but also about line 544. In fact, in my experiments I found that the compiler will make the same complaint about an inaccessible copy constructor for each of these lines: image_dict.addProperty ("W", CInt (image_size.x)); image_dict.addProperty ("H", CInt (image_size.y)); image_dict.addProperty ("CS", CName ("RGB")); image_dict.addProperty ("BPC", CInt (8)); It appears, then, that g++ decides it must make copies of the CInt and CName arguments, even though the method prototype specifies that the arguments are references. Perhaps g++ wants to copy the arguments and then pass references to the copies. That could be a GCC bug, but it draws attention to a PDFEdit bug here: the program is attempting to store references to local CInt and CName objects in a CDict whose lifetime exceeds theirs. 
That is, to the best of my understanding, the lifetime of local objects instantiated in an argument list is limited to the function call to which they are arguments (i.e. each addProperty() call), whereas image_dict lives until the end of the method. Even if that were not so, it appears that the method also leaks references to these local CInt and CName objects when it subsequently uses the constructed image_dict to initialize a CInlineImage allocated on the heap (thus using longer-lived local objects would not solve the problem). In any event, the compiler is satisfied if I change the above four lines like so: image_dict.addProperty ("W", *(new CInt (image_size.x))); image_dict.addProperty ("H", *(new CInt (image_size.y))); image_dict.addProperty ("CS", *(new CName ("RGB"))); image_dict.addProperty ("BPC", *(new CInt (8))); That obviously leaks memory, but it's better than the program accessing who-knows-what through dangling references. It also demonstrates to my satisfaction that the problem is not with the compiler being confused about types, but rather with its approach to handling the type locality issue, possibly in conjunction with some optimization it attempts to apply. To sum up, there appears to be a cluster of PDFedit bugs here, even when the original code compiles successfully. I don't see any simple fix that would be adequate, but perhaps those of you who are more familiar with Best, John Bollinger Email Disclaimer: www.stjude.org/emaildisclaimer |
From: Martin P. <ma...@pe...> - 2011-07-30 10:13:12
|
You can try using pdfimages with the -j parameter, which will (if the images are stored with JPEG compression) save them as JPEG, thus avoiding the recompression that increases the size.

Or you can try writing a custom script for pdfedit that looks in the content stream. Depending on how the watermark looks in the content stream, it may be anything between easy and almost impossible (depending on how hard it is for the script to distinguish the watermark images from the images you want to keep :) But from the dump of the stream it looks like the watermark always has the same size, so it should be quite easy.

Can you send me one of these documents (not to the list but to my email), so that I can have a look at it without downloading the whole archive?

Martin Petricek

On Sat, 23 Jul 2011 09:45:27 +0200, Federico Leva (Nemo) wrote:
> Hello,
> you might have heard about
> <http://arstechnica.com/tech-policy/news/2011/07/swartz-supporter-dumps-18592-jstor-docs-on-the-pirate-bay.ars>
> We're now going to upload those ~19000 PDFs to the Internet Archive, but we need to remove a watermark. Could you please give me a suggestion about how to do it? Sadly I don't know anything about PDF manipulation.
> We tried pdfimages, which output a .pbms per page plus a .ppm (the footer/watermark); using ImageMagick to recombine pages in a PDF compressed with LZM produced a PDF almost 3 times as big as the original one, so I think it's better to edit the original PDF without converting it to other raster formats.
> The PDF looks like this: http://p.defau.lt/?8I_tQEf0Q2SZpi9CJx6I8A
> Apparently, we need to remove this image:
> /GxMWCL: 18 0 R, 187 x 248
> Which is like this in other PDFs:
> http://p.defau.lt/?I1lqfJPL8ociEfOpvTfPaA
> How can I do it?
> Thank you,
> Federico |
From: Federico L. (Nemo) <nem...@gm...> - 2011-07-30 09:57:19
|
Thank you for your reply.

Martin Petricek, 30/07/2011 11:45:
> You can try using pdfimages with -j parameter, which will (if the image is stored with jpeg compression) save them as JPEG, thus avoiding recompression, which will increase the size.

Yes, that's what we first thought.

> Or you can try writing custom script for pdfedit that will look in the content stream. Based on how does the watermark look in the content stream it may be anything between easy and almost impossible (depending how hard is for the script to distinguish the watermark images between the images you want to keep :)
> But from the dump of the stream it looks the watermark have always the same size, so it should be quite easy.

Quite impossible at least for me, yes. :-) In fact, in the meantime they're being uploaded as extracted JPG: <http://www.archive.org/search.php?query=subject%3A%22Philosophical+Transactions+of+the+Royal+Society%22>

> Can you send me one of these documents (not to list but to my email), so I may have a look at it without downloading the whole archive?

I will. Anyway, the last archive in the torrent (11.7z) is only ~200 KiB. :-)

Federico |
From: Michal H. <ms...@gm...> - 2011-07-26 14:51:32
|
Follow up: On Tue, Jul 26, 2011 at 12:15:13PM +0200, Michal Hocko wrote: > I have looked at the document and at the first glance the update hasn't > screwed anything obvious. > At first I thought that we haven't updated the number of objects (stored > in the Xref stream in the original revision and Trailer in the new > revision) because those numbers are same for both while we have > obviously added new objects. This turned out to be OK because object > numbers are sparse and we are reusing those numbers which are not > used. > > Then I have looked at the Root object which is reported to be missing > and this started to look interesting. > Original revision reports: > 825 0 obj << > /Type /XRef > /Index [0 826] > /Size 826 > /W [1 3 1] > /Root 823 0 R > /Info 824 0 R > /ID [<9B0D6E3CC66605F7CE12FB9EAAB1356F> > <9B0D6E3CC66605F7CE12FB9EAAB1356F>] > /Length 2230 > /Filter /FlateDecode > >> > > and the new one: > trailer > << > /Size 826 > /Root 823 0 R > /Info 824 0 R > /ID [ <9b0d6e3cc66605f7ce12fb9eaab1356f> > <9b0d6e3cc66605f7ce12fb9eaab1356f> ] > /Prev 773827 > >> > > It is an object with reference number [823 0]. 
> The problem is that I cannot see that object in the file:
> $ grep --binary-files=text "823 0 obj" eflow2.pdf
> $
>
> I guess that it is just embedded somewhere because I can see it with our tools:
> ./tools/pdf_object_printer --ref "823 0" --file ~/tmp/eflow2.pdf
> Document: "/home/miso/tmp/eflow2.pdf"
> [823 0]:
> <<
> /Type /Catalog
> /Pages 800 0 R
> /Outlines 801 0 R
> /Names 822 0 R
> /PageMode /UseOutlines
> /PageLabels <<
> /Nums [ 0 <<
> /S /D
> >> 1 <<
> /S /D
> >> ]
> >>
> /OpenAction 30 0 R
> >>

So the Catalog object [823 0] is really compressed in the ObjStm (stream object) [815 0], which looks as follows (I have skipped objects that are of no interest at the moment):

$ ./tools/pdf_object_printer --ref "815 0" --decode 1 --file ~/tmp/eflow2.pdf
Document: "/home/miso/tmp/eflow2.pdf"
[815 0]:
814 0 816 153 817 314 818 398 819 492 820 584 821 651 822 721 823 742
[...]
<< /Type /Catalog /Pages 800 0 R /Outlines 801 0 R /Names 822 0 R /PageMode/UseOutlines/PageLabels<</Nums[0<</S/D>>1<</S/D>>]>> /OpenAction 30 0 R >>

The xref table which defines your change looks like:

xref
44 1
0000776307 00000 n
71 1
0000776400 00000 n
88 1
0000779952 00000 n
97 1
0000782990 00000 n
798 1
0000786170 00000 n
800 1
0000786290 00000 n

No section refers to object 823. So what could be wrong?

My gut feeling tells me that Acrobat is "buggy" here. Everything above indicates that all new objects have been added correctly and that the document structure is accessible. The problem seems to be that the original revision uses a cross-reference stream while the incremental update uses an xref table. This is perfectly legal according to the PDF specification, AFAIU.

PDFedit, as well as other code based on the original xpdf code (the same goes for poppler), parses all cross-reference tables/streams first, so we know where all objects are stored. We do not care much about xref tables vs. streams because that is handled when an indirect object is referenced.
I guess that Acrobat is complaining because the Root object [823 0] is part of an object stream that is not directly visible from the xref table. Whether this complies with the specification is not 100% clear to me. The specification says (3.4.6 Object Streams):

"Indirect references to objects inside object streams use the normal syntax: for example, 14 0 R. Access to these objects requires a different way of storing cross-reference information; see Section 3.4.7, “Cross-Reference Streams.” Although an application must support PDF 1.5 to use compressed objects, the objects can be stored in a manner that is compatible with PDF 1.4. Applications that do not support PDF 1.5 can ignore the objects; see “Compatibility with PDF 1.4” on page 85."

As you can see, there _is_ a cross-reference stream for this object. The section about incremental updates says (3.4.5 Incremental Updates):

"In an incremental update, any new or changed objects are appended to the file, a cross-reference section is added, and a new trailer is inserted. The resulting file has the structure shown in Figure 3.3. A complete example of an updated file is shown in Section G.6, “Updating Example.” The cross-reference section added when a file is updated contains entries only for objects that have been changed, replaced, or deleted. Deleted objects are left unchanged in the file, but are marked as deleted by means of their cross-reference entries. The added trailer contains all the entries (perhaps modified) from the previous trailer, as well as a Prev entry giving the location of the previous cross-reference section (see Table 3.13 on page 73). As shown in Figure 3.3, a file that has been updated several times contains several trailers; each trailer is terminated by its own end-of-file (%%EOF) marker."

No restrictions on combining xref streams with xref tables are mentioned here.

OK, enough lawyering here.
I would try a newer Acroread (mine is 9.2 and it is affected as well), report the problem to Acrobat, or use PDFedit to flatten the file (this creates a new document containing all reachable objects with an xref table, which you can then update without issues). Hope this helps. -- Michal Hocko |
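To make the "no section refers to object 823" observation concrete, here is a minimal sketch that scans a classic cross-reference section for a given object number. It is not how PDFedit or xpdf parse files; the function name xref_section_lists and the constant kUpdateXref are hypothetical, and the input is simply the xref table quoted in the message above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// The xref section of the incremental update, as quoted in the message above.
const char* const kUpdateXref =
    "xref\n"
    "44 1\n0000776307 00000 n\n"
    "71 1\n0000776400 00000 n\n"
    "88 1\n0000779952 00000 n\n"
    "97 1\n0000782990 00000 n\n"
    "798 1\n0000786170 00000 n\n"
    "800 1\n0000786290 00000 n\n";

// Returns true if a classic cross-reference section (the text starting at the
// "xref" keyword) contains an entry for object number objnum.
bool xref_section_lists(const std::string& section, long objnum) {
    std::istringstream in(section);
    std::string keyword;
    if (!(in >> keyword) || keyword != "xref") {
        return false;
    }
    bool found = false;
    long first = 0;
    long count = 0;
    // Each subsection starts with "<first object number> <entry count>".
    // Reading stops at the first token that is not a number (e.g. "trailer").
    while (in >> first >> count) {
        if (objnum >= first && objnum < first + count) {
            found = true;
        }
        // Skip the entries themselves: "<offset> <generation> <n|f>".
        for (long i = 0; i < count; ++i) {
            std::string offset, generation, type;
            if (!(in >> offset >> generation >> type)) {
                return found;                // truncated section
            }
        }
    }
    return found;
}
```

For the quoted table, a lookup of 800 succeeds while a lookup of 823 does not, so a reader has to follow the trailer's /Prev entry back to the original cross-reference stream to locate the Catalog.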
From: Michal H. <ms...@gm...> - 2011-07-26 10:15:28
|
On Tue, Jul 26, 2011 at 10:39:50AM +0200, Michal Hocko wrote:
> On Tue, Jul 26, 2011 at 09:52:25AM +0200, Klaus Moenig wrote:
> > Hello,
>
> Hi,
>
> > I just compiled pdfedit on SUSE 11.4.
>
> Could you be more specific about the version you have compiled?

Just for the record: I got an answer by private email (the documents were too big for our limit). The version is an up-to-date CVS snapshot.

> > It seems to be a nice program,
> > however I have a major problem. I edit a pdf file that was created with
> > pdflatex. I can read the file later e.g. with okular, however not with
> > acroread, where I get the error message "The root object is missing or
> > invalid". Is there something I can do about this?
>
> Could you share the document (before and after editing)?

I have looked at the document and, at first glance, the update hasn't screwed up anything obvious. At first I thought that we hadn't updated the number of objects (stored in the XRef stream in the original revision and in the trailer in the new revision), because those numbers are the same in both although we have obviously added new objects. This turned out to be OK, because object numbers are sparse and we reuse the numbers that are not in use.

Then I looked at the Root object, which is reported to be missing, and this started to look interesting. The original revision reports:

825 0 obj
<<
/Type /XRef
/Index [0 826]
/Size 826
/W [1 3 1]
/Root 823 0 R
/Info 824 0 R
/ID [<9B0D6E3CC66605F7CE12FB9EAAB1356F> <9B0D6E3CC66605F7CE12FB9EAAB1356F>]
/Length 2230
/Filter /FlateDecode
>>

and the new one:

trailer
<<
/Size 826
/Root 823 0 R
/Info 824 0 R
/ID [ <9b0d6e3cc66605f7ce12fb9eaab1356f> <9b0d6e3cc66605f7ce12fb9eaab1356f> ]
/Prev 773827
>>

The Root is the object with reference number [823 0]. The problem is that I cannot see that object in the file:

$ grep --binary-files=text "823 0 obj" eflow2.pdf
$

I guess that it is just embedded somewhere, because I can see it with our tools:

$ ./tools/pdf_object_printer --ref "823 0" --file ~/tmp/eflow2.pdf
Document: "/home/miso/tmp/eflow2.pdf"
[823 0]: << /Type /Catalog /Pages 800 0 R /Outlines 801 0 R /Names 822 0 R /PageMode /UseOutlines /PageLabels << /Nums [ 0 << /S /D >> 1 << /S /D >> ] >> /OpenAction 30 0 R >>

I will have a look at this later.
--
Michal Hocko |
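A possible explanation for the empty grep result, offered as an assumption rather than a confirmed diagnosis of eflow2.pdf: since PDF 1.5, a file that uses a cross-reference stream (the /Type /XRef dictionary above) may keep objects inside compressed object streams (/Type /ObjStm). Objects stored that way are listed as "number offset" pairs at the start of the stream and carry no "obj"/"endobj" keywords, so the text "823 0 obj" never occurs in the file at all. The Python sketch below illustrates the idea with zlib; the helper name build_object_stream and the two dictionaries are hypothetical stand-ins, not data taken from the real document.

```python
import zlib

# Hypothetical, shortened stand-ins for objects 822 and 823.
names = b"<< /Dests 100 0 R >>"
catalog = b"<< /Type /Catalog /Pages 800 0 R >>"


def build_object_stream(objects):
    """Return (header, data) for the decoded body of an object stream.

    `objects` maps object numbers to serialized bytes. The header holds
    "objnum offset" pairs; the data holds the bare objects, without any
    "N G obj" / "endobj" keywords around them.
    """
    pairs = []
    bodies = []
    offset = 0
    for number, body in sorted(objects.items()):
        pairs.append(b"%d %d" % (number, offset))
        bodies.append(body)
        offset += len(body) + 1  # +1 for the newline separating objects
    return b" ".join(pairs) + b"\n", b"\n".join(bodies)


header, data = build_object_stream({822: names, 823: catalog})
plain = header + data
compressed = zlib.compress(plain)  # roughly what /Filter /FlateDecode stores

needle = b"823 0 obj"
print(needle in compressed)  # what grep on the raw file effectively checks
print(needle in plain)       # absent even after decompression
print(header)                # object 823 is located via its offset instead
```

A tool that parses the cross-reference stream, as pdf_object_printer appears to do, can still resolve [823 0] through this index, which would be consistent with the output shown in the message.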
From: Michal H. <ms...@gm...> - 2011-07-26 08:39:59
|
On Tue, Jul 26, 2011 at 09:52:25AM +0200, Klaus Moenig wrote:
> Hello,

Hi,

> I just compiled pdfedit on SUSE 11.4.

Could you be more specific about the version you have compiled?

> It seems to be a nice program,
> however I have a major problem. I edit a pdf file that was created with
> pdflatex. I can read the file later e.g. with okular, however not with
> acroread, where I get the error message "The root object is missing or
> invalid". Is there something I can do about this?

Could you share the document (before and after editing)?
--
Michal Hocko |
From: Klaus M. <kla...@ce...> - 2011-07-26 08:11:53
|
Hello,

I just compiled pdfedit on SUSE 11.4. It seems to be a nice program; however, I have a major problem. I edited a PDF file that was created with pdflatex. I can read the file afterwards, e.g. with okular, but not with acroread, where I get the error message "The root object is missing or invalid". Is there something I can do about this?

Best wishes,
Klaus Moenig

===================================================
Klaus Moenig
e-mail: Kla...@de...
DESY, Zeuthen/CERN
or: Klaus.Moenig@CERN.ch
Tel.: +41 22 76 74368 (CERN)
      +49 33762 77271 (DESY)
      +41 775002732 (mobile)
=================================================== |
From: Federico L. (Nemo) <nem...@gm...> - 2011-07-23 07:45:40
|
Hello,

you might have heard about <http://arstechnica.com/tech-policy/news/2011/07/swartz-supporter-dumps-18592-jstor-docs-on-the-pirate-bay.ars>

We're now going to upload those ~19000 PDFs to the Internet Archive, but we need to remove a watermark. Could you please suggest how to do it? Sadly I don't know anything about PDF manipulation.

We tried pdfimages, which outputs one .pbm per page plus a .ppm (the footer/watermark); using ImageMagick to recombine the pages into a PDF compressed with LZM produced a PDF almost 3 times as big as the original, so I think it's better to edit the original PDF without converting it to raster formats.

The PDF looks like this: http://p.defau.lt/?8I_tQEf0Q2SZpi9CJx6I8A

Apparently, we need to remove this image:

/GxMWCL: 18 0 R, 187 x 248

which looks like this in other PDFs: http://p.defau.lt/?I1lqfJPL8ociEfOpvTfPaA

How can I do it?

Thank you,
Federico |
From: Jozef <mis...@ho...> - 2011-07-15 13:26:59
|
On 15.7.2011 14:44, Henry Levine wrote:
> Hi
> I would like to add text into a pdf file using a specific truetype font.
>
> How do I specify the font. I have installed the font on windows and
> can use it in
> applications like word etc.
>
> Information:
> windows 7
> font: IDautomations barcode font
>
> Command executed.
> add_text-tool.exe --file=test.pdf --what="*in050100*" --where=1
> --font="IDAutomationHC39M" --p=100 --p=100

This is not (currently) possible with pdfedit. It seems easy, but it is a complex task: the font *must* be embedded into the PDF in a specific format. We decided that for complex editing like this you are better off getting the source document.

jozef

> ideas?
>
> Regards
> Henry Levine
> Leonora Systems (Pty) Ltd |
From: Henry L. <hl...@le...> - 2011-07-15 13:14:41
|
Hi

I would like to add text to a PDF file using a specific TrueType font.

How do I specify the font? I have installed the font on Windows and can use it in applications like Word etc.

Information:
Windows 7
font: IDAutomation barcode font

Command executed:
add_text-tool.exe --file=test.pdf --what="*in050100*" --where=1 --font="IDAutomationHC39M" --p=100 --p=100

Ideas?

Regards
Henry Levine
Leonora Systems (Pty) Ltd
Tel: 011-475-1324
Fax: 011-679-3660
Mobile: 083-269-9505
Fax to Email: 086 670 9287

This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. If you are not the named addressee you should not disseminate, distribute or copy this e-mail, and you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. |
From: Renato P. <ren...@gm...> - 2011-05-28 17:26:49
|
Yes, now I've solved it. With this command

pdftk allegato_7.pdf stamp allegatoN.pdf output out.pdf

I obtain what I was looking for: the content of allegatoN.pdf (made with LibreOffice and exported as PDF) merged in the foreground with allegato_7.pdf and written to out.pdf.

One last thing: is it possible to have the content of allegatoN.pdf only on the first page of allegato_7.pdf?

Thank you
Renato |
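One way the last question could be approached, sketched under assumptions rather than taken from a reply in this thread: pdftk's stamp operation overlays the stamp page on every page of the input, so a common workaround is to split off page 1, stamp only that page, and join it back with the remaining pages. The cat, stamp and output operations, the A=/B= handles and the 2-end range are regular pdftk syntax; the helper name first_page_stamp_commands and the temporary file names are hypothetical, and the sketch assumes the input has at least two pages. It only builds and prints the command lines.

```python
import shlex


def first_page_stamp_commands(doc, stamp, out,
                              first="first.pdf",
                              stamped="first_stamped.pdf"):
    """Return pdftk command lines that stamp only page 1 of `doc`.

    `first` and `stamped` are hypothetical temporary file names.
    """
    return [
        # 1. extract the first page
        ["pdftk", doc, "cat", "1", "output", first],
        # 2. stamp that single page
        ["pdftk", first, "stamp", stamp, "output", stamped],
        # 3. stamped page 1 followed by pages 2..end of the original
        ["pdftk", f"A={stamped}", f"B={doc}",
         "cat", "A", "B2-end", "output", out],
    ]


commands = first_page_stamp_commands(
    "allegato_7.pdf", "allegatoN.pdf", "out.pdf")
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each list could then be passed to subprocess.run(argv, check=True) on a machine where pdftk is installed.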
From: Renato P. <ren...@gm...> - 2011-05-28 12:01:01
|
On Saturday 28 May 2011 04:44:34, Alister Hood wrote:
> Hi Renato,
> Do you mean a single _word_?

Yes, just the name of the attachment.

> Do you want to add it as text or as an
> image?

Which is better? For me it's the same.

> Do you have any more complicated requirements, e.g. do you need to
> automatically stamp it on every page of a pdf, or on a particular page
> of a number of pdfs?

Just on the upper right corner of the first page.

> I think the easiest way may be to use pdftk, which can "Apply a
> Background Watermark or a Foreground Stamp"
> http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
>
> Does that help?
>
> Alister
>
> > -----Original Message-----
> > From: ren...@gm... [mailto:ren...@gm...]
> > Sent: Saturday, 28 May 2011 12:57 a.m.
> > To: pdf...@li...
> > Subject: [Pdfedit-support] how to write a simple word on a pdf
> >
> > Hi,
> > I'm new to this list. I need to 'stamp' a single word on a pdf.
> >
> > I have been looking for a while, but without success.
> >
> > Can someone suggest a solution or some link where I can look?
> >
> > TIA
> >
> > Renato |
From: Renato P. <ren...@gm...> - 2011-05-28 08:14:40
|
Thank you Alister,

yes, my problem is just that. I need to stamp the name of the file (name.pdf) on the upper right corner of the first page. Can I choose the position of the image with pdftk?

TIA
Renato

On Saturday 28 May 2011 04:44:34, Alister Hood wrote:
> Hi Renato,
> Do you mean a single _word_? Do you want to add it as text or as an
> image?
> Do you have any more complicated requirements, e.g. do you need to
> automatically stamp it on every page of a pdf, or on a particular page
> of a number of pdfs?
>
> I think the easiest way may be to use pdftk, which can "Apply a
> Background Watermark or a Foreground Stamp"
> http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
>
> Does that help?
>
> Alister
>
> > -----Original Message-----
> > From: ren...@gm... [mailto:ren...@gm...]
> > Sent: Saturday, 28 May 2011 12:57 a.m.
> > To: pdf...@li...
> > Subject: [Pdfedit-support] how to write a simple word on a pdf
> >
> > Hi,
> > I'm new to this list. I need to 'stamp' a single word on a pdf.
> >
> > I have been looking for a while, but without success.
> >
> > Can someone suggest a solution or some link where I can look?
> >
> > TIA
> >
> > Renato |
From: Alister H. <ali...@sy...> - 2011-05-28 02:58:29
|
Hi Renato,

Do you mean a single _word_? Do you want to add it as text or as an image?
Do you have any more complicated requirements, e.g. do you need to automatically stamp it on every page of a pdf, or on a particular page of a number of pdfs?

I think the easiest way may be to use pdftk, which can "Apply a Background Watermark or a Foreground Stamp"
http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/

Does that help?

Alister

> -----Original Message-----
> From: ren...@gm... [mailto:ren...@gm...]
> Sent: Saturday, 28 May 2011 12:57 a.m.
> To: pdf...@li...
> Subject: [Pdfedit-support] how to write a simple word on a pdf
>
> Hi,
> I'm new to this list. I need to 'stamp' a single word on a pdf.
>
> I have been looking for a while, but without success.
>
> Can someone suggest a solution or some link where I can look?
>
> TIA
>
> Renato |
From: <ren...@gm...> - 2011-05-27 12:57:41
|
Hi,

I'm new to this list. I need to 'stamp' a single word on a PDF.

I have been looking for a while, but without success.

Can someone suggest a solution, or a link where I can look?

TIA

Renato |
From: Eric D. <er...@do...> - 2011-05-15 13:15:14
|
On 05/15/2011 08:54 AM, Reiner Miericke wrote:
> I thought the image is stored in one bunch somewhere on a page and all I have
> to do is
> - extract the image
> - convert it to a better compression
> - encode the image for PDF
> - and just exchange it (same position and dimensions etc)

I've never done anything like this before, but below are some tips that I found on this page:

http://forums.debian.net/viewtopic.php?f=10&t=55341

The script shows how OCR software can extract the text and how the file size of the images can be reduced. You could probably modify it to meet your needs.

Good Luck!,
- Eric

Step One: Install the necessary packages:

apt-get install gocr imagemagick libjpeg-progs pdftk poppler-utils

Step Two: Create a script like the following:

#!/bin/bash

## script to:
##  * split a PDF up by pages
##  * convert them to an image format
##  * read the text from each page
##  * concatenate the pages

## we will do all work in a temporary directory
## so remember where we started
DIR=$( pwd )

## pass name of PDF file to script
INFILE=$1
if [ -z "$INFILE" ] ; then
    printf "No file specified. Exiting.\n"
    exit 1
fi
if [ ! -f "$INFILE" ] ; then
    printf "%s is not a file. Exiting.\n" "$INFILE"
    exit 1
fi

## create temp directory and CD into it
## but get rid of anything that used to live there first
if [ -d /tmp/image2text ] ; then
    rm -rf /tmp/image2text
fi
mkdir /tmp/image2text
cp "$INFILE" /tmp/image2text/.
cd /tmp/image2text || exit 1

## split PDF file into pages, resulting files will be
## numbered: pg_0001.pdf pg_0002.pdf pg_0003.pdf
pdftk "$INFILE" burst

## make sure file was burst
if [ ! -f pg_0001.pdf ] ; then
    printf "Failed to burst %s. Exiting.\n" "$INFILE"
    exit 1
else
    ## do you really need doc_data.txt ???
    rm doc_data.txt
fi

## now let's turn each PDF page into text
for i in pg*.pdf ; do

    ## convert it to a PPM image file at 600 dots per inch
    pdftoppm -r 600 "$i" "${i%.pdf}.ppm"

    ## make sure the command worked
    if [ -f "${i%.pdf}.ppm-1.ppm" ] ; then
        ## change the goofy file name
        mv "${i%.pdf}.ppm-1.ppm" "${i%.pdf}.ppm"
    else
        printf "The PPM file: %s was not created. Exiting.\n" "${i%.pdf}.ppm-1.ppm"
        exit 1
    fi

    ## convert the file to a JPEG image with ImageMagick
    ## scanning the JPEG yields slightly better results
    ## and you get a much smaller file size
    convert "${i%.pdf}.ppm" "${i%.pdf}.jpg"

    ## make sure the command worked
    if [ -f "${i%.pdf}.jpg" ] ; then
        ## get rid of the massive PPM file and the PDF file
        rm "${i%.pdf}.ppm" "$i"
    else
        printf "The JPG file: %s was not created. Exiting.\n" "${i%.pdf}.jpg"
        exit 1
    fi

    ## read text from the page
    djpeg -pnm "${i%.pdf}.jpg" | gocr - > "${i%.pdf}.txt"

    ## make sure the command worked
    if [ -f "${i%.pdf}.txt" ] ; then
        ## get rid of the JPG file
        rm "${i%.pdf}.jpg"
    else
        printf "The TXT file: %s was not created. Exiting.\n" "${i%.pdf}.txt"
        exit 1
    fi
done

## concatenate the pages into a single text file
cat pg*.txt > "$DIR/${INFILE%.pdf}.txt"

## remove the temporary directory
cd "$DIR" || exit 1
if [ -f "${INFILE%.pdf}.txt" ] ; then
    rm -rf /tmp/image2text
    ## get out of here!
    printf "All done. Have fun! \n"
else
    printf "Failed to generate %s\n" "${INFILE%.pdf}.txt"
    printf "Individual text files can be found in: /tmp/image2text/ \n"
fi
exit |
From: Reiner M. <re...@mi...> - 2011-05-15 12:51:36
|
Thanks for answering, Jozef.

I thought the image is stored in one piece somewhere on a page and all I have to do is
- extract the image
- convert it to a better compression
- encode the image for PDF
- and just exchange it (same position, dimensions etc.)

Isn't it that easy, so that it can be done e.g. by a Perl script?

On Friday, 13 May 2011, 16:45:08, Jozef wrote:
> On 13.5.2011 15:34, Reiner Miericke wrote:
> > Is there really nobody around who knows how to exchange images by smaller
> > ones (gray --> black/white) in PDF files without losing the OCR text?
>
> Well, this question is a difficult one. It depends on how the OCR
> software has done it but there is no easy straightforward way how to do
> it in pdfedit at this moment.
> If it is about removing, reducing, inserting back an image then the
> script language could help you but really not an easy way.
>
> that would require deeper investigation which I doubt that somebody
> would sacrifice his free time on this marginal issue. Sorry.
>
> jozef
>
> >> I have some scanned books, containing black/white pages. Currently a
> >> page takes about 1MB of disk space.
> >> * Is there a way to reduce the amount to eg. 500 kB without losing the
> >> (OCR-) text ?

--
Best regards
Reiner Miericke |
From: Jozef <mis...@ho...> - 2011-05-13 14:45:18
|
On 13.5.2011 15:34, Reiner Miericke wrote:
> Is there really nobody around who knows how to exchange images by smaller ones
> (gray --> black/white) in PDF files without losing the OCR text?

Well, this question is a difficult one. It depends on how the OCR software has done it, but there is no easy, straightforward way to do it in pdfedit at the moment. If it is about removing an image, reducing it and inserting it back, then the script language could help you, but it is really not an easy way.

That would require deeper investigation, and I doubt that somebody would sacrifice his free time on this marginal issue. Sorry.

jozef

>> I have some scanned books, containing black/white pages. Currently a page
>> takes about 1MB of disk space.
>> * Is there a way to reduce the amount to eg. 500 kB without losing the
>> (OCR-) text ?
>
> Best regards
> Reiner Miericke |
From: Reiner M. <re...@mi...> - 2011-05-13 13:31:57
|
Is there really nobody around who knows how to replace images with smaller ones (gray --> black/white) in PDF files without losing the OCR text?

On Thursday, 5 May 2011, 13:11:56, Reiner Miericke wrote:
> I have some scanned books, containing black/white pages. Currently a page
> takes about 1MB of disk space.
> * Is there a way to reduce the amount to e.g. 500 kB without losing the
> (OCR-) text ?

Best regards
Reiner Miericke |
From: misiu <tj...@ya...> - 2011-05-09 11:18:12
|
Hi,

whenever I use addText on the PDF, the content of the page changes. Is it because my PDF is bad, or is there some bug in pdfedit? I'm using version 0.4.5-20101103055546 (the Debian one). |