From: Gwurk <gw...@fi...> - 2008-01-24 02:41:16
|
Hello, I ran into the problem where some CReals were being written in scientific notation (e.g., "-4e-05" for -0.00004) into PDF streams, which caused the stream to be invalid, giving errors like "Number of operands mismatch" in ccontentstream.cc. It looks like the problem is in cobject.cc, in simpleValueToString<pReal>(). You do sprintf(buf,"%g",val); and %g uses scientific notation if the number is small (or big) enough. Is there some way to fix this? Thanks! |
From: Michal H. <ms...@gm...> - 2008-01-24 11:37:42
Attachments:
real-to-string.patch
|
On Jan 24, 2008 3:41 AM, Gwurk <gw...@fi...> wrote: > Hello, > > I ran into the problem where some CReals were being written in > scientific notation (e.g., "-4e-05" for -0.00004) into PDF streams, > which caused the stream to be invalid, giving errors like "Number of > operands mismatch" in ccontentstream.cc. > > It looks like the problem is in cobject.cc, in > simpleValueToString<pReal>(). You do > sprintf(buf,"%g",val); > and %g uses scientific notation if the number is small (or big) > enough. Is there some way to fix this? Without need to check the value I am thinking of %f instead of %g. This has small disadvantage - loose of precision and constant width also for numbers which don't need that much of space (e.g 0.2 would be 0.20000 if precision is 5). Attached patch checks the value and decides which format to use. Thanks! > -- Michal Hocko |
From: Martin P. <ma...@pe...> - 2008-01-24 15:13:21
|
> Without need to check the value I am thinking of %f instead of %g. > This has small disadvantage - loose of precision and constant > width also for numbers which don't need that much of space > (e.g 0.2 would be 0.20000 if precision is 5). > > Attached patch checks the value and decides which format > to use. Isn't there better way to do this? What says the PDF specs about scientific notation in PDF? Perhaps we should make our routine to convert double to string which will handle that correctly, as loss of precision may be insignificant sometimes, but sometimes this can have big impact (for very thin lines, very large/small documents, etc ...) Martin Petricek GPG/PGP Public key: http://www.petricek.net/petricm.pgp Fingerprint 6AA8 FFCE C061 1CB2 55F0 A1F3 3AA9 EB4F BD50 C1B8 /------------------------------------------------------------\ | WWW: http://www.petricek.net/ | \------------------------------------------------------------/ |
From: Jozef M. <mis...@ho...> - 2008-01-25 09:47:21
|
just to note 1) kernel tests are not working with the patch (not the problem of the patc= h, though) 2) you are using magic numbers, provide at least a comment why=20 3) 12345.12345 will be rounded to 12345.1 using the patch=20 4) i would simply replace f for g OR provide our function for printing it note: despite the fact that small number change can be significant we haven= 't had problems so far =20 so better do not patch it =20 jozo =20 Date: Thu, 24 Jan 2008 12:37:44 +0100From: ms...@gm...To: gwurk@figpo= st.comCC: pdf...@li...Subject: Re: [Pdfedit-devel] S= cientific notation numbers inserted into PDFs by simpleValueToString<pReal> On Jan 24, 2008 3:41 AM, Gwurk <gw...@fi...> wrote: Hello,I ran into the problem where some CReals were being written inscienti= fic notation (e.g., "-4e-05" for -0.00004) into PDF streams,which caused th= e stream to be invalid, giving errors like "Number of operands mismatch" in= ccontentstream.cc.It looks like the problem is in cobject.cc, insimpleValu= eToString<pReal>(). You do sprintf(buf,"%g",val);and %g uses scientific n= otation if the number is small (or big) enough. Is there some way to fix th= is? Without need to check the value I am thinking of %f instead of %g.This has = small disadvantage - loose of precision and constantwidth also for numbers = which don't need that much of space (e.g 0.2 would be 0.20000 if precision = is 5).Attached patch checks the value and decides which formatto use. Thanks!-- Michal Hocko=20 _________________________________________________________________ Climb to the top of the charts!=A0Play the word scramble challenge with sta= r power. http://club.live.com/star_shuffle.aspx?icid=3Dstarshuffle_wlmailtextlink_ja= n= |
From: Michal H. <ms...@gm...> - 2008-01-25 10:04:55
|
On Jan 25, 2008 10:47 AM, Jozef Misutka <mis...@ho...> wrote: > just to note > 1) kernel tests are not working with the patch (not the problem of the > patch, though) > So problem of what? > > 2) you are using magic numbers, provide at least a comment why > Yes, this was quick patch as a suggestion for discussion and numbers come from man printf and %g formater. > > 3) 12345.12345 will be rounded to 12345.1 using the patch > This means that %g is used (which is done also at the current code)! > > 4) i would simply replace f for g OR provide our function for printing it > *%f* has disadvantage mentioned in my reply to Gwurk (fixed number of precision). Our own function would require much more effort. > note: despite the fact that small number change can be significant we > haven't had problems so far > Maybe we just haven't thought that some problems could be triggered by this (we had some strange bugs, where writing a line caused cut off top of the page - it is in bugzilla). > > so better do not patch it > I don't plan to commit it until we will discuss it. All suggestions are welcome. > jozo > P.S. Always reply to all in devel mailing list because of non members which are allowed to post to the list (like Gwurk atc...). You can of course skip me or Martin, but rather do not this for others. -- Michal Hocko |
From: Jozef M. <mis...@ho...> - 2008-01-25 10:17:57
|
1) in patched code 0.002 -> CObject -> 0.00200, 0.002 -> xpdf Object -> 0.0= 02 some tests expect 0.002.. jozo =20 PS: please, always try kernel tests before submitting a patch, committing..= .. =20 Date: Fri, 25 Jan 2008 11:04:59 +0100From: ms...@gm...To: pdfedit-dev= el...@li...Subject: Re: [Pdfedit-devel] Scientific notation nu= mbers inserted into PDFs by simpleValueToString<pReal> On Jan 25, 2008 10:47 AM, Jozef Misutka <mis...@ho...> wrote: just to note1) kernel tests are not working with the patch (not the problem= of the patch, though) So problem of what?=20 2) you are using magic numbers, provide at least a comment why=20 Yes, this was quick patch as a suggestion for discussion and numbers come f= romman printf and %g formater. 3) 12345.12345 will be rounded to 12345.1 using the patch=20 This means that %g is used (which is done also at the current code)!=20 4) i would simply replace f for g OR provide our function for printing it *%f* has disadvantage mentioned in my reply to Gwurk (fixed number ofprecis= ion). Our own function would require much more effort.=20 note: despite the fact that small number change can be significant we haven= 't had problems so far Maybe we just haven't thought that some problems could be triggered by this= (we had some strange bugs, where writing a line caused cut off top of the = page - it is in bugzilla). so better do not patch it I don't plan to commit it until we will discuss it. All suggestions are wel= come. jozoP.S.Always reply to all in devel mailing list because of non memberswh= ich are allowed to post to the list (like Gwurk atc...). You canof course s= kip me or Martin, but rather do not this for others. -- Michal Hocko=20 _________________________________________________________________ Climb to the top of the charts!=A0Play the word scramble challenge with sta= r power. http://club.live.com/star_shuffle.aspx?icid=3Dstarshuffle_wlmailtextlink_ja= n= |
From: Michal H. <ms...@gm...> - 2008-02-26 13:48:41
|
On Wed, Jan 23, 2008 at 09:41:20PM -0500, Gwurk wrote: > Hello, Hi Gwurk. > > I ran into the problem where some CReals were being written in > scientific notation (e.g., "-4e-05" for -0.00004) into PDF streams, > which caused the stream to be invalid, giving errors like "Number of > operands mismatch" in ccontentstream.cc. Could you send a test document which this problem, I would like to do some better fix than one sent in this thread. Thx. > > It looks like the problem is in cobject.cc, in > simpleValueToString<pReal>(). You do > sprintf(buf,"%g",val); > and %g uses scientific notation if the number is small (or big) > enough. Is there some way to fix this? > > Thanks! -- Michal Hocko |
From: Michal H. <ms...@gm...> - 2008-02-27 09:58:29
Attachments:
real-to-string_v3.patch
|
On Fri, Jan 25, 2008 at 09:47:22AM +0000, Jozef Misutka wrote: > just to note > 1) kernel tests are not working with the patch (not the problem of the patch, though) > 2) you are using magic numbers, provide at least a comment why > 3) 12345.12345 will be rounded to 12345.1 using the patch > 4) i would simply replace f for g OR provide our function for printing it > note: despite the fact that small number change can be significant we haven't had problems so far > > so better do not patch it > > jozo I have updated patch and tests are working now. Basic idea is to always use %f (like you have suggested) and trim all trailing zeros after decimal point. So that we don't loose precision (we are using 5 decimal precision as described in PDF specification) and also we don't waste any space. @Jozo: please review and after you give it ack, I will commit it to the CVS. @Gwurk: This patch should be preferred because it better interacts with our test suite. -- Michal Hocko |
From: Jozef M. <mis...@ho...> - 2008-02-27 14:03:26
|
0) i think it is ok but i strongly recommend using snprintf instead of sprintf! 1) i will try to compile it under win later and fix it if necessary 2) you could have used http://beta.boost.org/doc/libs/1_33_1/doc/html/trim_if.html or put it to std::string and used reverse iterator / functions connected with string.. but your implementation seems a little bit faster. jozo ---------------------------------------- > Date: Wed, 27 Feb 2008 10:58:22 +0100 > From: ms...@gm... > To: mis...@ho... > CC: pdf...@li...; gw...@fi... > Subject: Re: [Pdfedit-devel] Scientific notation numbers inserted into PDFs by simpleValueToString > > On Fri, Jan 25, 2008 at 09:47:22AM +0000, Jozef Misutka wrote: >> just to note >> 1) kernel tests are not working with the patch (not the problem of the patch, though) >> 2) you are using magic numbers, provide at least a comment why >> 3) 12345.12345 will be rounded to 12345.1 using the patch >> 4) i would simply replace f for g OR provide our function for printing it >> note: despite the fact that small number change can be significant we haven't had problems so far >> >> so better do not patch it >> >> jozo > > I have updated patch and tests are working now. > Basic idea is to always use %f (like you have suggested) and trim all > trailing zeros after decimal point. So that we don't loose precision (we > are using 5 decimal precision as described in PDF specification) and > also we don't waste any space. > > @Jozo: please review and after you give it ack, I will commit it to the > CVS. > > @Gwurk: This patch should be preferred because it better interacts with > our test suite. > -- > Michal Hocko _________________________________________________________________ Helping your favorite cause is as easy as instant messaging. You IM, we give. http://im.live.com/Messenger/IM/Home/?source=text_hotmail_join |
From: Michal H. <ms...@gm...> - 2008-02-27 14:21:56
|
On Wed, Feb 27, 2008 at 02:03:24PM +0000, Jozef Misutka wrote: > > 0) i think it is ok but i strongly recommend using snprintf instead of sprintf! Ok, I will commit it with snprintf. Originally I had snprintf based implementation, but later I got inspired by simpleValueToString<pInt> which uses simple sprintf. But you are right that snprintf should be always preferred. > 1) i will try to compile it under win later and fix it if necessary OK. > 2) you could have used > http://beta.boost.org/doc/libs/1_33_1/doc/html/trim_if.html or > put it to std::string and used reverse iterator / functions connected with string.. > but your implementation seems a little bit faster. Of course, there are tons of ways how to do it ;). I have focused on speed, so char * -> string -> trim -> char * is not the best solution IMO. > > jozo Thanks for review. -- Michal Hocko |
From: Michal H. <ms...@gm...> - 2008-02-27 14:29:26
|
On Wed, Feb 27, 2008 at 03:21:51PM +0100, Michal Hocko wrote: > On Wed, Feb 27, 2008 at 02:03:24PM +0000, Jozef Misutka wrote: [...] > > 2) you could have used > > http://beta.boost.org/doc/libs/1_33_1/doc/html/trim_if.html or > > put it to std::string and used reverse iterator / functions connected with string.. > > but your implementation seems a little bit faster. > > Of course, there are tons of ways how to do it ;). I have focused on > speed, so char * -> string -> trim -> char * is not the best solution > IMO. I have just realized that we are creating string anyway. So the last my argument is not really relevant. Sorry for noise. -- Michal Hocko |
From: Gwurk <gw...@fi...> - 2008-02-27 16:50:04
|
On 2/27/08, Michal Hocko <ms...@gm...> wrote: > @Gwurk: This patch should be preferred because it better interacts with > our test suite. Thanks! Do you still need a PDF with the invalid numbers generated by %g? I don't have the PDF with that problem anymore, and it's going to take me a little while to figure out how to regenerate it. |
From: Michal H. <ms...@gm...> - 2008-02-27 17:17:43
|
On Wed, Feb 27, 2008 at 11:50:03AM -0500, Gwurk wrote: > On 2/27/08, Michal Hocko <ms...@gm...> wrote: > > @Gwurk: This patch should be preferred because it better interacts with > > our test suite. > > Thanks! Do you still need a PDF with the invalid numbers generated by > %g? I don't have the PDF with that problem anymore, and it's going to > take me a little while to figure out how to regenerate it. It would be usefull, but if it is too much effort for you, then let it be. Important think is that this *cannot* happen anymore because %f never uses exponent. -- Michal Hocko |