Jeff,

Wow, this is really impressive!

Some of us have dipped our toes into this problem any number of times, but this work is so nicely comprehensive in comparison.

As you noted, it would be nice to avoid using BigDecimal. However, in looking at your changes, you are already using Double.doubleToLongBits and Math.getExponent (which presumably are intrinsified). So using BigDecimal for part of the heavy lifting makes sense to me. Perhaps this is something that can be revisited in 2.7.1.

Thanks again!

- Jim



On Wed, Apr 23, 2014 at 4:40 PM, Jeff Allen <ja.py@farowl.co.uk> wrote:
I've now submitted a revised float.__format__, and one for complex too. It is correct (nearly) but I have two observations:
1. The contract for Double.toString(double) appears to be exactly what we need for float.__repr__, but it doesn't keep its contract every time (http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638). However, I am using this for repr().
2. The conversions in float.__format__ (and my fixed version of round) are based on BigDecimal. This is correct, but slow for numbers with a large exponent, especially when the exponent is large and negative.

To do any better than this, I believe we would have to implement our own conversion, as CPython has. This may yet have to happen if performance of float.__format__ turns out to be an issue.
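A rough illustration of why an exact conversion gets expensive at extreme exponents: the exact decimal expansion of a double grows to hundreds of digits as the exponent moves away from zero. This is only a sketch in Python, using decimal.Decimal as a stand-in for Java's BigDecimal; the helper name exact_digit_count is made up for the example and it says nothing about the actual timings in the Jython code.

```python
from decimal import Decimal

def exact_digit_count(x):
    """Count the significant digits in the exact decimal value of float x."""
    # Decimal(float) converts exactly, with no rounding to the context.
    return len(Decimal(x).as_tuple().digits)

# Digit counts grow with the magnitude of the exponent,
# and fastest for very small (large negative exponent) values.
for x in (1.5, 1e49, 1e300, 1e-300, 5e-324):
    print(repr(x), exact_digit_count(x))
```

Any arithmetic or rounding on such a value has to touch all of those digits, which is presumably where the time goes.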

%-formatting of floats is uncorrected. I'd like to make it use the new code but this isn't entirely straightforward.

Jeff
Jeff Allen
On 18/04/2014 15:18, Jeff Allen wrote:
This was harder than it looked. I've pretty much written it three times over now. This post is by way of a progress report.

I'm getting good (perfect maybe) conformance with CPython for both new-style format calls and float.__repr__. I've done this by evolving the InternalFormatter class (in core/stringlib/Formatter.java), beyond all recognition.

I'm getting BigDecimal to do conversion and rounding for format types accepted by float.__format__, and for __str__. This is fully conformant.

BigDecimal rounding cannot be made to emulate float.__repr__ by itself, but it appears that Double.toString(double) does exactly what we need when it gives answers up to 17 digits. Occasionally it provides 18 and we have to do some rounding. Rounding a rounded result isn't quite right (think 0.49 rounded to 1 d.p. then to an integer), but it seems to work here.
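The 0.49 example can be checked directly. The sketch below uses Python's decimal module with ROUND_HALF_UP, which is an assumption made for the illustration (the rounding mode is not stated above); it is not the rounding code in the formatter.

```python
from decimal import Decimal, ROUND_HALF_UP

x = Decimal("0.49")

# Round once, straight to an integer.
once = x.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Round to one decimal place first, then round that result to an integer.
step = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
twice = step.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

print(once, step, twice)  # prints: 0 0.5 1
```

The single rounding gives 0, while the two-stage rounding gives 1, because the intermediate 0.5 has lost the information that the original value was below one half.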

I have evolved the format parsing, and put the padding code in the formatter, to simplify the division of responsibilities between the client (float) and the formatter. Now it looks a lot like using a StringBuilder. I think this will increase cohesion, and reduce the number of source files, once applied to all the types where we currently use InternalFormatSpec.

Our way of supplying defaults, for the conversion type and alignment, diverges from CPython in a way that seems ultimately to complicate the client code. I have moved it closer to the CPython approach, which supplies defaults when the specification is parsed. In passing, I've addressed this issue: http://bugs.python.org/issue12546 which exists unreported in Jython.

complex is in a worse state than float, now I look: we are skipping a ton of tests. I'll apply my work there too.

I'm leaving %-formats alone for now, although they also fail tests. I'd like to re-use what I've done for that. With that in view, I've chosen to support alternate (#g) format in my code, temporarily defining h-format to mean that. I get agreement between the bogus format(x, "h") and "%#g" % x in CPython.
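For reference, this is how the alternate form differs from plain g-format in CPython's %-formatting. It is a small sketch of the target behaviour only; the temporary h-format mentioned above is specific to the work in progress and is not shown.

```python
# Plain %g strips trailing zeros (and a bare decimal point);
# the alternate form %#g keeps them, padding to the precision (6 by default).
for x in (1.0, 0.5, 100000.0):
    plain = "%g" % x
    alternate = "%#g" % x
    print(plain, alternate)
```

For 1.0 the two results are "1" and "1.00000".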
 
Jeff Allen

On 09/04/2014 08:02, Jeff Allen wrote:
I wrote a pretty good replacement for core.stringlib.Formatter, based on 
java.text.DecimalFormat. It agrees with CPython for all "reasonable" 
precisions, that is to say up to 16 significant figures, and no-one has 
had to take the log of anything. It can provide float.__str__ and 
float.__repr__ more cleanly than at present too.

But it fails tests like this: AssertionError: 
'10000000000000000000000000000000000000000000000000' != 
'9999999999999999464902769475481793196872414789632'

I haven't come across any cases where the problem is that the actual 
float value is different (considered at the bit-level using 
float.hex()); it's all in the output formatting. CPython appears to give 
us the exact value -- there is always a finite answer, though it may 
have a thousand digits. DecimalFormat, which is at bottom just 
Double.toString(), gives us just enough digits to differ from the next 
float bit pattern.
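The two behaviours can be seen side by side in CPython. This is a sketch assuming the failing test value above is 1e49; decimal.Decimal is used here only to confirm that the long digit string is the exact stored value.

```python
from decimal import Decimal

x = 1e49

shortest = repr(x)        # just enough digits to round-trip to the same float
exact = format(x, ".0f")  # all the digits of the stored binary value

print(shortest)
print(exact)

# Both strings identify the same float; only the formatting differs.
print(float(shortest) == x, exact == str(Decimal(x)))
```

The shortest form is what a Double.toString()-style conversion aims for, while fixed-point formatting with a large enough precision has to produce the full expansion.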

Experiments with BigDecimal suggest I can get exact conformance by that 
route and it seems worth the attempt, so I'm going that way.

Jeff





_______________________________________________
Jython-dev mailing list
Jython-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jython-dev

