From: Jeff A. <ja...@fa...> - 2014-04-18 14:18:49
This was harder than it looked. I've pretty much written it three times over now. This post is by way of a progress report.

I'm getting good (perfect maybe) conformance with CPython for both new-style format calls and float.__repr__. I've done this by evolving the InternalFormatter class (in core/stringlib/Formatter.java), beyond all recognition.

I'm getting BigDecimal to do conversion and rounding for format types accepted by float.__format__, and for __str__. This is fully conformant. BigDecimal rounding cannot be made to emulate float.__repr__ by itself, but it appears that Double.toString(double) does exactly what we need when it gives answers up to 17 digits. Occasionally it provides 18 and we have to do some rounding. Rounding a rounded result isn't quite right (think 0.49 rounded to 1 d.p. then to an integer), but it seems to work here.

I have evolved the format parsing, and put the padding code in the formatter, to simplify the division of responsibilities between the client (float) and the formatter. Now it looks a lot like using a StringBuilder. I think this will increase cohesion, and reduce the number of source files, once applied to all the types where we currently use InternalFormatSpec.

Our way of supplying defaults, for the conversion type and alignment, diverges from CPython in a way that seems ultimately to complicate the client code. I have moved it closer to the CPython approach, which supplies defaults when the specification is parsed <http://hg.python.org/cpython/file/3a1db0d2747e/Objects/stringlib/formatter.h#l1315>.

In passing, I've addressed this issue: http://bugs.python.org/issue12546 which exists unreported in Jython.

complex is in a worse state than float, now I look: we are skipping a ton of tests. I'll apply my work there too. I'm leaving %-formats alone for now, although they also fail tests. I'd like to re-use what I've done for that.
With that in view, I've chosen to support alternate (#g) format in my code, temporarily defining h-format to mean that. I get agreement between the bogus format(x, "h") and "%#g" % x in CPython.

Jeff Allen

On 09/04/2014 08:02, Jeff Allen wrote:
> I wrote a pretty good replacement for core.stringlib.Formatter, based on
> java.text.DecimalFormatter. It agrees with CPython for all "reasonable"
> precisions, that is to say up to 16 significant figures, and no-one has
> had to take the log of anything. It can provide float.__str__ and
> float.__repr__ more cleanly than at present too.
>
> But it fails tests like this: AssertionError:
> '10000000000000000000000000000000000000000000000000' !=
> '9999999999999999464902769475481793196872414789632'
>
> I haven't come across any cases where the problem is that the actual
> float value is different (considered at the bit-level using
> float.hex()); it's all on the output formatting. CPython appears to give
> us the exact value -- there is always a finite answer, though it may
> have a thousand digits. DecimalFormatter, which is at bottom just
> Double.toString(), gives us just enough digits to differ from the next
> float bit pattern.
>
> Experiments with BigDecimal suggest I can get exact conformance by that
> route and it seems worth the attempt, so I'm going that way.
>
> Jeff
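
For anyone following the thread, here is a minimal Python sketch of the two effects discussed above, as they appear on the CPython side that Jython is trying to match. It is an illustration only and not the Jython code: the sample value 1e49 and the helper round_half_up are assumptions chosen for the example, and Python's decimal module stands in for what BigDecimal does in Java.

```python
from decimal import Decimal, ROUND_HALF_UP

# 1. Exact value versus shortest round-trip representation.
#    1e49 is an assumed sample value; it has no exact binary double.
x = 1e49
exact = Decimal(x)         # exact decimal expansion of the stored double
fixed = format(x, ".0f")   # CPython formats from the exact stored value
shortest = repr(x)         # just enough digits to identify this double

print(exact)
print(fixed)
print(shortest)


# 2. Rounding an already rounded result (double rounding).
#    round_half_up is a hypothetical helper written for this sketch.
def round_half_up(value, places):
    """Round a Decimal to `places` decimal places, ties away from zero."""
    quantum = Decimal(10) ** -places
    return value.quantize(quantum, rounding=ROUND_HALF_UP)


v = Decimal("0.49")
once = round_half_up(v, 0)                     # one step from the true value
twice = round_half_up(round_half_up(v, 1), 0)  # via 1 d.p., then to integer

print(once, twice)
```

On CPython the first two printed lines are the same long digit string, while repr gives the short form. The last line shows the one-step and two-step roundings disagreeing, which is why rounding an 18-digit Double.toString() result a second time needs care.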