#1173 (v)snprintf problems with %p, %f /e/g and minus infinity

closed
2008-08-24
2008-05-30
Cary R.
No

The attached programs demonstrate three problems in the (v)snprintf routines.

First %p uses p as the second digit and displays as lower case. Other printf routines have a 0 for the second digit and display as upper case. This causes problems when you are using the address as the object identity. See the snprintf code for a complete example.

Next when using the %*.* syntax for %f the output is completely wrong. For %e and %g the output has something that looks correct. The field width and precision are given as -1, -1. This should read as a left justified minimum field width of 1 and default precision.

The last problem is that minus infinity is displayed as positive infinity.

It would also be nice if NaN and infinity were displayed using the more standard nan and inf instead of NaN and Infinity. This allows automatic comparisons against the output from other machines.

Discussion

  • Cary R.
    Cary R.
    2008-05-30

     
    Attachments
  • Cary R.
    Cary R.
    2008-05-30

    Logged In: YES
    user_id=1651735
    Originator: YES

    File Added: vsnprintf-percent-f.c

     
  • Cary R.
    Cary R.
    2008-05-30

     
    Attachments
  • Cary R.
    Cary R.
    2008-05-30

    Logged In: YES
    user_id=1651735
    Originator: YES

    I forgot to mention this is with the 3.14 runtime and a current setup.

     
  • Keith Marshall
    Keith Marshall
    2008-05-30

    Logged In: YES
    user_id=823908
    Originator: NO

    > The last problem is that minus infinity is displayed as
    > positive infinity.

    And similarly, minus zero is displayed as positive zero. See:
    https://sourceforge.net/tracker/index.php?func=detail&aid=1978642&group_id=2435&atid=352435

    > It would also be nice if NaN and infinity were displayed
    > using the more standard nan and inf instead of NaN and
    > Infinity. This allows automatic comparisons against the
    > output from other machines.

    But not, perhaps, against the output from Microsoft's standard _*printf() implementations. I really don't think there is any definitive standard for how these forms should be displayed.

     
  • Keith Marshall
    Keith Marshall
    2008-05-30

    • milestone: 507559 -->
     
  • Cary R.
    Cary R.
    2008-05-30

    Logged In: YES
    user_id=1651735
    Originator: YES

    > But not, perhaps, against the output from Microsoft's
    > standard _*printf() implementations. I really don't think
    > there is any definitive standard for how these forms
    > should be displayed.

    Yes these routines are definitely better than the Microsoft results! While there is not a definitive standard there appears to be a de facto standard. Cygwin and Linux produce nan and inf. I believe, but would have to double check, modern apple machines (BSD based) also produce nan and inf.

    The minimal fix for me would be to have vsnprintf work as described above and to have snprintf produce compatible %p output. We us both vsnprintf and vfprintf depending on the where the output is going; vfprintf if it is going to a file and vsnprintf for everything else, so it would be nice if these two routines gave the same output. Since this is probably too much to ask I can easily add a MinGW specific section of code to only use vsnprintf if I need to.

     
  • Keith Marshall
    Keith Marshall
    2008-06-01

    Logged In: YES
    user_id=823908
    Originator: NO

    Ok, I will follow up on this.

    However, I do have some questions regarding what `%p' should do; here is what SUSv3, and IEEE-1003.1-2001 (i.e. POSIX) have to say about it:

    | p
    | The argument shall be a pointer to void. The value of the
    | pointer is converted to a sequence of printable characters,
    | in an implementation-defined manner.

    Notice that the representation is specifically `implementation defined'; thus user written code should never expect any particular representation, if it is intended to be portable.

    Looking at the existing mingw_sprintf.c code, it would appear that it will print a pointer as a hexadecimal representation, with lower case `a'..`f' where appropriate, and prefixed by the `0p' format indicator. That is certainly an acceptable `implementation defined' representation, so I see no problem with standards conformance. However, on my GNU/Linux box, `%p' prints the equivalent of `%#x', again printing a *lower* case hexadecimal representation, but with `0x' as the prefix; this seems a more rational representation to me. What I do not see, is any justification for enforcing upper case hexadecimal representation, with `0' as the second character.

    Also, in respect of the INF and NAN representations, POSIX *does* state that `%e', `%f' and `%g' *should* print either `inf' or `infinity', and `nan' respectively, while `%E', `%F' and `%G' should print `INF' or `INFINITY', and `NAN' respectively, but there is no provision for any mixed case representation, so I may bring that into line with the standard.

     
  • Keith Marshall
    Keith Marshall
    2008-06-01

    • assigned_to: nobody --> keithmarshall
     
  • Cary R.
    Cary R.
    2008-06-01

    Logged In: YES
    user_id=1651735
    Originator: YES

    Concerning %p, when I think of `implementation defined' I assume that the various routines in a given implementation will produce the same result. In this case since you are stuck with what Microsoft does for most of the routines I think for at least %p both snprintf and vsnprintf should produce results that match the other routines (start with 00 and use upper case hex). I agree that I don't care exactly what is printed as long as it is the same for the various routines.

    FYI cygwin and RHEL also use 0x instead of 0p (MinGW) or 00 (Microsoft).

    Like I said earlier I would prefer nan and inf for %e/f/g if possible.

     
  • Keith Marshall
    Keith Marshall
    2008-06-02

    Logged In: YES
    user_id=823908
    Originator: NO

    Ok. I think I understand your `%p' issue better now; the problem is that MinGW supports two disparate printf implementations, one of which aims to provide decent ANSI/POSIX compliance, but is currently incomplete, and one provided natively by MSVCRT, and the two are inconsistent in their representations of some data types.

    The GCC folks have noted that the MSVCRT implementation is broken, to the point of uselessness for their purposes. To address this, Danny Smith provided an alternative implementation, for snprintf() and vsnprintf() only, making these the default for these two function calls in MinGW; (if you want the native MSVCRT implementation, you must use the official Microsoft names when you call the functions, i.e. _snprintf() and _vsnprintf() respectively).

    Your problem arises because Danny's alternative implementation has not been extended to cover the entire family of printf() functions, so for example, when you call printf(), libmoldnames.a provides a redirection to the official Microsoft name, _printf(), and you get the MSVCRT implementation, but when you call snprintf(), the reference is satisfied directly by libmingwex.a, and you get the MinGW implementation. I propose an overhaul of Danny's alternative implementation, (which I will undertake), and which will extend it to work consistently for the entire printf() family, so that when you call any of the printf() functions, you will consistently get the MinGW alternative behaviour, and if you really want the MSVCRT behaviour, you must always use the official Microsoft function names, referring to the _printf() family.

     
  • Keith Marshall
    Keith Marshall
    2008-06-02

    Logged In: YES
    user_id=823908
    Originator: NO

    Just for the record, your hypothesis that the second character of MSVCRT's `%p' output is always `0' cannot be substantiated; in fact MSVCRT's `%p', in the absence of any explicit precision specification, appears to be equivalent to `%.8X', for:

    printf( "%p", 0x123456789abcdefLL );

    results in the output:

    89ABCDEF

     
  • Jacky Lai
    Jacky Lai
    2008-06-02

    Logged In: YES
    user_id=2105531
    Originator: NO

    Hello,

    I also found these problems when trying to run gnulib's snprintf test. I fixed them already and would like to contribute the fix. But give me 1 or 2 days to make up the fix again (I have added many other things to __mingw_snprintf now, and is too messy)

     
  • Cary R.
    Cary R.
    2008-06-02

    Logged In: YES
    user_id=1651735
    Originator: YES

    > Just for the record, your hypothesis that the second character
    > of MSVCRT's `%p' output is always `0' cannot be substantiated;

    OK, I did very little exploration of the underlying problem. My primary observation was that snprintf and fprint gave different results for %p.

    Extending this to the entire family of *printf routines would be wonderful!

    To work around these issues I have already forced our code to use the MSVCRT routines with all their problems by redefining both snprintf and vsnprintf to use the underscore version.

    I may have one more corner case issue that I need to build test code for.

     
  • Keith Marshall
    Keith Marshall
    2008-06-03

    Logged In: YES
    user_id=823908
    Originator: NO

    > I also found these problems when trying to run gnulib's snprintf test.
    > I fixed them already and would like to contribute the fix.

    Thanks; I'll be grateful for any contribution :-)

    > But give me 1 or 2 days to make up the fix again (I have added many
    > other things to __mingw_snprintf now, and is too messy)

    IMO, it has *always* been too messy. As well as correcting reported defects, the overhaul I had in mind would also tidy it up substantially, and document it with appropriate comments; (the current lack of those is atrocious, and completely unacceptable in released code, IMO).

    FWIW, I have a reworked implementation already mostly working for all but the floating point conversions, (and currently lacking the `*' forms for width and precision specs). This is already stream capable; (I have printf(), sprintf and snprintf() implemented already, and all the rest of the family will be easy, once the single function common to all, which does the actual formatting, is complete).

     
  • Jacky Lai
    Jacky Lai
    2008-06-03

    Logged In: YES
    user_id=2105531
    Originator: NO

    I posted the patch in:
    https://sourceforge.net/tracker/index.php?func=detail&aid=1983559&group_id=2435&atid=302435
    Please test it.

    I did not put the fix for infinity/nan representation at this moment. C89 do not mandate that. C99 requires "infinity"/"inf"/"nan"/"nan(xxxxxxxx)" (and their capitalization). MSVCRT does sth very different. What should we choose?

    %p is not fixed yet, since C99 does not mandate it. But 1 more line would be enough to match %#lx style

    > IMO, it has *always* been too messy. As well as correcting reported
    > defects, the overhaul I had in mind would also tidy it up substantially,
    > and document it with appropriate comments; (the current lack of those is
    > atrocious, and completely unacceptable in released code, IMO).

    I made it more messy by implementing other C99 features like %F, %a/%A, j/t/z/hh. I even try implementing %lc and %ls, though I can't test this.
    Perhaps I will make it in the 2nd patch.

    > FWIW, I have a reworked implementation already mostly working for all but
    > the floating point conversions, (and currently lacking the `*' forms for
    > width and precision specs).

    For floating point, I strongly suggest reusing current gdtoa routines. I remember someone report in mailing list that it is faster that the one in MSVCRT

     
  • Keith Marshall
    Keith Marshall
    2008-06-03

    Logged In: YES
    user_id=823908
    Originator: NO

    > I posted the patch in:
    https://sourceforge.net/tracker/index.php?func=detail&aid=1983559&group_id=2435&atid=302435

    > Please test it.

    Thanks. I'll have a look at it later.

    > I did not put the fix for infinity/nan representation at this
    > moment. C89 do not mandate that.

    > C99 requires "infinity"/"inf"/"nan"/"nan(xxxxxxxx)" (and their
    > capitalization). MSVCRT does sth very different. What should
    > we choose?

    I'm inclined to favour "inf"/"nan" and their capitalisations, as mandated by both C99 and POSIX. The libmingwex.a philosophy is to provide C99 standards conformance, where MSVCRT is broken.

    > %p is not fixed yet, since C99 does not mandate it. But 1 more
    > line would be enough to match %#lx style

    Yep. I currently have it, in my implementation, as an alias for `%#x', with any explicit field width or precision specifications honoured, and any data length modifiers ignored, (since it must explicitly cast from a (void *) type). That's what matches best to the representation I see on my GNU/Linux box, but it would be trivial to make it match any alternative representation we might choose.

    > I made it more messy by implementing other C99 features like %F,
    > %a/%A, j/t/z/hh.

    If the design is clean, it shouldn't need to become more messy, to implement those; indeed, my implementation also has `hh' working, and place holders for the rest, in what I consider to be a clean design.

    > I even try implementing %lc and %ls, though I can't test this.

    Even those can be handled cleanly, with a good design. Indeed, I plan to include them in my implementation. They can be tested by temporarily changing to a multibyte locale, and formatting some text intended for that locale, which has been converted to wchar_t using iconv; the problem is that the output in that locale may not mean much to you, but if written to a file, it should match the original source text.

    > For floating point, I strongly suggest reusing current
    > gdtoa routines.

    Yep. That is my intention, once I've made sense of how they should be called.

     
  • Cary R.
    Cary R.
    2008-06-03

    Logged In: YES
    user_id=1651735
    Originator: YES

    I noticed the mention of the z modifier. Adding that would further reduce my need for #ifdef __MINGW__ code. Passing a z modifier crashes the MSVCRT routines. I want to thank you both for the effort you are putting into fixing this! Once this is fixed our test suite should be close to 100% for the MinGW executable. Let me know when you have something available for testing.

     
  • Keith Marshall
    Keith Marshall
    2008-07-24

    • status: open --> pending
     
  • Jacky Lai
    Jacky Lai
    2008-07-24

    Logged In: YES
    user_id=2105531
    Originator: NO

    Yes, the libprintf.a has already addressed them

     
  • Logged In: YES
    user_id=1312539
    Originator: NO

    This Tracker item was closed automatically by the system. It was
    previously set to a Pending status, and the original submitter
    did not respond within 30 days (the time period specified by
    the administrator of this Tracker).

     
    • status: pending --> closed