
#2527 Oddities with auto-promotion of strings to numeric values

Milestone: None
Status: closed-fixed
Owner: nobody
Labels: None
Updated: 2022-07-11
Created: 2022-06-05
Private: No

All of these can be explained by the behaviour of the underlying C library routines atof() and atoll(). These library functions (1) do not recognize octal values, and (2) return 0 rather than reporting an error. Nevertheless one might ask for better consistency from gnuplot's user interface.

gnuplot> print 077
63
gnuplot> print int(077)
63
gnuplot> print int("077")
77
gnuplot> print real("077")
77.0
gnuplot> print int("")   # empty string
         Non-numeric string found where a numeric expression was expected
gnuplot> print int(" ")  # space characters
0
gnuplot> print int("  ") # tab and space
         Non-numeric string found where a numeric expression was expected
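The C-level behaviour behind the string cases can be reproduced with a minimal sketch. The helper names `naive_int` and `naive_real` are hypothetical and are not taken from gnuplot's source; they only wrap the two library calls named above. The errors gnuplot itself prints for some inputs come from gnuplot, not from these calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: atoll() always reads base 10 and has no way
 * to report that nothing was converted; it simply returns 0. */
long long naive_int(const char *s)
{
    assert(s != NULL);
    return atoll(s);
}

/* Hypothetical helper: atof() likewise reads a leading 0 as a plain
 * decimal digit and reports no conversion errors. */
double naive_real(const char *s)
{
    assert(s != NULL);
    return atof(s);
}
```

With these helpers, `naive_int("077")` yields 77 rather than 63, and both `naive_int("")` and `naive_int(" ")` yield 0 without any error indication.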

Discussion

  • Ethan Merritt

    Ethan Merritt - 2022-06-07

    I found additional problematic cases. Large hexadecimal constants (too large to be represented exactly as a double-precision floating-point value) were subject to loss of precision.

    A fix is now pending for both 5.4 and 5.5.

    Here is a table of old + new behavior for various cases.

    =====================================================
             command                old     new
    =====================================================
    gnuplot> print 077              63      63
    gnuplot> print int(077)         63      63
    gnuplot> print int("077")       77      63
    gnuplot> print real("077")      77      63.
    gnuplot> print int("1e2")       100     "Trailing characters after numeric"
    gnuplot> print int("1f2")       1       "Trailing characters after numeric"
    gnuplot> print int("1.e2")      100     100
    gnuplot> print int("1.f2")      1       "Trailing characters after numeric"
    
    gnuplot> print int("")          error   "Non-numeric string found"
    gnuplot> print int(" ")         0       "Non-numeric string found"
    
    gnuplot> print      0x3fffffffffffffff   4611686018427387903    4611686018427387903
    gnuplot> print int("0x3fffffffffffffff") 4611686018427387904    4611686018427387903
    gnuplot> print      0x7fffffffffffffff   9223372036854775807    9223372036854775807
    gnuplot> print int("0x7fffffffffffffff") NaN                    9223372036854775807
    
    still maybe surprising
    ----------------------
    gnuplot> print int("Inf")       NaN     NaN
    gnuplot> print int("NaN")       "undefined"     "undefined"
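The octal and large-hexadecimal rows can be illustrated at the C level. This is a sketch only, assuming a C99-conforming `strtod()` that accepts hexadecimal input; the helper names `via_double` and `via_integer` are hypothetical and do not describe how gnuplot implements the fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse as a double first, then truncate.
 * A double has a 53-bit significand, so large integers are rounded.
 * Do not pass values that round to 2^63 or more: converting those
 * to long long is undefined behaviour. */
long long via_double(const char *s)
{
    assert(s != NULL);
    return (long long)strtod(s, NULL);
}

/* Hypothetical helper: parse directly as an integer. Base 0 makes
 * strtoll() honour a 0x prefix (hex) and a leading 0 (octal). */
long long via_integer(const char *s)
{
    assert(s != NULL);
    return strtoll(s, NULL, 0);
}
```

Here `via_double("0x3fffffffffffffff")` gives 4611686018427387904 (the value rounded up to 2^62), while `via_integer` returns the exact 4611686018427387903 and also reads "077" as 63.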
    
     

    Last edit: Ethan Merritt 2022-06-09
  • Ethan Merritt

    Ethan Merritt - 2022-06-09
    • status: open --> pending-fixed
     
  • theozh

    theozh - 2022-07-06

    Just out of curiosity... why do you want to change the old string-to-int behavior in the following cases?

    gnuplot> print int("1e2")       100     "Trailing characters after numeric"
    gnuplot> print int("1f2")       1       "Trailing characters after numeric"
    gnuplot> print int("1.f2")      1       "Trailing characters after numeric"
    

    In some cases it might be convenient (and would not hurt anybody) to follow this premise: if possible, make an integer out of the string, and throw an error only if that is not possible.

     
    • Ethan Merritt

      Ethan Merritt - 2022-07-07

      I was going with the idea that a string should not be auto-promoted to a number if it doesn't look like a valid number. Using print int("foo") as a template to generate examples may have been a bad idea, since it misleadingly looks as if it might be accepting only integers, whereas actually it first tries to promote the string to a numerical value and then passes that value as a parameter to int() for conversion to an integer.

      These are maybe better examples:

         gnuplot command       old        new
         print 0 + "1f2"       1.0        "Trailing characters after numeric expression"
         print 0 + "1e2"       100.0      "Trailing characters after numeric expression"
         print 0 + "1e2f3g"    100.0      "Trailing characters after numeric expression"
      

      Admittedly this does not correspond exactly to what is available in C. The clib routines in the strtol() family are willing to try multiple leading substrings and take the longest one that constitutes a valid numerical string. But they allow you to test explicitly for trailing junk characters, or not, as you choose.
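The choice described here, accepting the longest valid prefix or checking for trailing junk, comes down to whether the caller inspects the end pointer. The following is a minimal sketch with hypothetical helper names (`parse_lenient`, `parse_strict`). Allowing trailing whitespace in the strict variant is an assumption made for the example. The sketch shows only the C mechanism and does not reproduce gnuplot's new behavior row for row (for instance, it accepts "1e2").

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: accept the longest valid numeric prefix and
 * ignore whatever follows it. Fails only if nothing was converted. */
bool parse_lenient(const char *s, double *out)
{
    char *end;
    assert(s != NULL && out != NULL);
    *out = strtod(s, &end);
    return end != s;
}

/* Hypothetical helper: same conversion, but reject the string if
 * anything other than whitespace follows the number. */
bool parse_strict(const char *s, double *out)
{
    char *end;
    assert(s != NULL && out != NULL);
    *out = strtod(s, &end);
    if (end == s)
        return false;               /* no digits were consumed */
    while (isspace((unsigned char)*end))
        end++;                      /* assumption: trailing blanks are fine */
    return *end == '\0';            /* anything else is trailing junk */
}
```

For the input "1e2f3g", `parse_lenient` succeeds with the value 100.0, whereas `parse_strict` fails because "f3g" remains after the number. Both reject an empty or all-blank string.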

      I suppose gnuplot could offer some global setting that controls whether to accept or reject promotion of strings with trailing junk, but that seems like a rather arcane addition.

      Or the routines int() and real() could be made special cases that bypass the string->number auto-promotion and make their own decision about what to accept.

      It would be useful to have a real world example of where any of this might be useful.

       
  • theozh

    theozh - 2022-07-07

    I had the example below in mind, but maybe this is a different case because gnuplot always tries to make a floating-point number out of data? And with real(), will trailing "junk" always(?) be ignored? But why make a difference for int()?

    reset session
    
    $Data <<EOD
     1   20.0°C   50%r.h.
     2   25.3°C   55%r.h.
     3   30.4°C   40%r.h.
     4   37.5°C   35%r.h.
    EOD
    
    set ytics nomirror
    set y2tics
    
    plot $Data u 1:2 w lp, '' u 1:3 axis x1y2 w lp
    

    Am I right to assume that something like this (admittedly not a good real-world example)

    plot $Data u 1:(real(strcol(2))) w lp, '' u 1:(int(strcol(3))) axis x1y2 w lp
    

    would not work anymore?

     
    • Ethan Merritt

      Ethan Merritt - 2022-07-07

      There is nothing special about int(). The expression evaluation stack applies auto-promotion of types to all function parameters and operands. I.e. in a numeric expression strings are converted to numbers, and in floating-point context integers are promoted to floating point.
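The order of operations described here, promotion first and the function body second, can be sketched with a toy tagged value. Everything in this block is hypothetical (`struct value`, `promote_to_real`, `builtin_int`, `add_real`) and is far simpler than gnuplot's actual evaluation stack. The string conversion is deliberately lenient and unvalidated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tagged value; gnuplot's internal types differ. */
enum kind { K_INT, K_REAL, K_STRING };

struct value {
    enum kind kind;
    union {
        long long i;
        double r;
        const char *s;
    } u;
};

/* Promotion step shared by all numeric operators and functions:
 * strings become numbers, integers become floating point. */
double promote_to_real(const struct value *v)
{
    assert(v != NULL);
    switch (v->kind) {
    case K_INT:    return (double)v->u.i;
    case K_REAL:   return v->u.r;
    case K_STRING: return strtod(v->u.s, NULL);
    }
    return 0.0; /* not reached for valid kinds */
}

/* int(x): an integer passes through unchanged; anything else is
 * promoted to a number first and only then truncated. */
long long builtin_int(const struct value *arg)
{
    assert(arg != NULL);
    if (arg->kind == K_INT)
        return arg->u.i;
    return (long long)promote_to_real(arg);
}

/* a + b in floating-point context uses the same promotion step. */
double add_real(const struct value *a, const struct value *b)
{
    return promote_to_real(a) + promote_to_real(b);
}
```

In this toy model a string argument reaches `builtin_int` and `add_real` through the same `promote_to_real` step, so any rule about which strings count as numbers applies equally to both.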

       
  • Ethan Merritt

    Ethan Merritt - 2022-07-11
    • Status: pending-fixed --> closed-fixed
     
