#108 strtoi_c() bug fix

0.61
open-later
None
2
2011-06-26
2005-05-18
Jose Da Silva
No

These modifications should allow strtoi_c() to correctly
return a long value between -2billion and +2billion.

The current routine ignores overflow and will return an
incorrect value if a very large number is present, for
example: 12345678901234567890.

The current routine will not return a correct negative value
as written right now.

Another problem is that there may not be a number
present, but the current routine modifies endptr if
ascii spaces are found or if a sign -/+ value without a
number.

With the new suggestions, now endptr will only point to
correct endptr if a number is found, otherwise endptr points
remains at beginning of str.
This allows user to perform a test for errors:
value = strtoi_c(x, &y);
if (x == y) then error;

...or to retry test using a more powerful routine:
value = strtoi_c(x, &y);
if (x == y) then extralongvalue = strtoextralongi_c(x, &y);

The attached diff file is based on latest strtonum.cpp in
CVS (may17,2005)

--------suggested updated routine (below)-------
long strtoi_c(const char * npter, const char ** endptr) {

char * str = (char*)npter;
long num = 0;
char sign = 0;

*endptr = str;
while (asc_isspace(*str)) {
str++;
}
if (*str == '-' || *str == '+') {
sign = *(str++) == '-' ? -1 : 0;
}
while (*str >= '0' && *str <= '9') {
num = num * 10 + (long)(*str - '0');
if (num < 0) {
*endptr = (char*)npter;
return -1;
}
*endptr = ++str;
}
if (sign) num = -num;
return num;
}

Discussion

  • Jose Da Silva
    Jose Da Silva
    2005-05-18

    Logged In: YES
    user_id=1138138

    x.c is a mini test routine to try out the various conditions without
    having to completely recompile aspell, for example:
    str1 < -2billion,
    str1 > +2billion,
    str1 == "\t\r +";

    to compile:
    gcc -o x x.c

    NOTE: I noticed that:
    long strtoi_c(const char * npter, char ** endptr) {
    appears to agree better with the compiler versus:
    long strtoi_c(const char * npter, const char ** endptr) {

     
  • Jose Da Silva
    Jose Da Silva
    2005-05-18

    Logged In: YES
    user_id=1138138

    Added "if" test to verify a number truly exists beyond -/+ so that
    testing for "\t\r +" will test as false.... basically added:
    + if (*str < '0' || *str > '9')
    + return -1;

    Also submitted as "diff -u" instead of "diff -b -u"

     
  • Kevin Atkinson
    Kevin Atkinson
    2005-05-20

    • priority: 5 --> 2
    • assigned_to: nobody --> kevina
    • status: open --> open-later
     
  • Jose Da Silva
    Jose Da Silva
    2007-10-20

    Logged In: YES
    user_id=1138138
    Originator: YES

    bug 1204046 is for a function which only expects +ve values and should be closed or deleted if you want this strtoi_c() function to also accept negative values as well.
    That should close 2 patches (this one and 124046).
    If you want to keep 124046, this strtoi_c() function needs a method to return an error.

    I noticed an error in the patch I sent, therefore I'm attaching a new patch.
    The new patch simply returns 0 if num overflows past 'long'.
    The patch is against 1.4 in CVS:
    http://cvs.savannah.gnu.org/viewvc/aspell/aspell/common/strtonum.cpp?revision=1.4&view=markup

    File Added: strtonum.cpp.diff

     
  • Kevin Atkinson
    Kevin Atkinson
    2007-10-20

    Logged In: YES
    user_id=6591
    Originator: NO

    strtoi_c should follow the sementatics of strtol when the locale is "C" and the base is 10. (Not sure why I named it strtoi_c instead of strtol_c).

    Please figure out what they are and send me a patch to that effect.

     
  • Jose Da Silva
    Jose Da Silva
    2007-10-20

    Logged In: YES
    user_id=1138138
    Originator: YES

    > strtoi_c should follow the sementatics of strtol when the locale is "C"
    > and the base is 10.

    May need more thought - eg. suppose you spell-check German while locale is French.
    New patch includes MIN & MAX, don't know how you want to convey error considering the calling routines don't check for errors yet.

    > (Not sure why I named it strtoi_c instead of strtol_c).

    A lot of developers assumed that 32bit was going to stay - then came 64bit and now you need to check all your references to ints and use char, short, long or longlong.
    char sign should help for register-starved CPUs like pre-pentiums and embedded-type-CPUs.

    http://www.cplusplus.com/reference/clibrary/cstdlib/strtol.html
    On success, the function returns the converted integral number as a long int value.
    If no valid conversion could be performed, a zero value is returned.
    If the correct value is out of the range of representable values, LONG_MAX or LONG_MIN is returned, an the global variable errno is set to ERANGE.

    http://linux.about.com/library/cmd/blcmdl3_strtoll.htm
    appears to mention a EINVAL
    File Added: strtonum.cpp.diff

     
  • Kevin Atkinson
    Kevin Atkinson
    2007-10-20

    Logged In: YES
    user_id=6591
    Originator: NO

    > May need more thought - eg. suppose you spell-check German while locale is
    > French.

    My version is suppose to be locale independent! Ie the default or "C" locale

    When I said follow the same semantics of the strtol I meant it, thus use errno.

     
  • Jose Da Silva
    Jose Da Silva
    2007-10-22

    diff -rub aspell-0.61 for strtoi_c()

     
    Attachments
  • Jose Da Silva
    Jose Da Silva
    2007-10-22

    Logged In: YES
    user_id=1138138
    Originator: YES

    >My version is suppose to be locale independent! Ie the default or "C" locale

    Okay, makes more sense. it looked somewhat tempting to simply do a call to strtol

    >When I said follow the same semantics of the strtol I meant it, thus use errno.

    Some documentation definitions: <errno.h>
    http://www.opengroup.org/onlinepubs/009695399/basedefs/errno.h.html
    It appears EINVAL and ERANGE should be given.

    I know if you pass parameters through the stack, it is thread safe, I'm not too sure to trust passing parameters via some other method, so modified routine to pass values through the stack .... strtoi_c(startptr,endptr,errno)

    Looking at glib, strtol has "char** endptr" but not "const char** endptr".
    I modified the routine to also use char** as done in glib's strtol() function.

    Compiles okay, but still needs to be tested and right now, not sure how, or have time to try yet. Let me know if looks okay - if you want to work with it /modify it, sure, go ahead.

    You'll notice the test for out of range is divided in 2 parts. If you use char as a value range of 0...255 or -128...+127 then you see:
    (num < 0) checks positive values (0...127) and negative values (-127..0) but can't test for -128, to test -128, you need to do (num-1 < 0) for the negative range to include -128..-1
    concept should work similar for "long int"

    File Added: strtonum.cpp.diff

     
  • Kevin Atkinson
    Kevin Atkinson
    2007-10-22

    Logged In: YES
    user_id=6591
    Originator: NO

    Once Again: When I said follow the same semantics of the strtol I meant it

    Using errno is thread safe, otherwise lots of other things will break.

     
  • Kevin Atkinson
    Kevin Atkinson
    2011-06-26

    • milestone: --> 0.61