These modifications should allow strtoi_c() to correctly
return a long value between -2billion and +2billion.
The current routine ignores overflow and will return an
incorrect value if a very large number is present, for
example: 12345678901234567890.
The current routine will not return a correct negative value
as written right now.
Another problem is that there may not be a number
present, but the current routine modifies endptr if
ascii spaces are found or if a sign -/+ value without a
number.
With the new suggestions, now endptr will only point to
correct endptr if a number is found, otherwise endptr points
remains at beginning of str.
This allows user to perform a test for errors:
value = strtoi_c(x, &y);
if (x == y) then error;
...or to retry test using a more powerful routine:
value = strtoi_c(x, &y);
if (x == y) then extralongvalue = strtoextralongi_c(x, &y);
The attached diff file is based on latest strtonum.cpp in
CVS (may17,2005)
--------suggested updated routine (below)-------
long strtoi_c(const char * npter, const char ** endptr) {
char * str = (char*)npter;
long num = 0;
char sign = 0;
*endptr = str;
while (asc_isspace(*str)) {
str++;
}
if (*str == '-' || *str == '+') {
sign = *(str++) == '-' ? -1 : 0;
}
while (*str >= '0' && *str <= '9') {
num = num * 10 + (long)(*str - '0');
if (num < 0) {
*endptr = (char*)npter;
return -1;
}
*endptr = ++str;
}
if (sign) num = -num;
return num;
}
Logged In: YES
user_id=1138138
x.c is a mini test routine to try out the various conditions without
having to completely recompile aspell, for example:
str1 < -2billion,
str1 > +2billion,
str1 == "\t\r +";
to compile:
gcc -o x x.c
NOTE: I noticed that:
long strtoi_c(const char * npter, char ** endptr) {
appears to agree better with the compiler versus:
long strtoi_c(const char * npter, const char ** endptr) {
Logged In: YES
user_id=1138138
Added "if" test to verify a number truly exists beyond -/+ so that
testing for "\t\r +" will test as false.... basically added:
+ if (*str < '0' || *str > '9')
+ return -1;
Also submitted as "diff -u" instead of "diff -b -u"
Logged In: YES
user_id=1138138
Originator: YES
bug 1204046 is for a function which only expects +ve values and should be closed or deleted if you want this strtoi_c() function to also accept negative values as well.
That should close 2 patches (this one and 124046).
If you want to keep 124046, this strtoi_c() function needs a method to return an error.
I noticed an error in the patch I sent, therefore I'm attaching a new patch.
The new patch simply returns 0 if num overflows past 'long'.
The patch is against 1.4 in CVS:
http://cvs.savannah.gnu.org/viewvc/aspell/aspell/common/strtonum.cpp?revision=1.4&view=markup
File Added: strtonum.cpp.diff
Logged In: YES
user_id=6591
Originator: NO
strtoi_c should follow the sementatics of strtol when the locale is "C" and the base is 10. (Not sure why I named it strtoi_c instead of strtol_c).
Please figure out what they are and send me a patch to that effect.
Logged In: YES
user_id=1138138
Originator: YES
> strtoi_c should follow the sementatics of strtol when the locale is "C"
> and the base is 10.
May need more thought - eg. suppose you spell-check German while locale is French.
New patch includes MIN & MAX, don't know how you want to convey error considering the calling routines don't check for errors yet.
> (Not sure why I named it strtoi_c instead of strtol_c).
A lot of developers assumed that 32bit was going to stay - then came 64bit and now you need to check all your references to ints and use char, short, long or longlong.
char sign should help for register-starved CPUs like pre-pentiums and embedded-type-CPUs.
http://www.cplusplus.com/reference/clibrary/cstdlib/strtol.html
On success, the function returns the converted integral number as a long int value.
If no valid conversion could be performed, a zero value is returned.
If the correct value is out of the range of representable values, LONG_MAX or LONG_MIN is returned, an the global variable errno is set to ERANGE.
http://linux.about.com/library/cmd/blcmdl3_strtoll.htm
appears to mention a EINVAL
File Added: strtonum.cpp.diff
Logged In: YES
user_id=6591
Originator: NO
> May need more thought - eg. suppose you spell-check German while locale is
> French.
My version is suppose to be locale independent! Ie the default or "C" locale
When I said follow the same semantics of the strtol I meant it, thus use errno.
diff -rub aspell-0.61 for strtoi_c()
Logged In: YES
user_id=1138138
Originator: YES
>My version is suppose to be locale independent! Ie the default or "C" locale
Okay, makes more sense. it looked somewhat tempting to simply do a call to strtol
>When I said follow the same semantics of the strtol I meant it, thus use errno.
Some documentation definitions: <errno.h>
http://www.opengroup.org/onlinepubs/009695399/basedefs/errno.h.html
It appears EINVAL and ERANGE should be given.
I know if you pass parameters through the stack, it is thread safe, I'm not too sure to trust passing parameters via some other method, so modified routine to pass values through the stack .... strtoi_c(startptr,endptr,errno)
Looking at glib, strtol has "char** endptr" but not "const char** endptr".
I modified the routine to also use char** as done in glib's strtol() function.
Compiles okay, but still needs to be tested and right now, not sure how, or have time to try yet. Let me know if looks okay - if you want to work with it /modify it, sure, go ahead.
You'll notice the test for out of range is divided in 2 parts. If you use char as a value range of 0...255 or -128...+127 then you see:
(num < 0) checks positive values (0...127) and negative values (-127..0) but can't test for -128, to test -128, you need to do (num-1 < 0) for the negative range to include -128..-1
concept should work similar for "long int"
File Added: strtonum.cpp.diff
Logged In: YES
user_id=6591
Originator: NO
Once Again: When I said follow the same semantics of the strtol I meant it
Using errno is thread safe, otherwise lots of other things will break.
This issue has moved to GitHub: https://github.com/GNUAspell/aspell/issues/290