From: Eric B. <er...@go...> - 2007-02-17 13:32:59
Colin Paul Adams wrote:
> And I thought about the use case I have, which is parsing
> numbers. The tokenizer has already created tokens one of which is
> certified to consist entirely of decimal digits (there may be a
> preceding token consisting of a minus sign).
> In this use case the pre-condition "is_integer" is guaranteed true.
>
> So I tried to think about other use cases for "is_integer_64". All I
> could think of was parsing, and I presumed therefore that the
> pre-condition would always be satisfied.

I still believe that a routine in KL_STRING_ROUTINES should not
make any assumption as to whether the string comes from a
parser or not. If you don't want to parse the string twice, then
ask your tokenizer to distinguish between numbers that fit into
INTEGER_64 and those that don't.

And if this appears to be too complicated, then there is another
solution. If I understood correctly, the whole idea is to improve
performance by using INTEGER_64 instead of MA_DECIMAL whenever
possible. But should you use INTEGER_64 every time it is possible,
or use a heuristic which is even faster than your current
implementation of `is_integer_64' and works most of the time?
A possible heuristic, for example, is just to check the number of
characters in your string:

    if my_string.count < 19 then
        use INTEGER_64
    else
        use MA_DECIMAL
    end

This check is just a heuristic. It will send some numbers to
MA_DECIMAL even though they fit into INTEGER_64: strings of 19 or
more characters that are padded with leading zeros, and numbers
between 1000000000000000000 and 9223372036854775807. But it's
super-fast. So it might be worth using that, even if for some
numbers you will use MA_DECIMAL instead of INTEGER_64. But hey,
we have to make trade-offs and stop somewhere. For example, in your
current implementation there are numbers that could fit into a
NATURAL_64 but are handled as MA_DECIMAL because they don't fit
into INTEGER_64.

--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
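
For illustration only, the length check suggested above could be wrapped in a query along these lines. This is a minimal sketch, not code from Gobo: the class name NUMBER_KIND_SKETCH and the feature name `fits_integer_64_heuristic' are hypothetical, and it assumes that the tokenizer has already guaranteed a non-empty string of decimal digits with any minus sign held in a separate token.

```eiffel
class
	NUMBER_KIND_SKETCH
		-- Hypothetical helper class, for illustration only

feature -- Status report

	fits_integer_64_heuristic (a_digits: STRING): BOOLEAN
			-- Is `a_digits' short enough that its value is certain
			-- to fit into INTEGER_64?
			-- May answer False for values that do fit: 19-digit
			-- values up to 9223372036854775807, and longer strings
			-- padded with leading zeros.
		require
			a_digits_not_void: a_digits /= Void
			a_digits_not_empty: not a_digits.is_empty
			-- Assumed, not checked here: `a_digits' contains
			-- decimal digits only (guaranteed by the tokenizer).
		do
				-- At most 18 digits means at most 999999999999999999,
				-- which is below the INTEGER_64 maximum.
			Result := a_digits.count < 19
		end

end
```

A caller would then presumably convert with something like `to_integer_64' on the string when the query answers True, and fall back to MA_DECIMAL otherwise. The exact conversion and creation routines depend on the library versions in use, so they are left out of the sketch.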