Thread: Re: [SPAM-bayes] Re: [cx-oracle-users] float vs. int
From: Geoff G. <ge...@ge...> - 2003-10-06 19:18:04
Quoting Anthony Tuininga (an...@co...):
> The only possibility that I can think of right now is that they incur
> the overhead of checking if trunc(result) == result and returning an
> integer as a result. That has two problems with it, though. (1) the
> performance penalty and (2) the possibility of getting back an integer
> when you wanted a float

DCOracle and DCOracle2 both return a float in this (select count(*)) case.

For columns with a precision > 9 and a scale 0, DCOracle2 returns a Long by default, without comparing the actual value to sys.maxint. DCOracle compares the actual value and returns a Long only if an Int would overflow.

DCOracle does all its data conversion by poking around inside Oracle's data structures, I think. I don't know the OCI API very well but very few OCI calls are actually being made by DCOracle.

DCOracle2 seems to use slightly more of the OCI API, but the code is nowhere near as clean and comprehensible as the code in cx_Oracle.

I prefer the DCOracle behavior only because I have coded to it in the past. I think the DCOracle behavior makes reasonable sense, but the DCOracle2 behavior does, too. Of course, I've already voted... ;)

I don't deal with a lot of large numeric values, but I am a little concerned about FP "roundoff" and FP precision errors, and returning all values as strings is interesting but incurs more overhead than I'm willing to stomach.

Just hoping to add some points of comparison to the discussion.

Thanks,
--G.
-- 
Geoff Gerrietts             "Whenever people agree with me I always
<geoff at gerrietts net>     feel I must be wrong." --Oscar Wilde
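For readers comparing the policies described in the message above, here is a minimal sketch in modern Python. It is not code from DCOracle, DCOracle2, or cx_Oracle. All function names are hypothetical, `MAXINT` stands in for `sys.maxint` on a 32-bit Python 2 build, and because Python 3 no longer has separate int and long types, the helpers return the type name the old drivers would have produced. The "int" branch for precision 9 or less is an assumption; the message only states the precision > 9 case.

```python
import math

MAXINT = 2**31 - 1  # assumption: sys.maxint on a 32-bit Python 2 build


def narrow_if_integral(value: float):
    """The trunc(result) == result check: int if there is no fractional part.

    Assumes a finite value; math.trunc raises on inf and nan.
    """
    if math.trunc(value) == value:
        return int(value)
    return value


def dcoracle_style_type(value: int) -> str:
    """Inspect the actual value; 'long' only if a 32-bit int would overflow."""
    return "int" if -MAXINT - 1 <= value <= MAXINT else "long"


def dcoracle2_style_type(precision: int) -> str:
    """For scale-0 columns, decide from the declared precision alone."""
    return "long" if precision > 9 else "int"  # the <= 9 branch is assumed


# Problem (2) from the quoted text: a float column holding exactly 2.0
# comes back as an int even though the caller wanted a float.
print(type(narrow_if_integral(2.0)).__name__)
print(type(narrow_if_integral(2.5)).__name__)

# The same small value in a NUMBER(12) column is classified differently.
print(dcoracle_style_type(5), dcoracle2_style_type(12))
```

The value-based check costs a comparison per fetched value, while the precision-based check is decided once per column, which is the trade-off the thread is weighing.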
From: Anthony T. <an...@co...> - 2003-10-06 19:36:25
On Mon, 2003-10-06 at 13:17, Geoff Gerrietts wrote:
> Quoting Anthony Tuininga (an...@co...):
> > The only possibility that I can think of right now is that they incur
> > the overhead of checking if trunc(result) == result and returning an
> > integer as a result. That has two problems with it, though. (1) the
> > performance penalty and (2) the possibility of getting back an integer
> > when you wanted a float
>
> DCOracle and DCOracle2 both return a float in this (select count(*))
> case.

Ok. So someone else had the same problem and came up with the same solution. Good. :-)

> For columns with a precision > 9 and a scale 0, DCOracle2 returns a
> Long by default, without comparing the actual value to sys.maxint.
> DCOracle compares the actual value and returns a Long only if an Int
> would overflow.

Hmm. Python 2.3 will automatically convert to long when the value won't fit inside an integer. Python 2.2 and below will not, of course, but that could be checked by looking for an OverflowError and doing the right conversion otherwise. I think that makes a lot of sense. What do you think?

> DCOracle does all its data conversion by poking around inside Oracle's
> data structures, I think. I don't know the OCI API very well but very
> few OCI calls are actually being made by DCOracle.

I believe I am making the same ones. I didn't actually look but Oracle does provide the scale and precision of numbers and that is what I am checking.

> DCOracle2 seems to use slightly more of the OCI API, but the code is
> nowhere near as clean and comprehensible as the code in cx_Oracle.

Thanks. It would be interesting to note which parts of the OCI API DCOracle uses that cx_Oracle does not -- mostly out of curiosity but sometimes interesting enhancements turn up as a result... :-)

> I prefer the DCOracle behavior only because I have coded to it in the
> past. I think the DCOracle behavior makes reasonable sense, but the
> DCOracle2 behavior does, too. Of course, I've already voted... ;)

See the above. Does that make sense to you?

> I don't deal with a lot of large numeric values, but I am a little
> concerned about FP "roundoff" and FP precision errors, and returning
> all values as strings is interesting but incurs more overhead than I'm
> willing to stomach.

Floating point is always a problem in that regard. Thus the introduction of fixed point classes. But they have their own set of problems -- mostly that they are not intrinsic to a lot of other algorithms. Someone earlier posted a link to a fixed point class. I may do something a little less functional directly in C or provide a mechanism for registering a "fixed point" class for use by cx_Oracle in returning such things. Any thoughts on that?

> Just hoping to add some points of comparison to the discussion.

Thanks.

> Thanks,
> --G.

-- 
Anthony Tuininga
an...@co...

Computronix
Distinctive Software. Real People.
Suite 200, 10216 - 124 Street NW
Edmonton, AB, Canada T5N 4A3
Phone: (780) 454-3700
Fax: (780) 454-3838
http://www.computronix.com
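The round-off concern and the idea of registering a "fixed point" class can be illustrated with a short sketch. This is only a guess at what such a mechanism might look like, not cx_Oracle's API: `NumberConverter`, `register`, and `convert` are hypothetical names, and `decimal.Decimal` (which entered the standard library after this thread was written) stands in for whatever fixed point class a caller might register.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so sums that look exact on paper can fail an equality test.
print(0.1 + 0.2 == 0.3)
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))


class NumberConverter:
    """Hypothetical hook: turns a NUMBER's text form into a Python object."""

    def __init__(self, factory=float):
        # Default mirrors the float behavior discussed in the thread.
        self.factory = factory

    def register(self, factory):
        """Swap in any callable that accepts a decimal string."""
        self.factory = factory

    def convert(self, text: str):
        return self.factory(text)


converter = NumberConverter()
print(repr(converter.convert("1.10")))   # float by default

converter.register(Decimal)
print(repr(converter.convert("1.10")))   # registered class keeps the digits
```

Because the hook accepts any callable taking a string, the same sketch would work with a third-party fixed point class, which fits the "easy to use with other solutions" criterion raised later in the thread.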
From: Geoff G. <ge...@ge...> - 2003-10-06 20:41:29
Quoting Anthony Tuininga (an...@co...):
> > For columns with a precision > 9 and a scale 0, DCOracle2 returns a
> > Long by default, without comparing the actual value to sys.maxint.
> > DCOracle compares the actual value and returns a Long only if an Int
> > would overflow.
>
> Hmm. Python 2.3 will automatically convert to long when the value won't
> fit inside an integer. Python 2.2 and below will not, of course, but
> that could be checked by looking for an OverflowError and doing the
> right conversion otherwise. I think that makes a lot of sense. What do
> you think?

Yes, that goes along with what I think the principle of least surprise should dictate. The full unification in 2.3 is obviously the best of all worlds, but until everyone's on 2.3+, getting as close as possible seems like the "right" thing to do (and it has the advantage of continuing to be the right thing in 2.3 and beyond).

> > DCOracle does all its data conversion by poking around inside Oracle's
> > data structures, I think. I don't know the OCI API very well but very
> > few OCI calls are actually being made by DCOracle.
>
> I believe I am making the same ones. I didn't actually look but Oracle
> does provide the scale and precision of numbers and that is what I am
> checking.

The original DCOracle started life as a SWIG wrap of the OCI API. It was heavily hacked to produce a more-or-less compliant module.

As near as I can tell -- which is not too close, because I've spent only a couple of days spread over a couple of years looking at the Byzantine beast -- it copies data directly out of the OCI's mapped memory region and into some Pythonic data structures. Chances are good to middling that I am missing something key, but doing a grep for fairly standard calls (like "OCINumber.*") yields hits only in object files.

DCOracle2 I am even less familiar with. It does use the more regular calls, though.

> Floating point is always a problem in that regard. Thus the introduction
> of fixed point classes. But they have their own set of problems --
> mostly that they are not intrinsic to a lot of other algorithms. Someone
> earlier posted a link to a fixed point class. I may do something a
> little less functional directly in C or provide a mechanism for
> registering a "fixed point" class for use by cx_Oracle in returning such
> things. Any thoughts on that?

Floating point is pretty standard. Fixed point really isn't, though. I mean, there's the "standard" module Tim Peters wrote. I think I read once upon a time that Aahz was looking at tightening that up possibly for inclusion in the standard library but I honestly haven't followed it.

I do know that the CORBA IDL mapping to Python includes some kind of hijinx surrounding fixed point, too. To me, that looks like an opportunity for an implementation to "go rogue" and not conform to the standards. Integration with the DB layer seems like another one.

I'm not sure what the /right/ answer is to all the fixed point and floating point issues. I'm certainly not well-versed in the array of possible solutions, because to date I've been ducking my head and gambling that a float will be good enough. But I think any solution's value will be measured in part by how easy it is to use with other fixed point / floating point solutions, with "effortless" or "identical" being the ideal.

I think that's just a wordy way of saying "standards are good." :)

Thanks,
--G.
-- 
Geoff Gerrietts             "Me and my homies, we tag O.D.."
<geoff at gerrietts net>     --Unknown graffiti artist at a party
From: Anthony T. <an...@co...> - 2003-10-06 20:53:22
On Mon, 2003-10-06 at 14:40, Geoff Gerrietts wrote:
> Quoting Anthony Tuininga (an...@co...):
> > > For columns with a precision > 9 and a scale 0, DCOracle2 returns a
> > > Long by default, without comparing the actual value to sys.maxint.
> > > DCOracle compares the actual value and returns a Long only if an Int
> > > would overflow.
> >
> > Hmm. Python 2.3 will automatically convert to long when the value won't
> > fit inside an integer. Python 2.2 and below will not, of course, but
> > that could be checked by looking for an OverflowError and doing the
> > right conversion otherwise. I think that makes a lot of sense. What do
> > you think?
>
> Yes, that goes along with what I think the principle of least surprise
> should dictate. The full unification in 2.3 is obviously the best of
> all worlds, but until everyone's on 2.3+, getting as close as possible
> seems like the "right" thing to do (and it has the advantage of
> continuing to be the right thing in 2.3 and beyond).

Sounds good. If no one else objects, I'll implement that in the next release.

> > > DCOracle does all its data conversion by poking around inside Oracle's
> > > data structures, I think. I don't know the OCI API very well but very
> > > few OCI calls are actually being made by DCOracle.
> >
> > I believe I am making the same ones. I didn't actually look but Oracle
> > does provide the scale and precision of numbers and that is what I am
> > checking.
>
> The original DCOracle started life as a SWIG wrap of the OCI API. It
> was heavily hacked to produce a more-or-less compliant module.
>
> As near as I can tell -- which is not too close, because I've spent only
> a couple of days spread over a couple of years looking at the
> Byzantine beast -- it copies data directly out of the OCI's mapped
> memory region and into some Pythonic data structures. Chances are good
> to middling that I am missing something key, but doing a grep for
> fairly standard calls (like "OCINumber.*") yields hits only in object
> files.
>
> DCOracle2 I am even less familiar with. It does use the more regular
> calls, though.

I took a quick peek at the current code in DCOracle2. Apparently they don't use the OCINumber routines at all although they are found (with a grep) inside the binaries that they ship, which is rather strange. Instead, they parse the actual 22-byte value that is returned by Oracle and look at the precision and scale stored in that value. I have no intentions of going down that road! :-)

> > Floating point is always a problem in that regard. Thus the introduction
> > of fixed point classes. But they have their own set of problems --
> > mostly that they are not intrinsic to a lot of other algorithms. Someone
> > earlier posted a link to a fixed point class. I may do something a
> > little less functional directly in C or provide a mechanism for
> > registering a "fixed point" class for use by cx_Oracle in returning such
> > things. Any thoughts on that?
>
> Floating point is pretty standard. Fixed point really isn't, though. I
> mean, there's the "standard" module Tim Peters wrote. I think I read
> once upon a time that Aahz was looking at tightening that up possibly
> for inclusion in the standard library but I honestly haven't followed
> it.
>
> I do know that the CORBA IDL mapping to Python includes some kind of
> hijinx surrounding fixed point, too. To me, that looks like an
> opportunity for an implementation to "go rogue" and not conform to the
> standards. Integration with the DB layer seems like another one.
>
> I'm not sure what the /right/ answer is to all the fixed point and
> floating point issues. I'm certainly not well-versed in the array of
> possible solutions, because to date I've been ducking my head and
> gambling that a float will be good enough. But I think any solution's
> value will be measured in part by how easy it is to use with other
> fixed point / floating point solutions, with "effortless" or
> "identical" being the ideal.
>
> I think that's just a wordy way of saying "standards are good." :)

Up to this point I have used floating point where I know it will work and otherwise I have used strings. That works but it does leave a bad taste in one's mouth if one is expecting a more "transparent" solution. I'll give this a little more thought before I do anything drastic, though. If you (or anyone else) have any other ideas, please fire them this way.

> Thanks,
> --G.

-- 
Anthony Tuininga
an...@co...

Computronix
Distinctive Software. Real People.
Suite 200, 10216 - 124 Street NW
Edmonton, AB, Canada T5N 4A3
Phone: (780) 454-3700
Fax: (780) 454-3838
http://www.computronix.com
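To give a feel for what parsing the raw 22-byte NUMBER value involves, here is a rough sketch. It is not taken from DCOracle2's source. It assumes the commonly described internal layout (the one visible in the output of SQL's `DUMP()`): a leading sign-and-exponent byte followed by base-100 mantissa digits, with negative values stored complemented and usually terminated by a 102 byte. The function name is hypothetical and `decimal.Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal


def decode_oracle_number(raw: bytes) -> Decimal:
    """Decode Oracle's internal NUMBER bytes (layout assumed, see lead-in)."""
    if not raw:
        raise ValueError("empty NUMBER value")
    if raw == b"\x80":
        return Decimal(0)  # zero is stored as the single byte 0x80

    exp_byte = raw[0]
    positive = bool(exp_byte & 0x80)  # high bit set means positive

    if positive:
        exponent = exp_byte - 193          # base-100 exponent of first digit
        digits = [b - 1 for b in raw[1:]]  # digits are stored offset by +1
    else:
        exponent = 62 - exp_byte           # exponent byte is complemented
        mantissa = raw[1:]
        if mantissa and mantissa[-1] == 102:
            mantissa = mantissa[:-1]       # drop the terminator byte
        digits = [101 - b for b in mantissa]

    # Sum the base-100 digits, most significant first.
    value = Decimal(0)
    for i, digit in enumerate(digits):
        value += Decimal(digit) * Decimal(100) ** (exponent - i)
    return value if positive else -value


print(decode_oracle_number(bytes([194, 2, 24, 46])))
print(decode_oracle_number(bytes([62, 100, 102])))
```

Even this simplified version has several special cases, which gives some sense of why relying on Oracle's own number routines looks like the more maintainable road.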