From: Keith G. <kwg...@gm...> - 2006-11-01 20:33:14
I had a hard time tracing a bug in my code. The culprit was this
difference:

>> x
matrix([[True],
        [True],
        [True]], dtype=bool)

>> 1.0 - x
matrix([[ 0.],
        [ 0.],
        [ 0.]], dtype=float32)   <------- float32

>> 1.0*x
matrix([[ 1.],
        [ 1.],
        [ 1.]])                  <------- float64

>> numpy.__version__
'1.0rc1'

Any chance that 1.0 - x could return dtype = float64?
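One workaround is to cast explicitly before the arithmetic; for example
(a sketch, assuming the session above, not part of the original report):

>> y = x.astype(numpy.float64)   # force the dtype up front
>> (1.0 - y).dtype
dtype('float64')
>> (1.0 * y).dtype
dtype('float64')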
From: Travis O. <oli...@ee...> - 2006-11-01 21:43:51
Keith Goodman wrote:
> I had a hard time tracing a bug in my code. The culprit was this
> difference:
>
> >> 1.0 - x
> matrix([[ 0.],
>         [ 0.],
>         [ 0.]], dtype=float32)   <------- float32
>
> >> 1.0*x
> matrix([[ 1.],
>         [ 1.],
>         [ 1.]])                  <------- float64
>
> Any chance that 1.0 - x could return dtype = float64?

I'm surprised it doesn't. Both should follow basically the same
code-path. Perhaps there is a missing function loop or something.
I'll look into it.

-Travis
From: Travis O. <oli...@ee...> - 2006-11-01 21:51:44
Keith Goodman wrote:
> Any chance that 1.0 - x could return dtype = float64?

It looks like 1.0-x is doing the right thing.

The problem is 1.0*x for matrices is going to float64. For arrays it
returns float32, just like the 1.0-x.

This can't be changed at this point until 1.1.

We will fix the bug in 1.0*x producing float64, however. I'm still not
sure what's causing it, though.

-Travis
From: Robert K. <rob...@gm...> - 2006-11-01 21:57:58
Travis Oliphant wrote:
> It looks like 1.0-x is doing the right thing.
>
> The problem is 1.0*x for matrices is going to float64. For arrays it
> returns float32, just like the 1.0-x.

Why is this the right thing? Python floats are float64.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
From: Travis O. <oli...@ee...> - 2006-11-01 22:44:15
Robert Kern wrote:
> Why is this the right thing? Python floats are float64.

Yeah, why indeed. Must be something with the scalar coercion code...

-Travis
From: Tim H. <tim...@ie...> - 2006-11-02 03:17:19
Travis Oliphant wrote:
> Robert Kern wrote:
>> Why is this the right thing? Python floats are float64.
>
> Yeah, why indeed. Must be something with the scalar coercion code...

This is one of those things that pops up every few years. I suspect that
the best thing to do here is to treat 1.0, and all Python floats, as
having a kind (float) but no precision, or, equivalently, to treat them
as the smallest-precision floating point value. The rationale behind
this is that otherwise float32 arrays will be promoted whenever they are
multiplied by Python floating point scalars. If Python floats are
treated as float64 for purposes of determining output precision, then
anyone using float32 arrays is going to have to wrap all of their
literals in float32 to prevent inadvertent upcasting to float64. This
was the origin of the (rather clunky) numarray spacesaver flag.

It's no skin off my nose either way, since I pretty much never use
float32, but I suspect that treating Python floats equivalently to
float64 scalars would be a mistake. At the very least it deserves a bit
of discussion.

-tim
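For concreteness, this is the upcasting the current treatment avoids (a
sketch; the array here is made up for illustration):

>>> import numpy as np
>>> a = np.arange(3, dtype=np.float32)
>>> (2.5 * a).dtype   # a Python float scalar does not promote the array
dtype('float32')

If 2.5 counted as a full float64 scalar here, the result would come back
float64, and float32 users would have to write np.float32(2.5) for every
literal.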
From: Charles R H. <cha...@gm...> - 2006-11-02 03:36:50
On 11/1/06, Tim Hochberg <tim...@ie...> wrote:
> It's no skin off my nose either way, since I pretty much never use
> float32, but I suspect that treating Python floats equivalently to
> float64 scalars would be a mistake. At the very least it deserves a bit
> of discussion.

Well, I think that the present convention of having the array float type
determine the output type when doing a binary op with a scalar makes
sense. The question is what to do when the initial array is an integer
type and needs to be promoted. I could see:

1) coercing the scalar float to integer, which is probably consistent
   with the treatment of integer types (boo);

2) requiring explicit use of float types, i.e., float64(1.0), which is a
   bit clumsy;

3) promoting to float64 by default and expecting the user to specify
   float32(1.0) when needed.

I prefer 3, as float32 is probably not the most used data type. So the
rules would be as follows (see the sketch after this message):

numpy_int array + python_int  --  type numpy_int
numpy_int array + python_flt  --  type float64
numpy_int array + numpy_flt   --  type numpy_flt
numpy_flt array + python_flt  --  type numpy_flt

Seems a bit much to remember, but things always get complicated when you
want to control the types. Mind that going from int64 to float64 can
lead to loss of precision.

Chuck
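Spelled out as a session, those proposed rules would give the following
(hypothetical: this illustrates the proposal, not what numpy 1.0
actually does):

>>> import numpy as np
>>> ai = np.array([1, 2], dtype=np.int32)
>>> (ai + 3).dtype                # numpy_int array + python_int
dtype('int32')
>>> (ai + 3.0).dtype              # numpy_int array + python_flt
dtype('float64')
>>> (ai + np.float32(3.0)).dtype  # numpy_int array + numpy_flt
dtype('float32')
>>> af = np.ones(2, dtype=np.float32)
>>> (af + 3.0).dtype              # numpy_flt array + python_flt
dtype('float32')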
From: Scott R. <sr...@nr...> - 2006-11-02 04:55:50
On Wed, Nov 01, 2006 at 08:16:59PM -0700, Tim Hochberg wrote:
> If Python floats are treated as float64 for purposes of determining
> output precision, then anyone using float32 arrays is going to have to
> wrap all of their literals in float32 to prevent inadvertent upcasting
> to float64. This was the origin of the (rather clunky) numarray
> spacesaver flag.

I'm one of those people who made serious use of that clunky spacesaver
flag for precisely this reason. I deal with several-GB arrays of 32-bit
floats (or 32-bit x 2 complex numbers) on a regular basis. Having
automatic upcasting from scalar operations can be a royal pain.

Scott

--
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sr...@nr...                   Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989
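The memory cost is easy to quantify (a sketch with a small stand-in
array; a several-GB array scales the same way):

>>> import numpy as np
>>> a = np.zeros(10**6, dtype=np.float32)  # stand-in for a huge data array
>>> a.nbytes                               # 4 bytes per element
4000000
>>> (1.0 * a).nbytes       # scalar op keeps float32: same footprint
4000000
>>> a.astype(np.float64).nbytes            # what an automatic upcast costs
8000000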
From: Robert K. <rob...@gm...> - 2006-11-02 05:17:39
Tim Hochberg wrote:
> It's no skin off my nose either way, since I pretty much never use
> float32, but I suspect that treating Python floats equivalently to
> float64 scalars would be a mistake. At the very least it deserves a bit
> of discussion.

Well, they *are* 64-bit floating point numbers. You simply can't get
around that. That's why we now have all of the scalar types: you can get
any precision scalars that you want as long as you are explicit about it
(and explicit is better than implicit). The spacesaver flag was the only
solution before the various scalar types existed. I'd like to suggest
that the discussion already occurred some time ago and concluded in
favor of the scalar types. Downcasting should be explicit.

However, whether float32 arrays operated with Python float scalars give
float32 or float64 arrays is tangential to my question. Does anyone
actually think that a Python float operated with a boolean array should
give a float32 result? Must we *up*cast a boolean array to float64 to
preserve the precision of the scalar?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
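The scalar types make the intended precision explicit at the point of
use; for example (a sketch):

>>> import numpy as np
>>> a32 = np.ones(3, dtype=np.float32)
>>> (np.float32(1.5) * a32).dtype   # explicit 32-bit scalar, no upcast
dtype('float32')
>>> a32.astype(np.float64).dtype    # any down- or upcast is spelled out
dtype('float64')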
From: Charles R H. <cha...@gm...> - 2006-11-02 05:55:42
On 11/1/06, Robert Kern <rob...@gm...> wrote:
<snip>
> However, whether float32 arrays operated with Python float scalars give
> float32 or float64 arrays is tangential to my question. Does anyone
> actually think that a Python float operated with a boolean array should
> give a float32 result? Must we *up*cast a boolean array to float64 to
> preserve the precision of the scalar?

Probably doesn't matter most of the time, I suppose; who is going to
check? I tend to think doubles, because they are a bit faster on the x86
architecture and because they are a pretty common default.

Chuck
From: Travis O. <oli...@ee...> - 2006-11-02 15:48:47
Robert Kern wrote:
> However, whether float32 arrays operated with Python float scalars give
> float32 or float64 arrays is tangential to my question. Does anyone
> actually think that a Python float operated with a boolean array should
> give a float32 result? Must we *up*cast a boolean array to float64 to
> preserve the precision of the scalar?

The first basic rule is that scalars don't control the precision of the
output when doing mixed-type calculations *except* when they are of a
fundamentally different kind.

Then (if a different kind of scalar is used), the rule is that the
arrays will be upcast to the "lowest" precision in the group that still
preserves overall precision. So, when a bool is combined with a "float"
kind of scalar, the result is float32 because that preserves the
precision of the bool. Remember, it is array precision that takes
precedence over scalars in mixed-type array-scalar operations.

This is the rule. I agree that this rule is probably flawed in certain
circumstances.

So, what should be done about it at this point? Do you think a change
is acceptable for 1.0.1, or does it need to wait a year until 1.1?

-Travis
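Applied to the example that started the thread, the rule plays out like
this (a sketch of the numpy 1.0 behavior described here):

>>> import numpy as np
>>> b = np.array([True, True, True])
>>> (1.0 - b).dtype    # cross-kind: upcast, but only to the lowest float
dtype('float32')
>>> a32 = np.ones(3, dtype=np.float32)
>>> (1.0 - a32).dtype  # same kind: the array's precision takes precedence
dtype('float32')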
From: Tim H. <tim...@ie...> - 2006-11-02 16:15:33
Travis Oliphant wrote:
> The first basic rule is that scalars don't control the precision of the
> output when doing mixed-type calculations *except* when they are of a
> fundamentally different kind.
>
> This is the rule. I agree that this rule is probably flawed in certain
> circumstances.

I think any rule will be flawed in certain circumstances. This
particular rule has the advantage of being relatively straightforward,
and the circumstances I can think of where it could cause problems are
relatively limited and relatively easy to address. The obvious "fixes"
to this rule that I've thought of all have problems as bad as or worse
than the current rule, with the added disadvantage of being more
complicated.

At the very least, any replacement rule should get some serious
discussion here before being implemented. We should particularly solicit
the input of numarray users, since that package had more infrastructure
in place to support the use of lower-precision arrays.

> So, what should be done about it at this point? Do you think a change
> is acceptable for 1.0.1, or does it need to wait a year until 1.1?

Unless someone can come up with a convincingly better solution, I say
leave things as is indefinitely.

-tim
From: Charles R H. <cha...@gm...> - 2006-11-01 22:21:43
On 11/1/06, Robert Kern <rob...@gm...> wrote:
> Travis Oliphant wrote:
>> It looks like 1.0-x is doing the right thing.
>>
>> The problem is 1.0*x for matrices is going to float64. For arrays it
>> returns float32, just like the 1.0-x.
>
> Why is this the right thing? Python floats are float64.

Same question here. Float32 is a designer float for special occasions;
float64 is for everyday use.

Chuck
From: Keith G. <kwg...@gm...> - 2006-11-02 01:50:20
On 11/1/06, Travis Oliphant <oli...@ee...> wrote:
> It looks like 1.0-x is doing the right thing.
>
> The problem is 1.0*x for matrices is going to float64. For arrays it
> returns float32, just like the 1.0-x.
>
> This can't be changed at this point until 1.1.
>
> We will fix the bug in 1.0*x producing float64, however. I'm still not
> sure what's causing it, though.

I think it would be great if float64 were the default in numpy. That way
most people wouldn't have to worry about dtypes when crunching numbers.
And then numpy could apply for a trademark on 'it just works'.

Having to worry about dtypes makes users (me) nervous.

I imagine a change like this would not be an overnight change, more of a
long-term goal.

This one, from a previous thread, also makes me nervous:

>> sum(M.ones((300,1)) == 1)
matrix([[44]], dtype=int8)

But float64 might not make sense here.
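The 44 is int8 wraparound: the comparison yields a boolean matrix, the
sum accumulated in int8, and 300 mod 256 = 44. A sketch of the same
arithmetic with a plain array, forcing the narrow accumulator explicitly:

>>> import numpy as np
>>> (np.ones((300, 1)) == 1).sum(dtype=np.int8)   # wraps: 300 - 256
44
>>> (np.ones((300, 1)) == 1).sum(dtype=np.int64)  # wide accumulator is safe
300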
From: Charles R H. <cha...@gm...> - 2006-11-02 03:19:46
On 11/1/06, Keith Goodman <kwg...@gm...> wrote:
> This one, from a previous thread, also makes me nervous:
>
> >> sum(M.ones((300,1)) == 1)
> matrix([[44]], dtype=int8)

That one seems to be fixed:

In [1]: sum(ones((300,1)) == 1)
Out[1]: 300

In [2]: (ones((300,1)) == 1).sum()
Out[2]: 300

The matrix version also returns a numpy scalar, however:

In [20]: sum(matrix(ones((300,1)) == 1))
Out[20]: 300

I wonder if that is expected?

Chuck