From: Sebastian H. <ha...@ms...> - 2006-08-25 00:09:51
|
Hi,

I get

TypeError: array cannot be safely cast to required type

when calling hstack() (which calls concatenate()) on two arrays that are int32 and float32 respectively.

I understand now that an int32 cannot be safely converted into a float32, but why does concatenate not automatically upcast to float64? Does this really have to be done *explicitly* every time? In general it makes float32 cumbersome to use.

(Background: my large image data is float32 (float64 would require too much memory), and the hstack call happens inside the scipy plt module when I try to get a 1d line profile and the "y_data" is hstack'ed with the x-axis values (int32).)

Thanks,
Sebastian Haase
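Why the int32 to float32 cast counts as unsafe can be sketched with the standard library alone. This is only an illustration, not NumPy's own check; the helper name `roundtrip_float32` is made up for the example. float32 has a 24-bit significand, so not every int32 value survives the conversion.

```python
import struct


def roundtrip_float32(value: int) -> float:
    """Store `value` as an IEEE-754 single-precision float and read it back."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


# Integers up to 2**24 fit exactly into the 24-bit significand of float32.
exact = 2**24
# The next integer is still far inside the int32 range, but no longer fits.
inexact = 2**24 + 1

exact_survives = roundtrip_float32(exact) == exact
inexact_survives = roundtrip_float32(inexact) == inexact

# A Python float is a double (53-bit significand), enough for any int32 value.
float64_survives = float(2**31 - 1) == 2**31 - 1

print(exact_survives, inexact_survives, float64_survives)
```

float64 is therefore the smallest floating-point type that can hold both operands without loss, which is what the ufunc rules pick.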
From: Travis O. <oli...@ee...> - 2006-08-25 00:28:21
|
Sebastian Haase wrote:
> Hi,
> I get
> TypeError: array cannot be safely cast to required type
>
> when calling hstack() ( which calls concatenate() )
> on two arrays being a int32 and a float32 respectively.
>
> I understand now that a int32 cannot be safely converted into a float32
> but why does concatenate not automatically
> up(?) cast to float64 ??

Basically, NumPy is following Numeric's behavior of raising an error in this case of unsafe casting in concatenate. For functions that are not universal-function objects, mixed-type behavior works basically just like Numeric did (using the ordering of the types to determine which one to choose as the output).

It could be argued that the ufunc-rules should be followed instead.

-Travis
From: Sebastian H. <ha...@ms...> - 2006-08-25 00:47:10
|
On Thursday 24 August 2006 17:28, Travis Oliphant wrote:
> Basically, NumPy is following Numeric's behavior of raising an error in
> this case of unsafe casting in concatenate. For functions that are not
> universal-function objects, mixed-type behavior works basically just
> like Numeric did (using the ordering of the types to determine which one
> to choose as the output).
>
> It could be argued that the ufunc-rules should be followed instead.
>
> -Travis

Are you saying the ufunc-rules would convert "int32-float32" to float64 and hence make my code "just work"!?

And why are there two sets of rules? Are the Numeric rules used in many places?

Thanks,
Sebastian Haase
From: Travis O. <oli...@ie...> - 2006-08-25 03:07:07
|
Sebastian Haase wrote:
> Are you saying the ufunc-rules would convert "int32-float32" to float64 and
> hence make my code "just work" !?

Yes, that's what I'm saying (but you would get float64 out; if you didn't want that, you would have to be specific).

> And why are there two sets of rules ?

Because there are two modules (multiarray and umath) where the functionality is implemented.

> Are the Numeric rules used at many places ?

Not that many. I did abstract the notion to a C-API, PyArray_ConvertToCommonType, and implemented the scalars-don't-cause-upcasting part of the ufunc rules in that code. But I followed the old-style Numeric coercion rules for the rest of it (because I was adapting Numeric).

Right now, unless there are strong objections, I'm leaning toward changing that so that the same coercion rules are used whenever a common type is needed. It would not be that difficult a change.

-Travis
From: Sebastian H. <ha...@ms...> - 2006-08-25 03:59:27
|
Travis Oliphant wrote:
> Not that many. I did abstract the notion to a C-API:
> PyArray_ConvertToCommonType and implemented the
> scalars-don't-cause-upcasting part of the ufunc rules in that code.
> But, I followed the old-style Numeric coercion rules for the rest of it
> (because I was adapting Numeric).
>
> Right now, unless there are strong objections, I'm leaning to changing
> that so that the same coercion rules are used whenever a common type is
> needed.

If you mean keeping the ufunc rules (which seem more liberal, fix my problem ;-) and might make using float32 in general more painless), I would be all for it. Simplifying is always good in the long term.

Cheers,
Sebastian

> It would not be that difficult of a change.
From: Travis O. <oli...@ie...> - 2006-08-25 04:03:07
|
Sebastian Haase wrote:
> Are you saying the ufunc-rules would convert "int32-float32" to float64 and
> hence make my code "just work" !?

This is now the behavior in SVN. Note that this is different from both Numeric (which gave an error) and numarray (which coerced to float32).

But, it is consistent with how mixed-types are handled in calculations and is thus an easier rule to explain.

Thanks for the testing.

-Travis
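The behavior described above, concatenation following the same promotion as calculations, can be checked with a short sketch. It assumes a present-day NumPy installation rather than the 2006 SVN snapshot, and the array names and values are invented for illustration.

```python
import numpy as np

# Hypothetical stand-ins for a float32 line profile and its int32 x-axis.
y_data = np.array([0.5, 1.5, 2.5], dtype=np.float32)
x_axis = np.arange(3, dtype=np.int32)

# hstack (via concatenate) promotes to a type both inputs can be safely cast to.
stacked = np.hstack((x_axis, y_data))

# The same common type, queried directly without building an array.
common = np.result_type(np.int32, np.float32)

print(stacked.dtype, common)
```

Both printed values should name float64, since that is the smallest type that holds int32 and float32 without loss.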
From: Sebastian H. <ha...@ms...> - 2006-08-25 15:18:20
|
was: Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

Travis Oliphant wrote:
> This is now the behavior in SVN. Note that this is different from both
> Numeric (which gave an error) and numarray (which coerced to float32).
>
> But, it is consistent with how mixed-types are handled in calculations
> and is thus an easier rule to explain.
>
> Thanks for the testing.
>
> -Travis

After sleeping on this, I have been thinking about the cases where one would use float32 in the first place. In my case yesterday, where I only had a 1d line profile of my data, I was of course OK with coercion to float64. But if you are working with 3D image data (as in medicine) or large 2D images (as in astronomy), I would assume the reason to use float32 is that computer memory is too tight to afford 64 bits per pixel. This is probably why numarray tried to keep float32.
Float32 can handle a few more digits of precision than int16, but not as many as int32. But I find that I almost always have int32s only because it's the default, whereas I choose float32 deliberately to save memory.

How hard would it be to change the rules back to the numarray behavior? Who would be negatively affected? And who positively?

Thanks for the great work.
Sebastian
From: Travis O. <oli...@ie...> - 2006-08-25 18:50:33
|
Sebastian Haase wrote:
>> This is now the behavior in SVN. Note that this is different from both
>> Numeric (which gave an error) and numarray (which coerced to float32).
>>
>> But, it is consistent with how mixed-types are handled in calculations
>> and is thus an easier rule to explain.
>
> How hard would it be to change the rules back to the numarray behavior ?

It wouldn't be hard, but I'm not so sure that's a good idea. I do see the logic behind that approach and it is worthy of some discussion. I'll give my current opinion:

The reason I changed the behavior is to get consistency so there is one set of rules on mixed-type interaction to explain. You can always do what you want by force-casting your int32 arrays to float32. There will always be some people who don't like whichever behavior is selected, but we are trying to move NumPy in a direction of consistency with fewer exceptions to explain (although this is a guideline and not an absolute requirement).

Mixed-type interaction is always somewhat ambiguous. Now there is a consistent rule for both universal functions and other functions (move to a precision where both can be safely cast to --- unless one is a scalar and then its precision is ignored).

If you don't want that to happen, then be clear about what data-type should be used by casting yourself. In this case, we should probably not try and guess about what users really want in mixed data-type situations.

-Travis
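The rule and the suggested workaround can be sketched as follows. This assumes a current NumPy; the array names and values are illustrative only, and the scalar case is shown with a plain Python float.

```python
import numpy as np

# Hypothetical float32 image row and int32 x-axis values.
image_row = np.array([0.5, 1.5, 2.5], dtype=np.float32)
x_axis = np.arange(3, dtype=np.int32)

# Force-casting the int32 array first keeps the stacked result in float32.
# The caller explicitly accepts float32 precision for the integer values.
stacked32 = np.hstack((x_axis.astype(np.float32), image_row))

# A Python scalar operand does not cause upcasting of the float32 array.
scaled = image_row * 2.0

# An int32 *array* operand does: both types must be representable safely.
mixed = image_row * x_axis

print(stacked32.dtype, scaled.dtype, mixed.dtype)
```

Only the last expression moves to float64; the explicit cast and the scalar multiplication both stay at 4 bytes per element.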
From: Charles R H. <cha...@gm...> - 2006-08-25 19:19:34
|
Hi,

On 8/25/06, Travis Oliphant <oli...@ie...> wrote:
> Mixed-type interaction is always somewhat ambiguous. Now there is a
> consistent rule for both universal functions and other functions (move
> to a precision where both can be safely cast to --- unless one is a
> scalar and then its precision is ignored).

I think this is a good thing. It makes it easy to remember what the function will produce. The only oddity the user has to be aware of is that int32 has more precision than float32. Probably not obvious to a newbie, but a newbie will probably be using the double defaults anyway. Which is another good reason for making double the default type.

> If you don't want that to happen, then be clear about what data-type
> should be used by casting yourself. In this case, we should probably
> not try and guess about what users really want in mixed data-type
> situations.
I wonder if it would be reasonable to add the dtype keyword to hstack itself? Hmmm, what are the conventions for coercions to lesser precision? That could get messy indeed; maybe it is best to leave such things alone and let the programmer deal with it by rethinking the program. In the float case that would probably mean using a float32 array instead of an int32 array.

Chuck
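One way to picture the dtype idea without touching hstack itself is a small wrapper. `hstack_as` is a hypothetical helper written for this sketch, not a NumPy function, and it sidesteps the convention question by always force-casting, including to lesser precision.

```python
import numpy as np


def hstack_as(arrays, dtype):
    """Hypothetical helper: cast each input to `dtype`, then stack horizontally."""
    return np.hstack([np.asarray(a).astype(dtype) for a in arrays])


# Illustrative inputs: int32 x-axis values and a float32 profile.
x_axis = np.arange(4, dtype=np.int32)
profile = np.linspace(0.0, 1.0, 4).astype(np.float32)

as_float32 = hstack_as((x_axis, profile), np.float32)  # caller opts into float32
as_float64 = hstack_as((x_axis, profile), np.float64)  # matches the default promotion

print(as_float32.dtype, as_float64.dtype)
```

Because `astype` casts unconditionally, the caller takes responsibility for any precision lost when the requested dtype is narrower than the inputs.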
From: Sebastian H. <ha...@ms...> - 2006-08-25 19:32:32
|
On Friday 25 August 2006 12:19, Charles R Harris wrote:
> On 8/25/06, Travis Oliphant <oli...@ie...> wrote:
> > Mixed-type interaction is always somewhat ambiguous. Now there is a
> > consistent rule for both universal functions and other functions (move
> > to a precision where both can be safely cast to --- unless one is a
> > scalar and then its precision is ignored).
>
> I think this is a good thing. It makes it easy to remember what the
> function will produce. The only oddity the user has to be aware of is that
> int32 has more precision than float32. Probably not obvious to a newbie,
> but a newbie will probably be using the double defaults anyway. Which is
> another good reason for making double the default type.
Not true: a newbie to numpy (or to numeric programming) working in medical imaging or astronomy would still get float32 data to work with. He or she would do some operations on the data and be surprised that memory (or disk space) blows up.

> > If you don't want that to happen, then be clear about what data-type
> > should be used by casting yourself. In this case, we should probably
> > not try and guess about what users really want in mixed data-type
> > situations.
>
> I wonder if it would be reasonable to add the dtype keyword to hstack
> itself? Hmmm, what are the conventions for coercions to lesser precision?
> That could get messy indeed, maybe it is best to leave such things alone
> and let the programmer deal with it by rethinking the program. In the float
> case that would probably mean using a float32 array instead of an int32
> array.
>
> Chuck

I think my main argument is that float32 is a very common type in (large) data processing, chosen to save memory. But I don't know how many exceptions like an extra "float32 rule" we can handle.

I would like to hear what the numarray (STScI) folks think about this. Who else works with data on the order of GBs!?

- Sebastian
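The memory concern can be made concrete with a small sketch. It assumes a current NumPy, and the volume shape is an arbitrary small stand-in for real image data.

```python
import numpy as np

# Hypothetical small volume; real medical or astronomical data would be far larger.
shape = (64, 64, 64)
volume32 = np.zeros(shape, dtype=np.float32)
offsets = np.ones(shape, dtype=np.int32)

# Mixed int32/float32 arithmetic promotes to float64: twice the bytes per pixel.
promoted = volume32 + offsets

# Casting the integer operand first keeps the result at 4 bytes per pixel.
kept = volume32 + offsets.astype(np.float32)

print(volume32.nbytes, promoted.nbytes, kept.nbytes)
```

The promoted result takes exactly twice the memory of the float32 input, which is the blow-up a user holding gigabyte-sized float32 data would see after a single mixed-type operation.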