Thread: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

[Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

From: Sebastian H. <ha...@ms...> - 2006-08-25 00:09:51

Hi,
I get
TypeError: array cannot be safely cast to required type

when calling hstack() (which calls concatenate())
on two arrays, one int32 and one float32.

I understand now that an int32 cannot be safely converted into a float32,
but why does concatenate not automatically
upcast to float64?

Is this really required to be done *explicitly* every time?
** In general it makes float32 cumbersome to use. **

(Background: my large image data is float32 (float64 would require too much
memory), and the hstack call happens inside the scipy plt module when I try to
get a 1d line profile and the "y_data" is hstack'ed with the x-axis values
(int32).)

Thanks,
Sebastian Haase
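
For reference, the explicit cast asked about above can be sketched roughly as follows. The array names (`x_axis`, `y_data`) and their sizes are illustrative assumptions, not taken from the scipy plt module; the sketch only shows giving both operands one dtype by hand before stacking.

```python
import numpy as np

# Hypothetical stand-ins for the line-profile data described above.
x_axis = np.arange(5, dtype=np.int32)                  # integer x positions
y_data = np.linspace(0.0, 1.0, 5).astype(np.float32)   # float32 image samples

# Cast explicitly so that both operands share one dtype before stacking.
stacked64 = np.hstack((x_axis.astype(np.float64), y_data.astype(np.float64)))
stacked32 = np.hstack((x_axis.astype(np.float32), y_data))

print(stacked64.dtype, stacked32.dtype)
```

The float32 variant keeps memory use down, at the cost of rounding any int32 value whose magnitude exceeds 2**24.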

Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

From: Travis O. <oli...@ee...> - 2006-08-25 00:28:21

Sebastian Haase wrote:

>Hi,
>I get
>TypeError: array cannot be safely cast to required type
>
>when calling hstack()  ( which calls  concatenate() )
>on two arrays being a int32 and a float32 respectively.
>
>I understand now that a int32 cannot be safely converted into a float32
>but why does  concatenate  not automatically
>up(?) cast to float64  ??
>  
>
Basically, NumPy is following Numeric's behavior of raising an error in 
this case of unsafe casting in concatenate.  For functions that are not 
universal-function objects, mixed-type behavior works basically just 
like Numeric did (using the ordering of the types to determine which one 
to choose as the output).  

It could be argued that the ufunc-rules should be followed instead.

-Travis
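
A minimal sketch of the ufunc rule referred to here, assuming two small example arrays (`a` and `b` are made-up names): when a universal function such as `np.add` receives an int32 array and a float32 array, the result uses a type to which both inputs can be safely cast.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([0.5, 1.5, 2.5], dtype=np.float32)

# np.add is a ufunc; for two array operands the output type is one
# that can hold every value of both input types.
total = np.add(a, b)
print(total.dtype)  # float64
```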

Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

From: Sebastian H. <ha...@ms...> - 2006-08-25 00:47:10

On Thursday 24 August 2006 17:28, Travis Oliphant wrote:
> Sebastian Haase wrote:
> >Hi,
> >I get
> >TypeError: array cannot be safely cast to required type
> >
> >when calling hstack()  ( which calls  concatenate() )
> >on two arrays being a int32 and a float32 respectively.
> >
> >I understand now that a int32 cannot be safely converted into a float32
> >but why does  concatenate  not automatically
> >up(?) cast to float64  ??
>
> Basically, NumPy is following Numeric's behavior of raising an error in
> this case of unsafe casting in concatenate.  For functions that are not
> universal-function objects, mixed-type behavior works basically just
> like Numeric did (using the ordering of the types to determine which one
> to choose as the output).
>
> It could be argued that the ufunc-rules should be followed instead.
>
> -Travis
>
Are you saying the ufunc rules would convert "int32-float32" to float64 and
hence make my code "just work"?

And why are there two sets of rules?
Are the Numeric rules used in many places?

Thanks,
Sebastian Haase

Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

From: Travis O. <oli...@ie...> - 2006-08-25 03:07:07

Sebastian Haase wrote:
> On Thursday 24 August 2006 17:28, Travis Oliphant wrote:
>   
> Are you saying the  ufunc-rules  would convert "int32-float32" to float64  and 
> hence make my code "just work" !?
>   
Yes, that's what I'm saying.  Note that you would get float64 out; if
you didn't want that, you would have to specify the type explicitly.

> And why are there two sets of rules ?
>   
Because there are two modules (multiarray and umath) where the 
functionality is implemented.

> Are the Numeric rules used at many places ?
>   
Not that many.  I did abstract the notion to a C-API:  
PyArray_ConvertToCommonType and implemented the 
scalars-don't-cause-upcasting part of the ufunc rules in that code.   
But, I followed the old-style Numeric coercion rules for the rest of it 
(because I was adapting Numeric).

Right now, unless there are strong objections, I'm leaning to changing 
that so that the same coercion rules are used whenever a common type is 
needed. 

It would not be a difficult change.

-Travis
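
The "scalars don't cause upcasting" part of the rule can be illustrated with a short sketch. The array `image` is a made-up example, and only plain Python scalars are used here, since those are the case the sketch is sure about.

```python
import numpy as np

image = np.ones((4, 4), dtype=np.float32)

# Python scalars do not force an upcast of the array operand.
scaled = image * 2.0
shifted = image + 3

# A float64 *array* operand, by contrast, does upcast the result.
mixed = image * np.array([2.0])

print(scaled.dtype, shifted.dtype, mixed.dtype)
```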

Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

From: Sebastian H. <ha...@ms...> - 2006-08-25 03:59:27

Travis Oliphant wrote:
> Sebastian Haase wrote:
>> On Thursday 24 August 2006 17:28, Travis Oliphant wrote:
>>   
>> Are you saying the  ufunc-rules  would convert "int32-float32" to float64  and 
>> hence make my code "just work" !?
>>   
> Yes.  That's what I'm saying (but you would get float64 out --- but if 
> you didn't want that then you would have to be specific).
> 
>> And why are there two sets of rules ?
>>   
> Because there are two modules (multiarray and umath) where the 
> functionality is implemented.
> 
>> Are the Numeric rules used at many places ?
>>   
> Not that many.  I did abstract the notion to a C-API:  
> PyArray_ConvertToCommonType and implemented the 
> scalars-don't-cause-upcasting part of the ufunc rules in that code.   
> But, I followed the old-style Numeric coercion rules for the rest of it 
> (because I was adapting Numeric).
> 
> Right now, unless there are strong objections, I'm leaning to changing 
> that so that the same coercion rules are used whenever a common type is 
> needed. 

If you mean keeping the ufunc rules (which seem more liberal, fix my 
problem ;-) and might make using float32 in general more painless) - I 
would be all for it ...   simplifying is always good in the long term ...

Cheers,
Sebastian

> 
> It would not be that difficult of a change.

Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because of casting rules

From: Travis O. <oli...@ie...> - 2006-08-25 04:03:07

Sebastian Haase wrote:
> On Thursday 24 August 2006 17:28, Travis Oliphant wrote:
>   
>> Sebastian Haase wrote:
>>     
>>> Hi,
>>> I get
>>> TypeError: array cannot be safely cast to required type
>>>
>>> when calling hstack()  ( which calls  concatenate() )
>>> on two arrays being a int32 and a float32 respectively.
>>>
>>> I understand now that a int32 cannot be safely converted into a float32
>>> but why does  concatenate  not automatically
>>> up(?) cast to float64  ??
>>>       
>> Basically, NumPy is following Numeric's behavior of raising an error in
>> this case of unsafe casting in concatenate.  For functions that are not
>> universal-function objects, mixed-type behavior works basically just
>> like Numeric did (using the ordering of the types to determine which one
>> to choose as the output).
>>
>> It could be argued that the ufunc-rules should be followed instead.
>>
>> -Travis
>>
>>     
> Are you saying the  ufunc-rules  would convert "int32-float32" to float64  and 
> hence make my code "just work" !?
>   

This is now the behavior in SVN.   Note that this is different from both 
Numeric (which gave an error) and numarray (which coerced to float32). 

But, it is consistent with how mixed-types are handled in calculations 
and is thus an easier rule to explain.

Thanks for the testing.

-Travis
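
Assuming a NumPy build that includes the change described above, the new behavior can be checked with a sketch like this one (the arrays are made-up examples): `hstack` and `concatenate` now pick the same result type that arithmetic on the two arrays would.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([0.5, 1.5], dtype=np.float32)

stacked = np.hstack((a, b))
joined = np.concatenate((a, b))
arith = a[:2] + b  # ufunc arithmetic on the same pair of dtypes

print(stacked.dtype, joined.dtype, arith.dtype)
```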

[Numpy-discussion] coercion rules for float32 in numpy are different from numarray

From: Sebastian H. <ha...@ms...> - 2006-08-25 15:18:20

was: Re: [Numpy-discussion] hstack(arr_Int32, arr_float32) fails because 
of casting  rules
Travis Oliphant wrote:
> Sebastian Haase wrote:
>> On Thursday 24 August 2006 17:28, Travis Oliphant wrote:
>>   
>>> Sebastian Haase wrote:
>>>     
>>>> Hi,
>>>> I get
>>>> TypeError: array cannot be safely cast to required type
>>>>
>>>> when calling hstack()  ( which calls  concatenate() )
>>>> on two arrays being a int32 and a float32 respectively.
>>>>
>>>> I understand now that a int32 cannot be safely converted into a float32
>>>> but why does  concatenate  not automatically
>>>> up(?) cast to float64  ??
>>>>       
>>> Basically, NumPy is following Numeric's behavior of raising an error in
>>> this case of unsafe casting in concatenate.  For functions that are not
>>> universal-function objects, mixed-type behavior works basically just
>>> like Numeric did (using the ordering of the types to determine which one
>>> to choose as the output).
>>>
>>> It could be argued that the ufunc-rules should be followed instead.
>>>
>>> -Travis
>>>
>>>     
>> Are you saying the  ufunc-rules  would convert "int32-float32" to float64  and 
>> hence make my code "just work" !?
>>   
> 
> This is now the behavior in SVN.   Note that this is different from both 
> Numeric (which gave an error) and numarray (which coerced to float32). 
> 
> But, it is consistent with how mixed-types are handled in calculations 
> and is thus an easier rule to explain.
> 
> Thanks for the testing.
> 
> -Travis
After sleeping on this, I have been thinking about the cases where one
would use float32 in the first place.
In my case yesterday, where I only had a 1d line profile of my data, I was
of course OK with coercion to float64.
But if you are working with 3D image data (as in medicine) or large 2D
images (as in astronomy), I would assume the reason to use float32 is that
computer memory is too tight to afford 64 bits per pixel.
This is probably why numarray tried to keep float32.
Float32 can handle a few more digits of precision than int16, but not as
many as int32.  But I find that I almost always have int32s only because
it's the default, whereas I have float32 as a clear choice to save memory.

How hard would it be to change the rules back to the numarray behavior ?
Who would be negatively affected ?
And who positively ?

Thanks for the great work.
Sebastian
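
The precision comparison above can be made concrete with a small sketch. float32 has a 24-bit significand, so it represents every int16 value exactly but not every int32 value; the specific test value `2**24 + 1` is chosen here for illustration.

```python
import numpy as np

# float32 holds integers exactly only up to 2**24.
big = 2**24 + 1                      # fits easily in int32
as_f32 = np.float32(big)
as_f64 = np.float64(big)

print(int(as_f32) == big)   # False: rounded to a neighbouring float32
print(int(as_f64) == big)   # True

# Every int16 value survives a round trip through float32.
info = np.iinfo(np.int16)
all_i16 = np.arange(info.min, info.max + 1, dtype=np.int32).astype(np.int16)
round_trip = all_i16.astype(np.float32).astype(np.int16)
print(np.array_equal(all_i16, round_trip))  # True
```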

Re: [Numpy-discussion] coercion rules for float32 in numpy are different from numarray

From: Travis O. <oli...@ie...> - 2006-08-25 18:50:33

Sebastian Haase wrote:
>> This is now the behavior in SVN.   Note that this is different from both 
>> Numeric (which gave an error) and numarray (which coerced to float32). 
>>
>> But, it is consistent with how mixed-types are handled in calculations 
>> and is thus an easier rule to explain.
>>
>> Thanks for the testing.
>>
>> -Travis
>>     
>
> How hard would it be to change the rules back to the numarray behavior ?
>   
It wouldn't be hard, but I'm not so sure that's a good idea.   I do see 
the logic behind that approach and it is worthy of some discussion.   
I'll give my current opinion:

The reason I changed the behavior is to get consistency so there is one 
set of rules on mixed-type interaction to explain. You can always do 
what you want by force-casting your int32 arrays to float32.    There 
will always be some people who don't like whichever behavior is 
selected, but we are trying to move NumPy in a direction of consistency 
with fewer exceptions to explain (although this is a guideline and not 
an absolute requirement).

Mixed-type interaction is always somewhat ambiguous.  Now there is a 
consistent rule for both universal functions and other functions (move 
to a precision where both can be safely cast to --- unless one is a 
scalar and then its precision is ignored).

If you don't want that to happen, then be clear about what data-type 
should be used by casting yourself.   In this case, we should probably 
not try and guess about what users really want in mixed data-type 
situations.

-Travis
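
The rule "move to a precision where both can be safely cast" can be explored with NumPy's own casting queries. This is only a sketch; the variable names `common` and `small` are made up for the example.

```python
import numpy as np

# int32 does not fit safely into float32, but it does into float64.
print(np.can_cast(np.int32, np.float32))   # False
print(np.can_cast(np.int32, np.float64))   # True

# The common type is therefore float64 for int32 with float32 ...
common = np.promote_types(np.int32, np.float32)

# ... whereas int16 fits into float32, so float32 is kept.
small = np.promote_types(np.int16, np.float32)

print(common, small)
```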

Re: [Numpy-discussion] coercion rules for float32 in numpy are different from numarray

From: Charles R H. <cha...@gm...> - 2006-08-25 19:19:34

Hi,

On 8/25/06, Travis Oliphant <oli...@ie...> wrote:
>
> Sebastian Haase wrote:
> >> This is now the behavior in SVN.   Note that this is different from both
> >> Numeric (which gave an error) and numarray (which coerced to float32).
> >>
> >> But, it is consistent with how mixed-types are handled in calculations
> >> and is thus an easier rule to explain.
> >>
> >> Thanks for the testing.
> >>
> >> -Travis
> >>
> >
> > How hard would it be to change the rules back to the numarray behavior ?
> >
> It wouldn't be hard, but I'm not so sure that's a good idea.   I do see
> the logic behind that approach and it is worthy of some discussion.
> I'll give my current opinion:
>
> The reason I changed the behavior is to get consistency so there is one
> set of rules on mixed-type interaction to explain. You can always do
> what you want by force-casting your int32 arrays to float32.    There
> will always be some people who don't like whichever behavior is
> selected, but we are trying to move NumPy in a direction of consistency
> with fewer exceptions to explain (although this is a guideline and not
> an absolute requirement).
>
> Mixed-type interaction is always somewhat ambiguous.  Now there is a
> consistent rule for both universal functions and other functions (move
> to a precision where both can be safely cast to --- unless one is a
> scalar and then its precision is ignored).

I think this is a good thing. It makes it easy to remember what the function
will produce. The only oddity the user has to be aware of is that int32 has
more precision than float32. Probably not obvious to a newbie, but a newbie
will probably be using the double defaults anyway. Which is another good
reason for making double the default type.

> If you don't want that to happen, then be clear about what data-type
> should be used by casting yourself.   In this case, we should probably
> not try and guess about what users really want in mixed data-type
> situations.

I wonder if it would be reasonable to add the dtype keyword to hstack
itself? Hmmm, what are the conventions for coercions to lesser precision?
That could get messy indeed, maybe it is best to leave such things alone and
let the programmer deal with it by rethinking the program. In the float case
that would probably mean using a float32 array instead of an int32 array.

Chuck
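
One way to picture the dtype idea without changing hstack itself is a small wrapper. `hstack_as` below is a hypothetical helper written for this sketch, not an existing NumPy function; it casts every input to the requested type and then stacks.

```python
import numpy as np

def hstack_as(arrays, dtype):
    """Hypothetical helper: cast every input to `dtype`, then hstack.

    The cast is unchecked, so large int32 values may be rounded when
    the target is float32.
    """
    return np.hstack([np.asarray(a).astype(dtype, copy=False) for a in arrays])

x = np.arange(3, dtype=np.int32)
y = np.array([0.5, 1.5], dtype=np.float32)

profile = hstack_as((x, y), np.float32)
print(profile.dtype)
```

Because the caller names the target type, the question of which lossy coercions to allow stays with the programmer, as suggested above.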

Re: [Numpy-discussion] coercion rules for float32 in numpy are different from numarray

From: Sebastian H. <ha...@ms...> - 2006-08-25 19:32:32

On Friday 25 August 2006 12:19, Charles R Harris wrote:
> Hi,
>
> On 8/25/06, Travis Oliphant <oli...@ie...> wrote:
> > Sebastian Haase wrote:
> > >> This is now the behavior in SVN.   Note that this is different from both
> > >> Numeric (which gave an error) and numarray (which coerced to float32).
> > >>
> > >> But, it is consistent with how mixed-types are handled in calculations
> > >> and is thus an easier rule to explain.
> > >>
> > >> Thanks for the testing.
> > >>
> > >> -Travis
> > >
> > > How hard would it be to change the rules back to the numarray behavior ?
> >
> > It wouldn't be hard, but I'm not so sure that's a good idea.   I do see
> > the logic behind that approach and it is worthy of some discussion.
> > I'll give my current opinion:
> >
> > The reason I changed the behavior is to get consistency so there is one
> > set of rules on mixed-type interaction to explain. You can always do
> > what you want by force-casting your int32 arrays to float32.    There
> > will always be some people who don't like whichever behavior is
> > selected, but we are trying to move NumPy in a direction of consistency
> > with fewer exceptions to explain (although this is a guideline and not
> > an absolute requirement).
> >
> > Mixed-type interaction is always somewhat ambiguous.  Now there is a
> > consistent rule for both universal functions and other functions (move
> > to a precision where both can be safely cast to --- unless one is a
> > scalar and then its precision is ignored).
>
> I think this is a good thing. It makes it easy to remember what the
> function will produce. The only oddity the user has to be aware of is that
> int32 has more precision than float32. Probably not obvious to a newbie,
> but a newbie will probably be using the double defaults anyway. Which is
> another good reason for making double the default type.
Not true - a numpy (or numeric-programming) newbie working in medical imaging
or astronomy would still get float32 data to work with. He/She would do some
operations on the data and be surprised that memory (or disk space) blows up.

>
> > If you don't want that to happen, then be clear about what data-type
> > should be used by casting yourself.   In this case, we should probably
> > not try and guess about what users really want in mixed data-type
> > situations.
>
> I wonder if it would be reasonable to add the dtype keyword to hstack
> itself? Hmmm, what are the conventions for coercions to lesser precision?
> That could get messy indeed, maybe it is best to leave such things alone
> and let the programmer deal with it by rethinking the program. In the float
> case that would probably mean using a float32 array instead of an int32
> array.
>
> Chuck

I think my main argument is that float32 is a very common type in (large) data
processing to save memory.
But I don't know how many exceptions like an extra "float32 rule" we can
handle ...

I would like to hear what the numarray (STScI) folks think about this.  Who
else works with data on the order of GBs?

- Sebastian
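
The memory concern can be sketched with a small stand-in for a large image. The shapes and names are assumptions for illustration, and the promotion to float64 assumes the ufunc rule discussed in this thread.

```python
import numpy as np

image = np.zeros((256, 256), dtype=np.float32)   # stand-in for a large image
offsets = np.zeros((256, 256), dtype=np.int32)

promoted = image + offsets                       # float64 under the ufunc rule
kept = image + offsets.astype(np.float32)        # stays float32

print(image.nbytes, promoted.nbytes, kept.nbytes)
```

The promoted result takes twice the memory of the float32 input, which is the blow-up described above; casting the int32 operand first avoids it.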