From: Stefan v. d. W. <st...@su...> - 2006-10-18 00:25:09
|
Hi all, Some of you may have seen the interesting thread on Fortran-ordering earlier. I thought it might be fun to set up a short quiz which tests your knowledge on the topic. If you're up for the challenge, take a look at http://mentat.za.net/numpy/quiz I won't be held liable for any emotional trauma caused, self-inflicted wounds or brain-core meltdowns. Cheers St=E9fan |
From: Lisandro D. <da...@gm...> - 2006-10-18 01:13:15
|
I was surprised by this In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order=3D'F') Out[14]: array([[1, 5], [4, 3], [2, 6]]) In [15]: array([1,2,3,4,5,6]).reshape((3,2),order=3D'F') Out[15]: array([[1, 2], [3, 4], [5, 6]]) On 10/17/06, Stefan van der Walt <st...@su...> wrote: > Hi all, > > Some of you may have seen the interesting thread on Fortran-ordering > earlier. I thought it might be fun to set up a short quiz which tests > your knowledge on the topic. > > If you're up for the challenge, take a look at > > http://mentat.za.net/numpy/quiz > > I won't be held liable for any emotional trauma caused, self-inflicted > wounds or brain-core meltdowns. > > Cheers > St=E9fan > > ------------------------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job ea= sier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronim= o > http://sel.as-us.falkag.net/sel?cmd=3Dlnk&kid=3D120709&bid=3D263057&dat= =3D121642 > _______________________________________________ > Numpy-discussion mailing list > Num...@li... > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > --=20 Lisandro Dalc=EDn --------------- Centro Internacional de M=E9todos Computacionales en Ingenier=EDa (CIMEC) Instituto de Desarrollo Tecnol=F3gico para la Industria Qu=EDmica (INTEC) Consejo Nacional de Investigaciones Cient=EDficas y T=E9cnicas (CONICET) PTLC - G=FCemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 |
From: Travis O. <oli...@ie...> - 2006-10-18 01:37:15
|
Lisandro Dalcin wrote: > I was surprised by this > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > Out[14]: > array([[1, 5], > [4, 3], > [2, 6]]) > > In [15]: array([1,2,3,4,5,6]).reshape((3,2),order='F') > Out[15]: > array([[1, 2], > [3, 4], > [5, 6]]) > This is a bug that was recently fixed in SVN. -Travis |
From: Charles R H. <cha...@gm...> - 2006-10-18 01:47:23
|
On 10/17/06, Lisandro Dalcin <da...@gm...> wrote: > > I was surprised by this > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > Out[14]: > array([[1, 5], > [4, 3], > [2, 6]]) This one still looks wrong. In [15]: array([1,2,3,4,5,6]).reshape((3,2),order='F') > Out[15]: > array([[1, 2], > [3, 4], > [5, 6]]) This one is fixed, In [3]: array([[1,2,3,4,5,6]]).reshape((3,2),order='F') Out[3]: array([[1, 4], [2, 5], [3, 6]]) I also don't understand why a copy is returned if 'F' just fiddles with the indices and strides; the underlying data should be the same, just the view changes. FWIW, I think both examples should be returning views. Chuck |
From: Travis O. <oli...@ie...> - 2006-10-18 01:58:42
|
Charles R Harris wrote: > > > On 10/17/06, *Lisandro Dalcin* <da...@gm... > <mailto:da...@gm...>> wrote: > > I was surprised by this > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > Out[14]: > array([[1, 5], > [4, 3], > [2, 6]]) > > > This one still looks wrong. > > In [15]: array([1,2,3,4,5,6]).reshape((3,2),order='F') > Out[15]: > array([[1, 2], > [3, 4], > [5, 6]]) > > > > This one is fixed, > > In [3]: array([[1,2,3,4,5,6]]).reshape((3,2),order='F') > Out[3]: > array([[1, 4], > [2, 5], > [3, 6]]) > > I also don't understand why a copy is returned if 'F' just fiddles > with the indices and strides; the underlying data should be the same, > just the view changes. FWIW, I think both examples should be returning > views. You are right, it doesn't need to. My check is not general enough. It can be challenging to come up with a general way to differentiate the view-vs-copy situation and I struggled with it. In this case, it's the fact that while self->nd > 1, the other dimensions are only of shape 1 and so don't really matter. If you could come up with some kind of striding check that would distinguish the two cases, I would appreciate it. -Travis |
From: Charles R H. <cha...@gm...> - 2006-10-18 02:21:06
|
On 10/17/06, Travis Oliphant <oli...@ie...> wrote: > > Charles R Harris wrote: > > > > > > On 10/17/06, *Lisandro Dalcin* <da...@gm... > > <mailto:da...@gm...>> wrote: > > > > I was surprised by this > > > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > > Out[14]: > > array([[1, 5], > > [4, 3], > > [2, 6]]) > > > > > > This one still looks wrong. > > > > In [15]: array([1,2,3,4,5,6]).reshape((3,2),order='F') > > Out[15]: > > array([[1, 2], > > [3, 4], > > [5, 6]]) > > > > > > > > This one is fixed, > > > > In [3]: array([[1,2,3,4,5,6]]).reshape((3,2),order='F') > > Out[3]: > > array([[1, 4], > > [2, 5], > > [3, 6]]) > > > > I also don't understand why a copy is returned if 'F' just fiddles > > with the indices and strides; the underlying data should be the same, > > just the view changes. FWIW, I think both examples should be returning > > views. > > You are right, it doesn't need to. My check is not general enough. > > It can be challenging to come up with a general way to differentiate the > view-vs-copy situation and I struggled with it. In this case, it's the > fact that while self->nd > 1, the other dimensions are only of shape 1 > and so don't really matter. If you could come up with some kind of > striding check that would distinguish the two cases, I would appreciate > it. I suppose the problem is mostly in discontiguous arrays. Hmmm..., this isn't too different that reshaping the transpose. a.reshape((m,n),order='F' ~ a.reshape((n,m)).T.reshape(m,n) for instance: In [26]: a Out[26]: array([[1, 2, 3], [4, 5, 6]]) In [27]: a.reshape((3,2)).T.reshape((2,3)) Out[27]: array([[1, 3, 5], [2, 4, 6]]) In [28]: a.reshape((2,3), order='F') Out[28]: array([[1, 2, 3], [4, 5, 6]]) Where I actually think the second reshape is correct and the third incorrect. This has the advantage that *all* the steps return views. Chuck |
From: Charles R H. <cha...@gm...> - 2006-10-18 02:53:55
|
On 10/17/06, Charles R Harris <cha...@gm...> wrote: > > > > On 10/17/06, Travis Oliphant <oli...@ie...> wrote: > > > > Charles R Harris wrote: > > > > > > > > > On 10/17/06, *Lisandro Dalcin* <da...@gm... > > > <mailto:da...@gm...>> wrote: > > > > > > I was surprised by this > > > > > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > > > Out[14]: > > > array([[1, 5], > > > [4, 3], > > > [2, 6]]) > > > > > > > > > This one still looks wrong. > > > > > > In [15]: array([1,2,3,4,5,6]).reshape((3,2),order='F') > > > Out[15]: > > > array([[1, 2], > > > [3, 4], > > > [5, 6]]) > > > > > > > > > > > > This one is fixed, > > > > > > In [3]: array([[1,2,3,4,5,6]]).reshape((3,2),order='F') > > > Out[3]: > > > array([[1, 4], > > > [2, 5], > > > [3, 6]]) > > > > > > I also don't understand why a copy is returned if 'F' just fiddles > > > with the indices and strides; the underlying data should be the same, > > > just the view changes. FWIW, I think both examples should be returning > > > > > views. > > > > You are right, it doesn't need to. My check is not general enough. > > > > It can be challenging to come up with a general way to differentiate the > > view-vs-copy situation and I struggled with it. In this case, it's the > > fact that while self->nd > 1, the other dimensions are only of shape 1 > > and so don't really matter. If you could come up with some kind of > > striding check that would distinguish the two cases, I would appreciate > > it. > > > I suppose the problem is mostly in discontiguous arrays. Hmmm..., this > isn't too different that reshaping the transpose. > > a.reshape((m,n),order='F' ~ a.reshape((n,m)).T.reshape(m,n) > Erm, brainfart. Make that a.reshape((m,n),order='F') ~ a.reshape((n,m)).T That's how I think of it, anyway. Actual layout in memory is not affected until something like ascontiguosarray is called. Chuck |
From: Travis O. <oli...@ie...> - 2006-10-18 04:42:30
|
> > You are right, it doesn't need to. My check is not general > enough. > > It can be challenging to come up with a general way to > differentiate the > view-vs-copy situation and I struggled with it. In this case, > it's the > fact that while self->nd > 1, the other dimensions are only of > shape 1 > and so don't really matter. If you could come up with some > kind of > striding check that would distinguish the two cases, I would > appreciate it. > > > I suppose the problem is mostly in discontiguous arrays. Hmmm..., > this isn't too different that reshaping the transpose. > > a.reshape((m,n),order='F' ~ a.reshape((n,m)).T.reshape(m,n) > I just committed a change to the check-for-copy code to SVN. A copy will occur if an actual reshaping is taking place that involves more than just adding or deleting ones from the old shape and if 1) self is not "single-segment". Single-segment means either Fortran-order or C-order. If we don't have a single-segment array, then we have to get a (Fortran or C-) contiguous buffer to do any reshaping (there may be special cases when it's possible, but I haven't figure them out...) 2) self is single-segment but self.squeeze().ndim > 1 and it is in the "wrong" order from what was requested in the order argument (i.e. self is in C-contiguous order but a Fortran-order reshape was requested or self is in F-contiguous order but a C-order reshape was requested). If there are any cases satisfying these rules where a copy does not have to occur then let me know. -Travis |
From: A. M. A. <per...@gm...> - 2006-10-18 14:55:59
|
On 18/10/06, Travis Oliphant <oli...@ie...> wrote: > > I just committed a change to the check-for-copy code to SVN. A copy > will occur if an actual reshaping is taking place that involves more > than just adding or deleting ones from the old shape and if > > 1) self is not "single-segment". Single-segment means either > Fortran-order or C-order. If we don't have a single-segment array, then > we have to get a (Fortran or C-) contiguous buffer to do any reshaping > (there may be special cases when it's possible, but I haven't figure > them out...) Indeed it is often possible to reshape an array that is not contiguous; for example zeros(100)[::10] is not contiguous but can be reshaped to any shape without any copying. More generally, there are all sorts of peculiar arrangements that can be reshaped witout a copy, for example: input has lengths (8,2,3) and strides (39,9,3); output should have lengths (4,4,3,2), so take strides (156,39,6,3) Whether this sort of thing actually *occurs* very often, I don't know... > 2) self is single-segment but self.squeeze().ndim > 1 and it is in the > "wrong" order from what was requested in the order argument (i.e. self > is in C-contiguous order but a Fortran-order reshape was requested or > self is in F-contiguous order but a C-order reshape was requested). > > If there are any cases satisfying these rules where a copy does not have > to occur then let me know. There are. I came up with a general condition for describing exactly when you need to copy an array, but it's difficult to put into words. Here goes: (0) Dimensions of length 1 are an irrelevant annoyance. Conceptually they should be squeezed out before beginning and newaxised back in at the end. (What strides should they have?) Assume that I've done so from now on. (Actual code can just skip them appropriately.) (1) If the array looks like an evenly-strided 1D array that has been reshaped (so strides[i+-1]=lengths[i]*strides[i] for all i) it can always be reshaped without a copy. (For example, zeros(1000)[::10] can be reshaped arbitrarily and as many times as desired without copying.) (2) If the array does *not* look like an evenly-strided 1D array that has been reshaped, you can still reshape it without a copy if it looks like an array of such arrays, each of which can be reshaped separately. (For example, you can view an array of length (8,2,3) that you have to resize to an array of size (4,4,3,2) as two separate pieces: an array of size (2,3) that you have to resize to (3,2) and an array of size (8,) that you need to resize to size (4,4). Each of these pieces separately needs to look like a reshaped 1d array, but there need be no relation between them.) (The condition for when you can pull a smaller resize operation off the end is precisely when you can find a tail segment of each tuple that has the same product; in the case above, 2*3=3*2, so we could stop and do that reshape separately.) (3) If the array cannot be reshaped without a copy using the rules above, it cannot be reshaped without a copy at all. The python code in my previous message actually implements this rule, if you want a less vague description. I should also point out that views are not always more efficient than copies (because of cache concerns, and because a small view can prevent a giant block of memory from being deallocated); nevertheless it should be up to the user to copy when it helps. A. M. Archibald |
From: A. M. A. <per...@gm...> - 2006-10-20 07:01:51
|
On 18/10/06, Travis Oliphant <oli...@ie...> wrote: > If there are any cases satisfying these rules where a copy does not have > to occur then let me know. For example, zeros((4,4))[:,1].reshape((2,2)) need not be copied. I filed a bug in trac and supplied a patch to multiarray.c that avoids copies in PyArray_NewShape unless absolutely necessary. A. M. Archibald |
From: Travis O. <oli...@ee...> - 2006-10-20 07:31:13
|
A. M. Archibald wrote: >On 18/10/06, Travis Oliphant <oli...@ie...> wrote: > > > >>If there are any cases satisfying these rules where a copy does not have >>to occur then let me know. >> >> > >For example, zeros((4,4))[:,1].reshape((2,2)) need not be copied. > >I filed a bug in trac and supplied a patch to multiarray.c that avoids >copies in PyArray_NewShape unless absolutely necessary. > > > Very, very nice. Thanks. |
From: A. M. A. <per...@gm...> - 2006-10-18 05:57:40
Attachments:
stride.py
|
On 17/10/06, Travis Oliphant <oli...@ie...> wrote: > Charles R Harris wrote: > > > > In [3]: array([[1,2,3,4,5,6]]).reshape((3,2),order='F') > > Out[3]: > > array([[1, 4], > > [2, 5], > > [3, 6]]) > > > > I also don't understand why a copy is returned if 'F' just fiddles > > with the indices and strides; the underlying data should be the same, > > just the view changes. FWIW, I think both examples should be returning > > views. > > You are right, it doesn't need to. My check is not general enough. > > It can be challenging to come up with a general way to differentiate the > view-vs-copy situation and I struggled with it. In this case, it's the > fact that while self->nd > 1, the other dimensions are only of shape 1 > and so don't really matter. If you could come up with some kind of > striding check that would distinguish the two cases, I would appreciate it. I wrote, just for exercise I guess, a routine that checks to see if a copy is needed for a reshape (and if not, it computes the resulting strides) for an arbitrary array. (It currently only does a C-order reshape, but F-order would be an easy addition.) It does, however, successfully handle arbitrarily strided arrays with arbitrary arrangements of "1" dimensions having arbitrary strides. I think it also correctly handles 0 strides. I don't imagine you'd want to *use* it, but it does provide a complete solution to copy-less reshaping. I'd be happy to help with the C implementation built into numpy, but for the life of me I can't figure out how it works. A. M. Archibald P.S. I haven't written down a *proof* that this implementation never copies unnecessarily, but I'm pretty sure that it doesn't. It has a reshape_restricted that effectively does a ravel() first; this can't cope with things like length (5,2,3) -> length (5,3,2) unless the strides are conveniently equal, so there's a wrapper, reshape(), that breaks the reshape operation up into pieces that can be done by reshape_restricted. -AMA |
From: Travis O. <oli...@ie...> - 2006-10-18 02:14:15
|
Charles R Harris wrote: > > > On 10/17/06, *Lisandro Dalcin* <da...@gm... > <mailto:da...@gm...>> wrote: > > I was surprised by this > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > Out[14]: > array([[1, 5], > [4, 3], > [2, 6]]) > > > This one still looks wrong. Why does this look wrong. What do you want it to be? Perhaps you are thinking about the input array as C-order and the output array as Fortran order. That's not what reshape does. The order argument specifies how you think about both the input and output. Thus, reshape does the equivalent of a Fortran ravel to [1,4,2,5,3,6] and then a Fortran-order based fill of an empty (3,2) array: giving you the result. -Travis |
From: Charles R H. <cha...@gm...> - 2006-10-18 02:28:39
|
On 10/17/06, Travis Oliphant <oli...@ie...> wrote: > > Charles R Harris wrote: > > > > > > On 10/17/06, *Lisandro Dalcin* <da...@gm... > > <mailto:da...@gm...>> wrote: > > > > I was surprised by this > > > > In [14]: array([[1,2,3],[4,5,6]]).reshape((3,2),order='F') > > Out[14]: > > array([[1, 5], > > [4, 3], > > [2, 6]]) > > > > > > This one still looks wrong. > > Why does this look wrong. What do you want it to be? Perhaps you are > thinking about the input array as C-order and the output array as > Fortran order. That's not what reshape does. The order argument > specifies how you think about both the input and output. > > Thus, reshape does the equivalent of a Fortran ravel to [1,4,2,5,3,6] > and then a Fortran-order based fill of an empty (3,2) array: giving you > the result. Why a Fortran ravel? I am thinking of it as preserving the memory layout, just messing with how it is addressed. Chuck |
From: A. M. A. <per...@gm...> - 2006-10-18 03:25:35
|
On 17/10/06, Charles R Harris <cha...@gm...> wrote: > > > On 10/17/06, Travis Oliphant <oli...@ie...> wrote: > > Thus, reshape does the equivalent of a Fortran ravel to [1,4,2,5,3,6] > > and then a Fortran-order based fill of an empty (3,2) array: giving you > > the result. > > Why a Fortran ravel? I am thinking of it as preserving the memory layout, > just messing with how it is addressed. Because, to first order, the memory layout is totally irrelevant to a python user. array([[1,2,3],[4,5,6]],order='C') array([[1,2,3],[4,5,6]],order='F') should be nearly identical to the python user. If the storage order mattered, you'd have to know, every time you used reshape, what order your matrix was stored in (did it come from a transpose operation, for example?). As for why a Fortran ravel rather than a C ravel, that's a simplifying decision. If you like, you can think of reshape as happening in two steps: o1='C' o2='C' A.reshape((6,),order=o1).reshape((2,3),order=o2) The design could have allowed, as Travis Oliphant says, o1!=o2, but explaining that would have been quite difficult (even more than now, I mean). And in any case you can use the code above to achieve that, if you really want. A. M. Archibald |
From: Charles R H. <cha...@gm...> - 2006-10-18 04:26:19
|
On 10/17/06, A. M. Archibald <per...@gm...> wrote: > > On 17/10/06, Charles R Harris <cha...@gm...> wrote: > > > > > > On 10/17/06, Travis Oliphant <oli...@ie...> wrote: > > > > Thus, reshape does the equivalent of a Fortran ravel to [1,4,2,5,3,6] > > > and then a Fortran-order based fill of an empty (3,2) array: giving > you > > > the result. > > > > Why a Fortran ravel? I am thinking of it as preserving the memory > layout, > > just messing with how it is addressed. > > Because, to first order, the memory layout is totally irrelevant to a > python user. > > array([[1,2,3],[4,5,6]],order='C') > array([[1,2,3],[4,5,6]],order='F') > > should be nearly identical to the python user. So what is the point of having a fortran layout if things are not actually layed out in fortran order? As far as I can see, memory layout is the only reason for fortran ordering. Now I can sort of see where the fortran ravel comes from, because the result can be passed to a fortran routine. And it does look like a.ravel('F') is the same as a.T.ravel(). Hmmm. Now I wonder about this: In [62]: array([[1,2,3],[4,5,6]], dtype=int8, order='F').flags Out[62]: CONTIGUOUS : False FORTRAN : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False Now, either CONTIGUOUS is in error, because it really *is* fortran contiguous (but not c contiguous), or the array cannot be passed to a fortran routine that expects fortran ordering, or CONTIGUOUS refers to C addressing, which we already know is not contiguous, in which case we feel uncertain. Note that knowing the answer matters if I want to call a fortran routine with this by pulling out the data pointer. The fortran routine could be in a library, or maybe in the LaPack wrapper, but whatever, it hasn't been wrapped by f2py that takes care of such details. This also looks fishy: In [93]: asfortranarray(array([[1,2,3],[4,5,6]], dtype=int8)) Out[93]: array([[1, 2, 3], [4, 5, 6]], dtype=int8) In [96]: asfortranarray(array([[1,2,3],[4,5,6]], dtype=int8)).flags Out[96]: CONTIGUOUS : False FORTRAN : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False because the docstring says: asfortranarray(a, dtype=None) Return 'a' as an array laid out in Fortran-order in memory. Which doesn't seem to be the case here. I am beginning to wonder if we really need fortran order, seems that a few well chosen interface routines would fill the need and avoid much confusion. Chuck |
From: Charles R H. <cha...@gm...> - 2006-10-18 04:56:07
|
On 10/17/06, Charles R Harris <cha...@gm...> wrote: > > > > On 10/17/06, A. M. Archibald <per...@gm...> wrote: > > > > On 17/10/06, Charles R Harris <cha...@gm...> wrote: > > > > > > > > > On 10/17/06, Travis Oliphant <oli...@ie... > wrote: > > <snip> Which doesn't seem to be the case here. I am beginning to wonder if we > really need fortran order, seems that a few well chosen interface routines > would fill the need and avoid much confusion. > For instance, it would be nice if flatten took an order keyword: In [107]: array([[1,2,3],[4,5,6]], dtype=int8, order='F').flatten() Out[107]: array([1, 2, 3, 4, 5, 6], dtype=int8) Chuck |
From: Travis O. <oli...@ie...> - 2006-10-18 17:18:11
|
Charles R Harris wrote: > > > On 10/17/06, *Charles R Harris* <cha...@gm... > <mailto:cha...@gm...>> wrote: > > > > On 10/17/06, *A. M. Archibald* < per...@gm... > <mailto:per...@gm...>> wrote: > > On 17/10/06, Charles R Harris <cha...@gm... > <mailto:cha...@gm...>> wrote: > > > > > > On 10/17/06, Travis Oliphant < oli...@ie... > <mailto:oli...@ie...>> wrote: > > > <snip> > > Which doesn't seem to be the case here. I am beginning to wonder > if we really need fortran order, seems that a few well chosen > interface routines would fill the need and avoid much confusion. > > > For instance, it would be nice if flatten took an order keyword: > > In [107]: array([[1,2,3],[4,5,6]], dtype=int8, order='F').flatten() > Out[107]: array([1, 2, 3, 4, 5, 6], dtype=int8) It does take an argument (just not a keyword argument). The general rule I followed (probably inappropriately) was that single-argument methods didn't need keywords. so a.flatten('F') gives you a Fortran-order flattening. -Travis |
From: A. M. A. <per...@gm...> - 2006-10-18 14:30:10
|
On 18/10/06, Charles R Harris <cha...@gm...> wrote: > > > On 10/17/06, A. M. Archibald <per...@gm...> wrote: > > On 17/10/06, Charles R Harris <cha...@gm...> wrote: > > > > > > > > > On 10/17/06, Travis Oliphant <oli...@ie... > wrote: > > > > > > Thus, reshape does the equivalent of a Fortran ravel to [1,4,2,5,3,6] > > > > and then a Fortran-order based fill of an empty (3,2) array: giving > you > > > > the result. > > > > > > Why a Fortran ravel? I am thinking of it as preserving the memory > layout, > > > just messing with how it is addressed. > > > > Because, to first order, the memory layout is totally irrelevant to a > > python user. > > > > array([[1,2,3],[4,5,6]],order='C') > > array([[1,2,3],[4,5,6]],order='F') > > > > should be nearly identical to the python user. > > So what is the point of having a fortran layout if things are not actually > layed out in fortran order? As far as I can see, memory layout is the only > reason for fortran ordering. Now I can sort of see where the fortran ravel > comes from, because the result can be passed to a fortran routine. And it > does look like a.ravel('F') is the same as a.T.ravel(). Hmmm. Now I wonder > about this: Thins are laid out in Fortran order if you request Fortran order upon array creation. You just can't see it, normally. Numpy tries hard to make arrays of all different orders look identical. Of course, the reason I said "to first order" is that when determining when to copy and when not to, or when passing arrays to C or Fortran functions, sometimes it does matter. For example, a typical Fortran linear algebra routine needs its arrays to be in Fortran order in memory. > In [62]: array([[1,2,3],[4,5,6]], dtype=int8, order='F').flags > Out[62]: > CONTIGUOUS : False > FORTRAN : True > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False Unless you're working with C extensions, it doesn't make much sense to ever look at flags. The memory layout does not affect normal array behaviour (including reshape). A. M. Archibald |
From: Bill B. <wb...@gm...> - 2006-10-18 01:30:27
|
I think the answer to #3 is wrong. >From 1.0rc2 I get: >>> array([1,2,3,4,5,6],order='C').reshape((2,3),order='F') array([[1, 2, 3], [4, 5, 6]]) But the quiz wants me to answer something different. --bb |
From: Stefan v. d. W. <st...@su...> - 2006-10-18 01:43:01
|
On Wed, Oct 18, 2006 at 10:30:26AM +0900, Bill Baxter wrote: > I think the answer to #3 is wrong. >=20 > >From 1.0rc2 I get: > >>> array([1,2,3,4,5,6],order=3D'C').reshape((2,3),order=3D'F') > array([[1, 2, 3], > [4, 5, 6]]) >=20 > But the quiz wants me to answer something different. This recently changed. The quiz is correct for r3348. Cheers St=E9fan |
From: Travis O. <oli...@ie...> - 2006-10-18 01:52:11
|
Stefan van der Walt wrote: > Hi all, > > Some of you may have seen the interesting thread on Fortran-ordering > earlier. I thought it might be fun to set up a short quiz which tests > your knowledge on the topic. > > If you're up for the challenge, take a look at > > http://mentat.za.net/numpy/quiz > > I won't be held liable for any emotional trauma caused, self-inflicted > wounds or brain-core meltdowns. > Cute (especially the comment if you get them all right). I'm not sure if this quiz is a veiled complaint about the rules for Fortran ordering or not ;-) In my mind the Fortran ordering rules are consistent (if not completely bug-free). You just have to get the right idea of what is meant by the order argument when you use it. If you think you are having trouble figuring out the rules, think of the trouble it was to figure out what they should be and then to code them up. Two rules help you pass the quiz with a perfect score: 1) On array construction, the order argument allows you to specify how the array will be organized in memory. This has no effect on what is printed as the user doesn't usually care how the array is stored in memory. So, you can ignore all order= expressions in the array construct for the quiz 2) On reshaping, the order argument specifies how you think the array is organized. Whenever you make a significant reshape you are telling the computer to re-interpret the chunk of data in a different way, it makes a big difference as to how you think about that chunk of data. Do you think of it as organized rows-first (C-order) or columns-first (Fortran-order). The order argument allows you to specify how you think about it and indicates the 1-d indexing order of the array. It also fills in the newly-shaped array in exactly that same order. Semantically, one could technically separate those two concepts and have one order argument that specifies how you think about the input and another that specifies how you think about the output. But, I really didn't want to go there and couldn't see a real advantage to that. So, the single order argument specifies how you think about both. -Travis |
From: Travis O. <oli...@ie...> - 2006-10-18 17:35:05
|
> > Currently, the key operation is reshape, which only needs to return a > view in fortran order and doesn't even need to mark the resulting > array as fortran order because, well, because it works just fine in > numpy as is, it just isn't contiguous. If the other functions took > shape and order, reshape wouldn't even need the order keyword. The flag is the there as a quick check for interfacing. The order keyword grew because it was useful to avoid the arbitrariness of C-contiguous order for those who prefer to think of it differently. Remember the .T attribute for .transpose() was a recent addition and sticking .transpose() everywhere is a lot more ugly. But, yes, many uses of the order keyword could be replaced by preceding with .transpose() --- this is not without cost, however. > > I don't see why the array constructor needs the order keyword, it > doesn't *do* anything. For instance > > a = array([[1,2,3],[4,5,6]], order='F') > > doesn't produce a fortran contiguous array, it produces the same array > as the 'C' form, just sets the fortran flag and marks contiguous as > False. What is the use of that? It is just a generic non-contiguous > numpy array. What? You're not understanding something. The order flag definitely does something here. First of all it seems like you are not understanding the meaning of the CONTIGUOUS flag. CONTIGUOUS means "C-order contiguous" while FORTRAN means "FORTRAN-order contiguous". That's why I use the word single-segment to talk about FORTRAN-order or C-contiguous order. For Numeric, CONTIGUOUS always meant C-order contiguous and we are continuing that tradition. All we've done is notice that there is such a think as FORTRAN-order contiguous and copies do not need to be made in all circumstances when you have FORTRAN-order. Look at the difference between: a = array([[1,2,3],[4,5,6]],order='F').data[:] b = array([[1,2,3],[4,5,6]]).data[:] Notice the layout is definitely different between a and b. > And > > In [131]: ascontiguousarray(array([[1,2,3],[4,5,6]], dtype=int8, > order='F')).flags > Out[131]: > CONTIGUOUS : True > FORTRAN : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > Doesn't produce a fortran contiguous array, so what use was the flag? And Because you requested a C-contiguous array --- that's what contiguous means in NumPy (exactly what it meant in Numeric). > > In [141]: array([1,2,3,4,5,6], dtype=int8).reshape((2,3), > order='F').astype(int16).flags > Out[141]: > CONTIGUOUS : True > FORTRAN : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > reorders stuff in memory, so is a bug looking to happen in a fortran > interface. Yes, like I said before, all kinds of operations alter the "layout" of data. You can't assume all operations will preserve FORTRAN ordering. FORTRAN-order has meaning beyond how the data is actually set out in memory. Sometimes it indicates how you think it is layed out when you are doing re-shaping operations. > > mmapped files are the only thing I can think of where one might want > vary an operation depending on Fortran ordering because seeking out of > order is very expensive. But that means adapting algorithms depending > on order type, better I think to just stick to using the small strided > dimensions when appropriate. > > It would be helpful in debugging all this order stuff if it was clear > what was supposed to happen in every case. Ravel, for instance, > ignores the FORTRAN flag, again begging the question as to why we > *have* the flag. No it doesn't. Please show your evidence. Look: a = array([[1,2,3],[4,5,6]]) print a.ravel() [1 2 3 4 5 6] print a.ravel('F') [1 4 2 5 3 6] If it's not working in some cases, please report that as a bug. -Travis |
From: Tim H. <tim...@ie...> - 2006-10-18 17:58:33
|
Charles R Harris wrote: [SNIP] > > I'm not talking about the keyword in the ravel call, I'm talking about > the flag in a. The question is: do we *need* a fortran flag. I am > argueing not, because the only need is for fortran contiguous arrays > to pass to fortran function, or translation from fortran contiguous > arrays to numpy arrays. What I am saying is that things are > unnecessarily complicated. None of the LaPack stuff seems to use the > Fortran stuff, they just transpose and copy. I don't even think I want > to change that, because it is *clear* what is going on. Interfacing to > fortran is all about memory layout, nothing more or less. > Chuck, There are two things here. One is the order keyword and one is the FORTRAN flag. The latter is mainly an optimization for use at the C-level so that one doesn't have to check whether a given array is in contiguous FORTRAN order by examining the strides, in the same way that the CONTIGUOUS flag allows you to skip examining the strides when you need a contiguous C-order matrix. I believe it is the former that you are objecting to, but it would help if you could specify whether you are talking about the order keyword or whether you are talking about the FORTRAN flag. I'll also note that the order keyword could probably have been used to fix a performance problem someone was having a few weeks ago. We ended up transposing the data, but the individual felt that obscured the intent of the algorithm. I believe the same effect could probably have been been achieved without re jiggering the algorithm by using the order parameter. -tim |
From: Tim H. <tim...@ie...> - 2006-10-18 18:27:47
|
Charles R Harris wrote: > > > On 10/18/06, *Tim Hochberg* <tim...@ie... > <mailto:tim...@ie...>> wrote: > > Charles R Harris wrote: > > [SNIP] > > > > I'm not talking about the keyword in the ravel call, I'm talking > about > > the flag in a. The question is: do we *need* a fortran flag. I am > > argueing not, because the only need is for fortran contiguous > arrays > > to pass to fortran function, or translation from fortran contiguous > > arrays to numpy arrays. What I am saying is that things are > > unnecessarily complicated. None of the LaPack stuff seems to use > the > > Fortran stuff, they just transpose and copy. I don't even think > I want > > to change that, because it is *clear* what is going on. > Interfacing to > > fortran is all about memory layout, nothing more or less. > > > > Chuck, > > There are two things here. One is the order keyword and one is the > FORTRAN flag. The latter is mainly an optimization for use at the > C-level so that one doesn't have to check whether a given array is in > contiguous FORTRAN order by examining the strides, in the same way > that > the CONTIGUOUS flag allows you to skip examining the strides when you > need a contiguous C-order matrix. > > > That sounds like the two flags should be named f-contiguous and > c-contiguous. Then they would be orthogonal and one could have all > four combinations. Is that the case now? Perhaps I am misunderstanding > the meaning of the flags. That is the case now. The flag names simply mirror their values in C. Why they have those names in something of a historical accident I believe. Take a look at this: >>> array([1,2,3,4]).flags CONTIGUOUS : True FORTRAN : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> array([1,2,3,4])[::2].flags CONTIGUOUS : False FORTRAN : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> array([[1,2],[3,4]]).flags CONTIGUOUS : True FORTRAN : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> array([[1,2],[3,4]], order='F').flags CONTIGUOUS : False FORTRAN : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False See, all four combinations. I guess my previous post was wrong -- there really are four combinations not three like I said, but they are C-order, Fortran-order, Both and neither. I forgot what the flags signified and had to play with them for a bit to remember. > I believe it is the former that you > are objecting to, but it would help if you could specify whether > you are > talking about the order keyword or whether you are talking about the > FORTRAN flag. > > > Both. I was argueing against the FORTRAN flag, and of limiting the > order keyword to those cases where f or c contiguous arrays were the > output or input. > > I'll also note that the order keyword could probably have been used to > fix a performance problem someone was having a few weeks ago. We > ended > up transposing the data, but the individual felt that obscured the > intent of the algorithm. I believe the same effect could probably have > been been achieved without re jiggering the algorithm by using the > order > parameter. > > > Some more details would be helpful. It would be good to know what > problem the order keyword should solve. > Well, in general, memory layout can be important for performance not just for interfacing with Fortran. You can do this with suitable applications of transpose, but using the order flag is probably clearer. Particularly, if you are trying to match a textbook algorithm, its nice to have the axes in the same places. I'm just moving over from numarray which didn't have the equivalent of the order flag as far as I know, so I don't have experience with this at this point though. Here is one of the posts in questions: > David Cournapeau wrote: > Hi, > > I was wondering if there was any way to speed up the following code: > > y = N.zeros((n, K)) > for i in range(K): > y[:, i] = gauss_den(data, mu[i, :], va[i, :]) > > Where K is of order 1e1, n of order 1e5. Normally, gauss_den is a > quite expensive function, but the profiler tells me that the indexing > y[:,i] takes almost as much time as the gauss_den computation (which > computes n exp !). To see if the profiler is "right", i replaces with > the (non valid) following function: > > y = N.zeros((n, K)) > for i in range(K): > yt = gauss_den(data, mu[i, :], va[i, :]) > return y > > Where more than 99% of the code is spent inside gauss_den. > > I guess the problem is coming from the fact that y being C order, y[:, > i] needs accessing data in a non 'linear' way. Is there a way to speed > this up ? I did something like this: > > y = N.zeros((K, n)) > for i in range(K): > y[i] = gauss_den(data, mu[i, :], va[i, :]) > return y.T > > which works, but I don't like it very much. I believe that the same efficiency as the last could have been achieved using something like: y = N.zeros((n,K), order='F') for i in range(K): y[:,i] = gauss_den(data, mu[i, :], va[i, :]) return y This probably would have made the original poster happier. -tim |