From: Daniel M. <dm...@gm...> - 2006-10-08 05:36:20
|
Is there a 'loop free' way to do this in Numeric?

    for i in arange(l):
        a[b[i]] += c[i]

where l == len(b) == len(c).

Thanks,
Daniel |
From: Bill B. <wb...@gm...> - 2006-10-08 05:49:02
|
Yes, that'd be

    a[b] += c |
From: Daniel M. <dm...@gm...> - 2006-10-08 06:03:40
|
Thanks Bill. That's what I was hoping for, but I get

    >>> a
    array([0, 0])
    >>> b
    array([0, 1, 0, 1, 0])
    >>> c
    array([1, 1, 1, 1, 1])
    >>> a[b] += c
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    IndexError: invalid index

whereas I would like to get

    array([3, 2]) |
From: Bill B. <wb...@gm...> - 2006-10-08 06:17:31
|
Yeah, I spoke too soon. I tried a little example and it seemed to work. I don't get a traceback here, but your example doesn't work as expected either: I get [1, 1] as the answer with numpy 1.0rc1. Probably it should raise an exception, though. It seems to work if len(b) <= len(a) and no indices are repeated in b.

--bb |
From: Greg W. <gre...@gm...> - 2006-10-08 22:20:40
|
On 10/8/06, Daniel Mahler <dm...@gm...> wrote:
> >>> a
> array([0, 0])
> >>> b
> array([0, 1, 0, 1, 0])
> >>> c
> array([1, 1, 1, 1, 1])

Well, for this particular example you could do

    a = array([len(b) - sum(b), sum(b)])

since you are just counting the ones and zeros.

This next one is a little closer for the case when c is not just a bunch of 1's, but you still have to know the highest number in b:

    a = array([sum(c[b==0]), sum(c[b==1]), ... sum(c[b==N])])

So it sort of depends on your ultimate goal.

Greg
--
Linux. Because rebooting is for adding hardware. |
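Greg's second expression can be written out without the elided middle terms. The following is only a minimal sketch, assuming b holds non-negative integer class labels; the helper name `class_sums` is hypothetical and does not come from the thread.

```python
import numpy as np

def class_sums(b, c):
    """Sum the entries of c for each label 0..b.max() found in b (hypothetical helper)."""
    b = np.asarray(b)
    c = np.asarray(c)
    n = int(b.max()) + 1  # the highest label in b has to be known up front
    # One boolean selection per label, as in sum(c[b==0]), sum(c[b==1]), ...
    return np.array([c[b == k].sum() for k in range(n)])

b = np.array([0, 1, 0, 1, 0])
c = np.array([1, 1, 1, 1, 1])
totals = class_sums(b, c)
```

Each label costs one full pass over b, so this is only attractive when the number of distinct labels is small compared with len(b).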
From: Daniel M. <dm...@gm...> - 2006-10-09 05:23:19
|
On 10/8/06, Greg Willden <gre...@gm...> wrote:
> So it sort of depends on your ultimate goal.

In my case a, b, and c are all large, with b and c being orders of magnitude larger than a. b is known to contain only, but potentially any, indexes into a, repeated many times. c contains arbitrary floats. Essentially it is to compute class totals, as in

    total[class[i]] += value[i]

Daniel |
From: Robert K. <rob...@gm...> - 2006-10-08 22:59:32
|
Bill Baxter wrote:
> Yes, that'd be
> a[b] += c

No, I'm afraid that fancy indexing does not do the loop that you are thinking it would (and for reasons that we've discussed previously on this list, *can't* do that loop). That statement reduces to something like the following:

    tmp = a[b]
    tmp = tmp.__iadd__(c)
    a[b] = tmp

    In [1]: from numpy import *
    In [2]: a = array([0, 0])
    In [3]: b = array([0, 1, 0, 1, 0])
    In [4]: c = array([1, 1, 1, 1, 1])
    In [5]: a[b] += c
    In [6]: a
    Out[6]: array([1, 1])
    In [7]: a = array([0, 0])
    In [8]: tmp = a[b]
    In [9]: tmp
    Out[9]: array([0, 0, 0, 0, 0])
    In [10]: tmp = tmp.__iadd__(c)
    In [11]: tmp
    Out[11]: array([1, 1, 1, 1, 1])
    In [12]: a[b] = tmp
    In [13]: a
    Out[13]: array([1, 1])

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco |
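The gap between the fancy-indexed statement and the explicit loop can be checked side by side. This is a minimal sketch using the arrays from the thread; the variable names `buffered` and `looped` are illustrative only.

```python
import numpy as np

b = np.array([0, 1, 0, 1, 0])
c = np.array([1, 1, 1, 1, 1])

# Fancy-indexed in-place add: a[b] is gathered into a temporary, c is added
# to the temporary, and the result is scattered back, so a repeated index is
# simply overwritten instead of accumulated.
buffered = np.zeros(2, dtype=int)
buffered[b] += c

# The explicit loop from the original question accumulates every occurrence.
looped = np.zeros(2, dtype=int)
for i in range(len(b)):
    looped[b[i]] += c[i]
```

Here `buffered` ends up as [1, 1], matching the session above, while `looped` holds the wanted [3, 2].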
From: Bill B. <wb...@gm...> - 2006-10-08 23:07:06
|
So what's the answer then? Can it be made faster?

--bb |
From: A. M. A. <per...@gm...> - 2006-10-08 23:08:31
|
On 08/10/06, Robert Kern <rob...@gm...> wrote:
> No, I'm afraid that fancy indexing does not do the loop that you are thinking it
> would (and for reasons that we've discussed previously on this list, *can't* do
> that loop). That statement reduces to something like the following:

So the question remains: is there a for-loop-free way to do this? (This, specifically, is

    for i in range(len(b)):
        a[b[i]] += c[i]

where b may contain repetitions.)

I didn't find one, but came to the conclusion that for loops are not necessarily slower than fancy indexing, so the way to do this one is just to use a for loop.

A. M. Archibald |
From: Robert K. <rob...@gm...> - 2006-10-09 05:41:21
|
Daniel Mahler wrote:
> essentially it is to compute class totals
> as in total[class[i]] += value[i]

In that case, a slight modification to Greg's suggestion will probably be fastest:

    import numpy as np

    # Set up the problem.
    lena = 10
    lenc = 10000
    a = np.zeros(lena, dtype=float)
    b = np.random.randint(lena, size=lenc)
    c = np.random.uniform(size=lenc)

    # mask[i] is a boolean array marking the positions where b == i.
    idx = np.arange(lena, dtype=int)[:, np.newaxis]
    mask = (b == idx)

    # Sum the values of c that belong to each class.
    for i in range(lena):
        a[i] = c[mask[i]].sum()

--
Robert Kern |
From: A. M. A. <per...@gm...> - 2006-10-09 18:30:09
|
On 09/10/06, Robert Kern <rob...@gm...> wrote:
> In that case, a slight modification to Greg's suggestion will probably be fastest:

If a is even moderately large and you don't care what's left behind in b and c, you will probably accelerate the process by sorting b and c together (for cache coherency in a).

This seems like a rather common operation - I know I've needed it on at least two occasions - is it worth creating some sort of C implementation? What is the appropriate generalization?

A. M. Archibald |
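One way to read the sorting suggestion is sketched below. It is an interpretation, not something posted in the thread: it sorts copies of b and c rather than sorting in place, and it uses `np.add.reduceat` to sum each run of equal labels. The function name `sorted_class_totals` and its parameter `n` (the length of a) are hypothetical.

```python
import numpy as np

def sorted_class_totals(b, c, n):
    """Accumulate c into n class totals after sorting by label (hypothetical helper)."""
    b = np.asarray(b)
    c = np.asarray(c, dtype=float)
    totals = np.zeros(n, dtype=float)
    if b.size == 0:
        return totals
    order = np.argsort(b)            # sort b and c together
    bs, cs = b[order], c[order]
    # Start position of each run of equal labels in the sorted array.
    starts = np.flatnonzero(np.r_[True, bs[1:] != bs[:-1]])
    # reduceat sums each run; every label now appears once, so plain
    # fancy-index assignment is safe.
    totals[bs[starts]] = np.add.reduceat(cs, starts)
    return totals

result = sorted_class_totals([2, 0, 2, 1, 0], [1.0, 2.0, 3.0, 4.0, 5.0], 4)
```

Whether the sort pays for itself depends on the sizes involved, so it is worth timing against the plain loop before relying on it.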
From: Charles R H. <cha...@gm...> - 2006-10-09 18:57:47
|
On 10/9/06, A. M. Archibald <per...@gm...> wrote:
> This seems like a rather common operation - I know I've needed it on
> at least two occasions - is it worth creating some sort of C
> implementation? What is the appropriate generalization?

Some sort of indirect addressing infrastructure. But it looks like this could be tricky to make safe; it would need to do bounds checking at the least and would probably work best with a contiguous array as the target. I could see some sort of low-level function called argassign(target, indirect index, source) that could be used to build more complicated things in Python.

Chuck |
From: Johannes L. <a.u...@gm...> - 2006-10-10 08:03:41
|
Hi,

> > This seems like a rather common operation - I know I've needed it on
> > at least two occasions - is it worth creating some sort of C
> > implementation? What is the appropriate generalization?
>
> Some sort of indirect addressing infrastructure. But it looks like this
> could be tricky to make safe, it would need to do bounds checking at the
> least and would probably work best with a contiguous array as the target. I
> could see some sort of low-level function called argassign(target, indirect
> index, source) that could be used to build more complicated things in
> python.

This looks somewhat like the behaviour of the builtin map. One could do map(fn, index) with an appropriate fn. But IIRC this is not faster than a for loop if fn is not a builtin function.

An infrastructure like the one you imagine might use a similar syntax (with underlying C funcs). The main point is how to tell it which operation to perform (add, multiply, average, whatever). Implementing a bunch of functions add_argassign, ... whatever_argassign contradicts my understanding of "generalized". ;)

Maybe it would be simpler to just have functions which handle the index arrays in advance. An example will show it best:

    index = array([1, 2, 4, 2, 3, 1])   # 1 and 2 occur twice
    data = array([1, 1, 1, 1, 1, 1])
    newindex, newdata = filter_and_add(index, data)   # the kind of function I mean
    print newindex   --> array([1, 2, 4, 3])   # duplicates have been removed
    print newdata    --> array([2, 2, 1, 1])   # corresponding entries have been added
    a[newindex] += newdata

Johannes |
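A function of the kind Johannes describes could be sketched roughly as follows. `filter_and_add` is his hypothetical name, not an existing NumPy function; this version assumes 1-D inputs and leans on `np.unique` with `return_inverse=True` plus `np.bincount` with `weights`, so it returns the unique indices in sorted order rather than in the first-occurrence order of his example.

```python
import numpy as np

def filter_and_add(index, data):
    """Collapse repeated indices, summing their data entries (hypothetical helper)."""
    index = np.asarray(index)
    data = np.asarray(data)
    # inverse[i] is the position of index[i] within the sorted unique values.
    newindex, inverse = np.unique(index, return_inverse=True)
    # bincount with weights adds up the data entries that share a position.
    newdata = np.bincount(inverse, weights=data, minlength=len(newindex))
    # bincount returns floats; cast back to the input dtype for this sketch.
    return newindex, newdata.astype(data.dtype)

index = np.array([1, 2, 4, 2, 3, 1])
data = np.array([1, 1, 1, 1, 1, 1])
newindex, newdata = filter_and_add(index, data)

a = np.zeros(5, dtype=int)
a[newindex] += newdata  # safe now: newindex contains no repeats
```

This only covers addition; other reductions (multiply, average, and so on) would need their own combining step, which is the generalization problem raised above.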
From: A. M. A. <per...@gm...> - 2006-10-09 20:00:02
|
> > > > essentially it is to compute class totals
> > > > as in total[class[i]] += value[i]
>
> Some sort of indirect addressing infrastructure. But it looks like this
> could be tricky to make safe, it would need to do bounds checking at the
> least and would probably work best with a contiguous array as the target. I
> could see some sort of low-level function called argassign(target, indirect
> index, source) that could be used to build more complicated things in
> python.

If it were only assignment that was needed, fancy indexing could already handle it. The problem is that this is something that can't *quite* be done with the current fancy indexing infrastructure - every time an index comes up we want to add the value to what's there, rather than replacing it. I suppose histogram covers one major application; in fact if histogram allowed weightings ("count this point as -0.6") it would solve the OP's problem.

A. M. Archibald |
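Weighted counting of this kind is available in later NumPy releases through the `weights` argument of `np.bincount`. The sketch below assumes such a release (including the `minlength` argument) and assumes that b contains only valid, non-negative indexes into a; the sample values are made up for illustration.

```python
import numpy as np

a = np.zeros(2)
b = np.array([0, 1, 0, 1, 0])
c = np.array([0.5, -0.6, 1.0, 2.0, 0.25])

# Each b[i] selects a bin and c[i] is the weight added to that bin.
# minlength pads the result so it lines up with a even if the highest
# class never occurs in b.
a += np.bincount(b, weights=c, minlength=len(a))
```

If b could hold a value of len(a) or more, the result would be longer than a and the in-place add would fail, so a bounds check on b is still needed in real use.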
From: Charles R H. <cha...@gm...> - 2006-10-09 20:14:54
|
On 10/9/06, A. M. Archibald <per...@gm...> wrote:
> If it were only assignment that was needed, fancy indexing could
> already handle it. The problem is that this is something that can't
> *quite* be done with the current fancy indexing infrastructure - every
> time an index comes up we want to add the value to what's there,
> rather than replacing it.

Sure, just add functions arg_addassign, etc., which means dest[ind[i]] += src[i], just as arg_assign would mean dest[ind[i]] = src[i]. If you covered all the assign variants I think you could do most everything. Upper-level Python routines could deal with shaping and such while the lower-level routines dealt with flat, contiguous arrays.

Chuck |
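The two proposed primitives can be approximated in Python. `arg_addassign` and `arg_assign` are the hypothetical names from the message above, not NumPy functions. This sketch assumes a flat destination array and a NumPy release that provides the unbuffered `ufunc.at` method, which postdates this thread.

```python
import numpy as np

def arg_addassign(dest, ind, src):
    """dest[ind[i]] += src[i] for every i, accumulating repeated indices."""
    ind = np.asarray(ind)
    # Explicit bounds check: negative indices would otherwise wrap around.
    if ind.size and (ind.min() < 0 or ind.max() >= dest.shape[0]):
        raise IndexError("index out of bounds for dest")
    np.add.at(dest, ind, src)  # unbuffered in-place add

def arg_assign(dest, ind, src):
    """dest[ind[i]] = src[i]; plain fancy-index assignment already does this."""
    dest[ind] = src

total = np.zeros(2)
arg_addassign(total, [0, 1, 0, 1, 0], [1.0, 1.0, 1.0, 1.0, 1.0])
```

Other assign variants would follow the same pattern with a different ufunc, for example `np.multiply.at` for a multiplying version.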