From: Charles R H. <cha...@gm...> - 2006-08-29 23:26:36
On 8/29/06, Keith Goodman <kwg...@gm...> wrote:
>
> On 8/29/06, Mathew Yeates <my...@jp...> wrote:
> >
> > I have an M by N array of floats. Associated with the columns are
> > character labels
> > ['a','b','b','c','d','e','e','e'] note: already sorted so duplicates
> > are contiguous
> >
> > I want to replace the 2 'b' columns with the sum of the 2 columns.
> > Similarly, replace the 3 'e' columns with the sum of the 3 'e' columns.
>
> Make a cumsum of the array. Find the index of the last 'a', last 'b',
> etc and make the reduced array from that. Then take the diff of the
> columns.
>
> I know that's vague, but so is my understanding of python/numpy.
>
> Or even more vague: make a function that does what you want.

Or you could use searchsorted on the labels to get a sequence of ranges.
What you have is a sort of binning applied to columns instead of values
in a vector. Or, if the overhead isn't too much, use a dictionary with
(key: array) entries. Loop through the columns adding keys: when the key
is new, insert a copy of the column; when it is already present, add the
new column to the old one.

Chuck
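
For concreteness, here is a rough sketch of the three suggestions in this thread (cumsum plus diff, searchsorted ranges, and the dictionary). It is only one way to read them, not anyone's actual code. The function names, the 2 by 8 test array, and the use of np.unique to get the distinct labels are all assumptions made for illustration, and the `prepend` argument to np.diff needs a reasonably recent NumPy.

```python
import numpy as np

# Illustrative data (an assumption, not from the thread): 2 rows, 8 labelled columns.
labels = np.array(['a', 'b', 'b', 'c', 'd', 'e', 'e', 'e'])  # sorted, duplicates contiguous
data = np.arange(16, dtype=float).reshape(2, 8)


def sum_by_cumsum_diff(data, labels):
    """Cumsum along columns, keep the last column of each label, then diff."""
    uniq = np.unique(labels)
    # Index of the last column carrying each label (labels must be sorted).
    last = np.searchsorted(labels, uniq, side='right') - 1
    csum = data.cumsum(axis=1)
    # prepend=0 so the first group keeps its own running total.
    return np.diff(csum[:, last], axis=1, prepend=0)


def sum_by_ranges(data, labels):
    """Use searchsorted to get [start, end) column ranges, then sum each range."""
    uniq = np.unique(labels)
    starts = np.searchsorted(labels, uniq, side='left')
    ends = np.searchsorted(labels, uniq, side='right')
    return np.column_stack([data[:, s:e].sum(axis=1) for s, e in zip(starts, ends)])


def sum_by_dict(data, labels):
    """Accumulate columns in a dict keyed by label (labels need not be sorted)."""
    sums = {}
    for j, key in enumerate(labels):
        if key not in sums:
            sums[key] = data[:, j].copy()   # new key: store a copy of the column
        else:
            sums[key] += data[:, j]         # existing key: add this column in place
    # Dicts keep insertion order, so columns come out in first-seen label order.
    return np.column_stack(list(sums.values()))


print(sum_by_cumsum_diff(data, labels))
print(sum_by_ranges(data, labels))
print(sum_by_dict(data, labels))
```

All three should give the same M by 5 array here. The first two rely on the labels being sorted; the dictionary version does not, at the cost of a Python-level loop over the columns.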