[Numpy-discussion] filter a recarray


hi, i have a recarray of > 60K records and i'm wondering if there's a
numpy/vectorized way to do the following.

get a new array with one row per unique (column0, column1) pair, where the
row kept for each pair is the one with the highest value in the last
column. so in the paste below, there are 4
'AT1G01070', 'AT1G11450' rows, but i only want to keep the row:
('AT1G01070', 'AT1G11450', 78.660003662109375, 717, 140, 5, 1129L, 1838L,
1098L, 1808L, 1e-168, 592.0)

because 592.0 is the highest value.  i can do this using a hash and looping,
but i'm guessing there's a more efficient way. any tips?
thanks.
-brent

[  ('AT1G01030', 'AT1G13260', 70.339996337890625, 263, 75, 1, 835L, 1094L,
778L, 1040L, 5.9999999999999998e-38, 158.0)
 ('AT1G01040', 'AT1G01040', 100.0, 8019, 0, 0, 1L, 8019L, 1L, 8019L, 0.0,
12620.0)
 ('AT1G01050', 'AT1G01050', 100.0, 1968, 0, 0, 1L, 1968L, 1L, 1968L, 0.0,
3090.0)
 ('AT1G01060', 'AT1G01060', 100.0, 4175, 0, 0, 1L, 4175L, 1L, 4175L, 0.0,
6355.0)
 ('AT1G01070', 'AT1G01070', 100.0, 2192, 0, 0, 1L, 2192L, 1L, 2192L, 0.0,
3426.0)
 ('AT1G01070', 'AT1G11450', 78.660003662109375, 717, 140, 5, 1129L, 1838L,
1098L, 1808L, 1e-168, 592.0)
 ('AT1G01070', 'AT1G11450', 79.870002746582031, 303, 51, 2, 1L, 293L, 1L,
303L, 1.9999999999999999e-67 , 256.0)
 ('AT1G01070', 'AT1G11450', 88.400001525878906, 181, 21, 0, 587L, 767L,
724L, 904L, 6e-57, 221.0)
 ('AT1G01070', 'AT1G11450', 83.209999084472656, 131, 19, 1, 1875L, 2002L,
1878L, 2008L, 1.9999999999999999e-28 , 126.0)
 ('AT1G01070', 'AT1G11460', 82.919998168945312, 480, 77, 3, 1129L, 1607L,
1216L, 1691L, 6.9999999999999999e-132, 470.0)]
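
One possible vectorized approach, sketched here under assumptions rather than given as a definitive answer: sort the records by the two key columns and then by descending score using np.lexsort, and keep the first row of each run of identical keys. The field names 'query', 'subject' and 'score' and the helper name best_per_pair are hypothetical (the original post does not show the recarray's dtype), and the sample rows are cut down to just those three fields from the paste above.

```python
import numpy as np

def best_per_pair(rec, key0, key1, score):
    """Keep, for each unique (key0, key1) pair, the row with the highest score.

    Field names are passed in because the real dtype is not known here.
    """
    # np.lexsort treats the LAST key as the primary sort key, so this sorts
    # by key0, then key1, then by score in descending order (negated).
    order = np.lexsort((-rec[score], rec[key1], rec[key0]))
    s = rec[order]
    # A row starts a new group when either key differs from the previous row.
    first = np.ones(len(s), dtype=bool)
    first[1:] = (s[key0][1:] != s[key0][:-1]) | (s[key1][1:] != s[key1][:-1])
    # The first row of each group is the highest-scoring one.
    return s[first]

# Reduced sample: only the two key columns and the last column of the paste.
rows = [
    ('AT1G01070', 'AT1G01070', 3426.0),
    ('AT1G01070', 'AT1G11450', 592.0),
    ('AT1G01070', 'AT1G11450', 256.0),
    ('AT1G01070', 'AT1G11450', 221.0),
    ('AT1G01070', 'AT1G11450', 126.0),
    ('AT1G01070', 'AT1G11460', 470.0),
]
dtype = [('query', 'U9'), ('subject', 'U9'), ('score', 'f8')]  # assumed names
rec = np.array(rows, dtype=dtype).view(np.recarray)

best = best_per_pair(rec, 'query', 'subject', 'score')
print(best)
```

Note that the result comes back sorted by the key columns rather than in the original row order; if the original order matters, the selected positions order[first] could be sorted and used to index the input instead.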
