From: Erin S. <eri...@gm...> - 2006-11-12 23:56:33
|
Hi all-

I want to take the result from a database query and create a numpy array
with field names and types corresponding to the returned columns.

The DBI 2.0 compliant interfaces return lists of lists. E.g.

[[94137100072000193L, 94, 345.57215100000002, -0.83673208099999996],
 [94137100072000368L, 94, 345.60217299999999, -0.83766954299999996],
 ....
 [94137100083000157L, 94, 347.21668099999999, -0.83572582399999995],
 [94137100084000045L, 94, 347.45524799999998, -0.829750074]]

But the only examples I have found for creating an inhomogeneous array
with fields involve lists of tuples, e.g.

>>> mydescriptor = {'names': ('gender','age','weight'),
...                 'formats': ('S1', 'f4', 'f4')}
>>> a = array([('M',64.0,75.0), ('F',25.0,60.0)], dtype=mydescriptor)

Trying something like this with a list of lists results in the following
error:

TypeError: expected a readable buffer object

Now I could create the array and run a loop, copying in, but this would
be less efficient. Is there a way to do this in one step?

Thanks,
Erin
|
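A minimal sketch of the two cases described above, using the descriptor
from the message (the error text is as reported for the numpy of that
era and may differ in later versions):

    import numpy

    mydescriptor = {'names': ('gender', 'age', 'weight'),
                    'formats': ('S1', 'f4', 'f4')}

    # A list of tuples works: each tuple is read as one record.
    a = numpy.array([('M', 64.0, 75.0), ('F', 25.0, 60.0)],
                    dtype=mydescriptor)
    print a['age']    # [ 64.  25.]

    # A list of lists does not; numpy will not treat an inner list as
    # a record, so the next line raises the TypeError quoted above:
    # b = numpy.array([['M', 64.0, 75.0], ['F', 25.0, 60.0]],
    #                 dtype=mydescriptor)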
From: Erin S. <eri...@gm...> - 2006-11-13 00:09:27
|
I have to amend my statement a bit: DBI 2.0 actually returns lists of
tuples, which would work. It appears to just be pgdb, the postgres
interface, that is returning lists of lists. Still, I need to interact
with this database.

Erin

On Sun, Nov 12, 2006 at 06:56:29PM -0500, Erin Sheldon wrote:
> Hi all-
>
> I want to take the result from a database query and create a numpy
> array with field names and types corresponding to the returned
> columns.
>
> The DBI 2.0 compliant interfaces return lists of lists. E.g.
>
> [[94137100072000193L, 94, 345.57215100000002, -0.83673208099999996],
>  [94137100072000368L, 94, 345.60217299999999, -0.83766954299999996],
>  ....
>  [94137100083000157L, 94, 347.21668099999999, -0.83572582399999995],
>  [94137100084000045L, 94, 347.45524799999998, -0.829750074]]
>
> But the only examples I have found for creating an inhomogeneous
> array with fields involve lists of tuples, e.g.
>
> >>> mydescriptor = {'names': ('gender','age','weight'),
> ...                 'formats': ('S1', 'f4', 'f4')}
> >>> a = array([('M',64.0,75.0), ('F',25.0,60.0)], dtype=mydescriptor)
>
> Trying something like this with a list of lists results in the
> following error:
>
> TypeError: expected a readable buffer object
>
> Now I could create the array and run a loop, copying in, but this
> would be less efficient. Is there a way to do this in one step?
>
> Thanks,
> Erin
|
From: Pierre GM <pgm...@gm...> - 2006-11-13 00:15:02
|
You could try the fromarrays function of numpy.core.records:

>>> mydescriptor = {'names': ('a','b','c','d'),
...                 'formats': ('f4', 'f4', 'f4', 'f4')}
>>> a = N.core.records.fromarrays(N.transpose(yourlist), dtype=mydescriptor)

The 'transpose' function ensures that 'fromarrays' sees 4 arrays (one
for each column).
|
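A self-contained sketch of this suggestion (the data values are
illustrative; note the precision caveat raised later in the thread,
since transpose() first coerces everything to a common floating type):

    import numpy as N

    yourlist = [[1.0, 2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0, 8.0]]

    mydescriptor = {'names': ('a', 'b', 'c', 'd'),
                    'formats': ('f4', 'f4', 'f4', 'f4')}

    # N.transpose turns the 2 rows into 4 column arrays, one per field.
    a = N.core.records.fromarrays(N.transpose(yourlist),
                                  dtype=mydescriptor)
    print a['a']    # [ 1.  5.]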
From: Erin S. <eri...@gm...> - 2006-11-13 00:50:42
|
On 11/12/06, Pierre GM <pgm...@gm...> wrote:
>
> You could try the fromarrays function of numpy.core.records:
>
> >>> mydescriptor = {'names': ('a','b','c','d'),
> ...                 'formats': ('f4', 'f4', 'f4', 'f4')}
> >>> a = N.core.records.fromarrays(N.transpose(yourlist), dtype=mydescriptor)
>
> The 'transpose' function ensures that 'fromarrays' sees 4 arrays (one
> for each column).

That worked as advertised. Thanks.

Erin
|
From: Erin S. <eri...@gm...> - 2006-11-13 01:10:18
|
On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> On 11/12/06, Pierre GM <pgm...@gm...> wrote:
> >
> > You could try the fromarrays function of numpy.core.records:
> >
> > >>> mydescriptor = {'names': ('a','b','c','d'),
> > ...                 'formats': ('f4', 'f4', 'f4', 'f4')}
> > >>> a = N.core.records.fromarrays(N.transpose(yourlist), dtype=mydescriptor)
> >
> > The 'transpose' function ensures that 'fromarrays' sees 4 arrays
> > (one for each column).

Actually, there is a problem with that approach. It first converts the
entire array to a single type, by default a floating type. For very
large integers this precision is insufficient. For example, I have the
following integer in my arrays:

94137100072000193L

which ends up as

94137100072000192

after going to a float and then back to an integer.

Erin
|
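The loss is easy to verify directly: a 64-bit float carries a 53-bit
significand, so integers above 2**53 cannot all be represented exactly,
and 94137100072000193 is roughly ten times larger than that limit:

    >>> 2**53
    9007199254740992
    >>> x = 94137100072000193L    # > 2**53, not exactly representable
    >>> long(float(x))
    94137100072000192L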
From: Tim H. <tim...@ie...> - 2006-11-13 03:08:06
|
Erin Sheldon wrote:
> On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
>
>> On 11/12/06, Pierre GM <pgm...@gm...> wrote:
>>
>>> You could try the fromarrays function of numpy.core.records:
>>>
>>> >>> mydescriptor = {'names': ('a','b','c','d'),
>>> ...                 'formats': ('f4', 'f4', 'f4', 'f4')}
>>> >>> a = N.core.records.fromarrays(N.transpose(yourlist), dtype=mydescriptor)
>>>
>>> The 'transpose' function ensures that 'fromarrays' sees 4 arrays
>>> (one for each column).
>
> Actually, there is a problem with that approach. It first converts
> the entire array to a single type, by default a floating type. For
> very large integers this precision is insufficient. For example, I
> have the following integer in my arrays:
> 94137100072000193L
> which ends up as
> 94137100072000192
> after going to a float and then back to an integer.

I haven't been following this too closely, but if you need to transpose
your data without converting all to one type, I can think of a couple of
different approaches:

1. zip(*yourlist)
2. numpy.transpose(numpy.array(yourlist, dtype=object))

I haven't tested them though (particularly the second one), so caveat
emptor, etc, etc.

-tim
|
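A quick sketch of these two suggestions (explicitly untested in the
message above, so treat this as illustrative). Both transpose rows into
columns while leaving every element as its original Python object, so
the long integers survive intact:

    import numpy

    lol = [[94137100072000193L, 94, 345.572151],
           [94137100072000368L, 94, 345.602173]]

    # 1. Pure-python transpose: each column comes back as a tuple.
    cols = zip(*lol)
    print cols[0]    # (94137100072000193L, 94137100072000368L)

    # 2. Object-array transpose: the same idea via numpy, avoiding any
    #    coercion to a common numeric type by using dtype=object.
    obj = numpy.transpose(numpy.array(lol, dtype=object))
    print obj[0]     # first column, still Python longs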
From: Erin S. <eri...@gm...> - 2006-11-13 03:17:23
|
On 11/12/06, Tim Hochberg <tim...@ie...> wrote:
> I haven't been following this too closely, but if you need to transpose
> your data without converting all to one type, I can think of a couple
> of different approaches:
>
> 1. zip(*yourlist)
> 2. numpy.transpose(numpy.array(yourlist, dtype=object))
>
> I haven't tested them though (particularly the second one), so caveat
> emptor, etc, etc.

It's not that I want to transpose data.

I'm trying to convert the output of a pgdb postgres query into an array
with fields and types corresponding to the columns I have selected. The
problem is that pgdb does not return a list of tuples as it should
according to DBI 2.0, but instead a list of lists. So
numpy.array(lol, dtype=) fails, and so will your solution #2.

I don't want to copy the data more than once obviously, so I'm looking
for a way to call array() with a list of lists.

Erin
|
From: Tim H. <tim...@ie...> - 2006-11-13 04:01:42
|
Erin Sheldon wrote:
> On 11/12/06, Tim Hochberg <tim...@ie...> wrote:
>
>> I haven't been following this too closely, but if you need to
>> transpose your data without converting all to one type, I can think
>> of a couple of different approaches:
>>
>> 1. zip(*yourlist)
>> 2. numpy.transpose(numpy.array(yourlist, dtype=object))
>>
>> I haven't tested them though (particularly the second one), so caveat
>> emptor, etc, etc.
>
> Its not that I want to transpose data.
>
> I'm trying to convert the output of a pgdb postgres query into
> an array with fields and types corresponding to the columns
> I have selected. The problem is pgdb does not return a list
> of tuples as it should according to DBI 2.0, but instead
> a list of lists. So numpy.array(lol, dtype=) fails, and so will your
> solution #2.

In that case, I suggest just using a list comprehension or map:

    [tuple(x) for x in lol]

for example.

> I don't want to copy the data more than once obviously, so I'm looking
> for a way to call array() with a list of lists.

It's probably pointless to worry about this. You are already allocating
5*N python objects (all those Python floats and integers as well as the
lists themselves). I believe the list comprehension above is only going
to allocate an additional N objects (the new tuples). Admittedly, the
objects aren't all the same size, but in this case they are close enough
that I doubt it'll matter.

-tim
|
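A runnable sketch of the list-comprehension route on data shaped like
the original query results (the field names and formats here are
illustrative, not from the thread):

    import numpy

    lol = [[94137100072000193L, 94, 345.572151, -0.836732081],
           [94137100072000368L, 94, 345.602173, -0.837669543]]

    dtype = [('objid', 'i8'), ('run', 'i4'), ('ra', 'f8'), ('dec', 'f8')]

    # Each inner list becomes a tuple, which numpy reads as one record;
    # the 'i8' field keeps full 64-bit precision since nothing passes
    # through a float.
    a = numpy.array([tuple(row) for row in lol], dtype=dtype)
    print a['objid'][0]    # 94137100072000193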
From: A. M. A. <per...@gm...> - 2006-11-13 03:18:44
|
On 12/11/06, Erin Sheldon <eri...@gm...> wrote:
> Actually, there is a problem with that approach. It first converts
> the entire array to a single type, by default a floating type. For
> very large integers this precision is insufficient. For example, I
> have the following integer in my arrays:
> 94137100072000193L
> which ends up as
> 94137100072000192
> after going to a float and then back to an integer.

That's an unfortunate limitation of numpy; it views double-precision
floats as higher precision than 64-bit integers, but of course they
aren't. If you want to put all your data in a record array, you could
try transposing the lists using a list comprehension - numpy is not
always as much faster than pure python as it looks. You could then
convert that to a list of four arrays and do the assignment as
appropriate.

Alternatively, you could convert your array into a higher-precision
floating-point format (if one is available on your machine) before
transposing and storing in a record array.

A. M. Archibald
|
From: Pierre GM <pgm...@gm...> - 2006-11-13 04:20:51
|
On Sunday 12 November 2006 20:10, Erin Sheldon wrote:
> Actually, there is a problem with that approach. It first converts
> the entire array to a single type, by default a floating type.

As A.M. Archibald suggested, you can use a list comprehension:

N.array([(a,b,c,d) for (a,b,c,d) in yourlist], dtype=yourdesc)

or

N.fromiter(((a,b,c,d) for (a,b,c,d) in yourlist), dtype=yourdesc)

Would you mind trying that, and let us know which one works best? That
could be put on the wiki somewhere...
|
From: Robert K. <rob...@gm...> - 2006-11-13 04:28:26
|
Pierre GM wrote:
> On Sunday 12 November 2006 20:10, Erin Sheldon wrote:
>> Actually, there is a problem with that approach. It first converts
>> the entire array to a single type, by default a floating type.
>
> As A.M. Archibald suggested, you can use a list comprehension:
>
> N.array([(a,b,c,d) for (a,b,c,d) in yourlist], dtype=yourdesc)
>
> or
>
> N.fromiter(((a,b,c,d) for (a,b,c,d) in yourlist), dtype=yourdesc)
>
> Would you mind trying that, and let us know which one works best? That
> could be put on the wiki somewhere...

N.array(map(tuple, yourlist), dtype=yourdesc)

is probably the best option.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
|
From: Erin S. <eri...@gm...> - 2006-11-13 05:56:31
|
Hi all -

Thanks to everyone for the suggestions. I think map(tuple, list) is
probably the most compact, but the list comprehension also works well.

Because map() is probably going to disappear someday, I'll stick with
the list comprehension:

array([tuple(row) for row in result], dtype=dtype)

That said, is there some compelling reason that the array function
doesn't support this operation?

Thanks again,
Erin

On 11/12/06, Robert Kern <rob...@gm...> wrote:
> Pierre GM wrote:
> > On Sunday 12 November 2006 20:10, Erin Sheldon wrote:
> >> Actually, there is a problem with that approach. It first converts
> >> the entire array to a single type, by default a floating type.
> >
> > As A.M. Archibald suggested, you can use a list comprehension:
> >
> > N.array([(a,b,c,d) for (a,b,c,d) in yourlist], dtype=yourdesc)
> >
> > or
> >
> > N.fromiter(((a,b,c,d) for (a,b,c,d) in yourlist), dtype=yourdesc)
> >
> > Would you mind trying that, and let us know which one works best?
> > That could be put on the wiki somewhere...
>
> N.array(map(tuple, yourlist), dtype=yourdesc)
>
> is probably the best option.
|
From: Charles R H. <cha...@gm...> - 2006-11-13 06:18:26
|
On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
>
> Hi all -
>
> Thanks to everyone for the suggestions. I think map(tuple, list) is
> probably the most compact, but the list comprehension also works well.
>
> Because map() is probably going to disappear someday, I'll stick with
> the list comprehension:
>
> array([tuple(row) for row in result], dtype=dtype)
>
> That said, is there some compelling reason that the array function
> doesn't support this operation?

My understanding is that the array needs to be allocated up front. Since
the list comprehension is iterative, it is impossible to know how big
the result is going to be.

BTW, it might be possible to use fromfile('name', dtype=dtype) to do
what you want if the data is stored by rows in a file.

Chuck
|
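A sketch of the fromfile() idea, assuming the rows have been stored as
packed binary records (fromfile with a structured dtype reads raw bytes;
it would not parse a text dump). The round trip and file name below are
illustrative:

    import numpy

    dtype = [('objid', 'i8'), ('run', 'i4'), ('ra', 'f8'), ('dec', 'f8')]

    a = numpy.array([(94137100072000193L, 94, 345.572151, -0.836732081)],
                    dtype=dtype)
    a.tofile('rows.bin')          # write packed binary records

    # Read back: one record per dtype.itemsize bytes.
    b = numpy.fromfile('rows.bin', dtype=dtype)
    print b['objid']              # [94137100072000193]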
From: Erin S. <eri...@gm...> - 2006-11-13 07:07:32
|
On 11/13/06, Charles R Harris <cha...@gm...> wrote:
>
> On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > Hi all -
> >
> > Thanks to everyone for the suggestions. I think map(tuple, list) is
> > probably the most compact, but the list comprehension also works
> > well.
> >
> > Because map() is probably going to disappear someday, I'll stick
> > with the list comprehension:
> >
> > array([tuple(row) for row in result], dtype=dtype)
> >
> > That said, is there some compelling reason that the array function
> > doesn't support this operation?
>
> My understanding is that the array needs to be allocated up front.
> Since the list comprehension is iterative, it is impossible to know
> how big the result is going to be.

Isn't it the same with a list of tuples? But you can send that directly
to the array constructor. I don't see the fundamental difference, except
that the code might be simpler to write.

> BTW, it might be possible to use fromfile('name', dtype=dtype) to do
> what you want if the data is stored by rows in a file.

I'm reading from a database.

Erin
|
From: Francesc A. <fa...@ca...> - 2006-11-13 08:19:24
|
On Mon, 13 Nov 2006 at 02:07 -0500, Erin Sheldon wrote:
> On 11/13/06, Charles R Harris <cha...@gm...> wrote:
> >
> > On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > > Hi all -
> > >
> > > Thanks to everyone for the suggestions. I think map(tuple, list)
> > > is probably the most compact, but the list comprehension also
> > > works well.
> > >
> > > Because map() is probably going to disappear someday, I'll stick
> > > with the list comprehension:
> > >
> > > array([tuple(row) for row in result], dtype=dtype)
> > >
> > > That said, is there some compelling reason that the array function
> > > doesn't support this operation?
> >
> > My understanding is that the array needs to be allocated up front.
> > Since the list comprehension is iterative, it is impossible to know
> > how big the result is going to be.
>
> Isn't it the same with a list of tuples? But you can send that
> directly to the array constructor. I don't see the fundamental
> difference, except that the code might be simpler to write.

I think that the correct explanation is that Travis has chosen a tuple
as the way to refer to an inhomogeneous list of values (a record) and a
list as the way to refer to a homogeneous list of values. I'm not
completely sure why he did this, but I guess the reason was to be able
to distinguish the records in scenarios where nested records do appear.

In any case, you can also use rec.fromrecords to build recarrays from
lists of lists. This breaks the aforementioned rule, but Travis allowed
this because rec.* had to mimic numarray behaviour as much as possible.
Here is an example of use:

In [46]: mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}
In [47]: results = [['M',64.0,75.0], ['F',25.0,60.0]]
In [48]: a = numpy.rec.fromrecords(results, dtype=mydescriptor)
In [49]: b = numpy.array([tuple(row) for row in results], dtype=mydescriptor)
In [50]: a == b
Out[50]: recarray([True, True], dtype=bool)

OTOH, the docs say that fromrecords is discouraged because it is
somewhat slow, but apparently it has performance similar to using a
list comprehension:

In [51]: Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}").repeat(3,10)
Out[51]: [0.44204592704772949, 0.43584394454956055, 0.50145101547241211]

In [52]: Timer("numpy.array([tuple(row) for row in results], dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}").repeat(3,10)
Out[52]: [0.49885106086730957, 0.4325258731842041, 0.43297886848449707]

HTH,

--
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works,
www.carabos.com   |  I haven't tested it.  -- Donald Knuth
|
From: Tim H. <tim...@ie...> - 2006-11-13 15:03:51
|
Francesc Altet wrote:
> On Mon, 13 Nov 2006 at 02:07 -0500, Erin Sheldon wrote:
>
>> On 11/13/06, Charles R Harris <cha...@gm...> wrote:
>>
>>> On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
>>>
>>>> Hi all -
>>>>
>>>> Thanks to everyone for the suggestions. I think map(tuple, list)
>>>> is probably the most compact, but the list comprehension also
>>>> works well.
>>>>
>>>> Because map() is probably going to disappear someday, I'll stick
>>>> with the list comprehension:
>>>>
>>>> array([tuple(row) for row in result], dtype=dtype)
>>>>
>>>> That said, is there some compelling reason that the array function
>>>> doesn't support this operation?
>>>>
>>> My understanding is that the array needs to be allocated up front.
>>> Since the list comprehension is iterative, it is impossible to know
>>> how big the result is going to be.
>>>
>> Isn't it the same with a list of tuples? But you can send that
>> directly to the array constructor. I don't see the fundamental
>> difference, except that the code might be simpler to write.
>>
> I think that the correct explanation is that Travis has chosen a tuple
> as the way to refer to an inhomogeneous list of values (a record) and
> a list as the way to refer to a homogeneous list of values.

Just for the record, this is the officially blessed usage of tuples and
lists for all of Python (by Guido himself). On the other hand, it's
honored more in the breach than in the observance, since other factors
tend to dominate in practice: mutability/immutability, or the mistaken
belief that using tuples everywhere will make code noticeably faster or
more memory frugal.

> I'm not completely sure why he did this, but I guess the reason was to
> be able to distinguish the records in scenarios where nested records
> do appear.

I suspect that this could be made a little more forgiving without losing
rigor, as long as none of the fields are objects, in which case nearly
all bets are off. Then again, the rule that tuples designate records is
a lot simpler than something like: tuples designate records, but you can
use lists too, unless you have an object field in your array, in which
case you really need to use tuples, except sometimes lists will work
anyway, depending on where the object field is. So maybe it's best just
to keep it strict.

> In any case, you can also use rec.fromrecords to build recarrays from
> lists of lists. This breaks the aforementioned rule, but Travis
> allowed this because rec.* had to mimic numarray behaviour as much as
> possible. Here is an example of use:
> [SNIP]

Just for completeness, I benchmarked the fromiter and map(tuple, results)
solutions as well. Map is fastest, followed by fromiter, list
comprehension and then fromrecords. The differences are pretty minor
however, so I'd stick with whatever seems clearest.
-tim

from timeit import Timer

print Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)",
            """import numpy; results = [['M',64.0,75.0]]*100000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4','f4')}""").repeat(3,10)

print Timer("numpy.array([tuple(row) for row in results], dtype=mydescriptor)",
            """import numpy; results = [['M',64.0,75.0]]*100000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4','f4')}""").repeat(3,10)

print Timer("numpy.fromiter((tuple(x) for x in results), dtype=mydescriptor, count=len(results))",
            """import numpy; results = [['M',64.0,75.0]]*100000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4','f4')}""").repeat(3,10)

print Timer("numpy.array(map(tuple, results), dtype=mydescriptor)",
            """import numpy; results = [['M',64.0,75.0]]*100000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4','f4')}""").repeat(3,10)

===>

[1.3928521641717035, 1.3892659541925021, 1.3949996438094785]
[1.344854164425926, 1.3157404083479882, 1.3207066819944986]
[1.2768430065832401, 1.2742884919731416, 1.2736657871321633]
[1.2081393026208644, 1.2025276955590734, 1.205871416618594]
|
From: Erin S. <eri...@gm...> - 2006-11-13 15:10:47
|
On 11/13/06, Francesc Altet <fa...@ca...> wrote:
> In any case, you can also use rec.fromrecords to build recarrays from
> lists of lists. This breaks the aforementioned rule, but Travis
> allowed this because rec.* had to mimic numarray behaviour as much as
> possible. Here is an example of use:
>
> In [46]: mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}
> In [47]: results = [['M',64.0,75.0], ['F',25.0,60.0]]
> In [48]: a = numpy.rec.fromrecords(results, dtype=mydescriptor)
> In [49]: b = numpy.array([tuple(row) for row in results], dtype=mydescriptor)
> In [50]: a == b
> Out[50]: recarray([True, True], dtype=bool)
>
> OTOH, the docs say that fromrecords is discouraged because it is
> somewhat slow, but apparently it has performance similar to using a
> list comprehension:
>
> In [51]: Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}").repeat(3,10)
> Out[51]: [0.44204592704772949, 0.43584394454956055, 0.50145101547241211]
>
> In [52]: Timer("numpy.array([tuple(row) for row in results], dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}").repeat(3,10)
> Out[52]: [0.49885106086730957, 0.4325258731842041, 0.43297886848449707]

I checked the code. For lists of lists it just creates the recarray and
runs a loop, copying in the data row by row. The fact that they are of
similar speed is actually good news, because the list comprehension was
making an extra copy of the data in memory. For large memory usage,
which is my case, this 50% overhead would have been an issue.

Erin
|
From: Erin S. <eri...@gm...> - 2006-11-13 19:43:28
|
On 11/13/06, Tim Hochberg <tim...@ie...> wrote:
> Here's one more approach that's marginally faster than the map based
> solution and also won't chew up extra memory, since it's based on
> fromiter:
>
> numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
>                count=len(results))

Yes, this is what I need. BTW, there is no doc string for this. I just
added an example to the Numpy Example List.

Thanks,
Erin
|
From: Tim H. <tim...@ie...> - 2006-11-13 21:29:36
|
Erin Sheldon wrote:
> On 11/13/06, Tim Hochberg <tim...@ie...> wrote:
>
>> Here's one more approach that's marginally faster than the map based
>> solution and also won't chew up extra memory, since it's based on
>> fromiter:
>>
>> numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
>>                count=len(results))
>
> Yes, this is what I need. BTW, there is no doc string for this.

Yeah, I noticed that too. I swear I wrote one at one point; I'm not sure
what happened to it. Sigh.

> I just added an example to the Numpy Example List.

Great.

-tim
|
From: Stefan v. d. W. <st...@su...> - 2006-11-13 21:50:38
|
On Mon, Nov 13, 2006 at 02:29:11PM -0700, Tim Hochberg wrote:
> Erin Sheldon wrote:
> > On 11/13/06, Tim Hochberg <tim...@ie...> wrote:
> >
> >> Here's one more approach that's marginally faster than the map
> >> based solution and also won't chew up extra memory, since it's
> >> based on fromiter:
> >>
> >> numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
> >>                count=len(results))
> >
> > Yes, this is what I need. BTW, there is no doc string for this.
>
> Yeah, I noticed that too. I swear I wrote one at one point; I'm not
> sure what happened to it. Sigh.

A typo slipped into add_newdocs.py. Fixed in SVN.

Cheers
Stéfan
|
From: Charles R H. <cha...@gm...> - 2006-11-13 01:15:13
|
On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
>
> On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > On 11/12/06, Pierre GM <pgm...@gm...> wrote:
> > >
> > > You could try the fromarrays function of numpy.core.records:
> > >
> > > >>> mydescriptor = {'names': ('a','b','c','d'),
> > > ...                 'formats': ('f4', 'f4', 'f4', 'f4')}
> > > >>> a = N.core.records.fromarrays(N.transpose(yourlist), dtype=mydescriptor)
> > >
> > > The 'transpose' function ensures that 'fromarrays' sees 4 arrays
> > > (one for each column).
>
> Actually, there is a problem with that approach. It first converts
> the entire array to a single type, by default a floating type. For
> very large integers this precision is insufficient. For example, I
> have the following integer in my arrays:
> 94137100072000193L
> which ends up as
> 94137100072000192
> after going to a float and then back to an integer.

Out of curiosity, where does that large integer come from?

Chuck
|
From: Erin S. <eri...@gm...> - 2006-11-13 01:20:36
|
On 11/12/06, Charles R Harris <cha...@gm...> wrote:
> > 94137100072000193L
> > which ends up as
> > 94137100072000192
> > after going to a float and then back to an integer.
>
> Out of curiosity, where does that large integer come from?

It is a unique object identifier. It is a combination of various numbers
related to an astronomical object detected by the Sloan Digital Sky
Survey (www.sdss.org).

cheers,
Erin
|
From: Tim H. <tim...@ie...> - 2006-11-13 17:46:24
|
Tim Hochberg wrote:
> [SNIP]
>
> Just for completeness, I benchmarked the fromiter and map(tuple,
> results) solutions as well. Map is fastest, followed by fromiter,
> list comprehension and then fromrecords. The differences are pretty
> minor however, so I'd stick with whatever seems clearest.
>
> -tim

Here's one more approach that's marginally faster than the map based
solution and also won't chew up extra memory, since it's based on
fromiter:

numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
               count=len(results))

[SNIP]

-tim
|
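A self-contained sketch of this final approach. itertools.imap is
Python 2 (on Python 3 the built-in map is already lazy and serves the
same purpose), and the count argument lets fromiter preallocate the
output, so no intermediate list of tuples is ever built. The descriptor
below is illustrative:

    import itertools
    import numpy

    results = [[94137100072000193L, 94, 345.572151, -0.836732081],
               [94137100072000368L, 94, 345.602173, -0.837669543]]

    mydescriptor = [('objid', 'i8'), ('run', 'i4'),
                    ('ra', 'f8'), ('dec', 'f8')]

    # Rows are converted lazily to tuples and consumed straight into
    # the preallocated record array.
    a = numpy.fromiter(itertools.imap(tuple, results),
                       dtype=mydescriptor, count=len(results))
    print a['objid']    # [94137100072000193 94137100072000368]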