From: Todd M. <jm...@st...> - 2003-02-18 12:27:17
Francesc Alted wrote:

>Hi,
>
>I'm trying to map the Numeric character typecode ('c') to chararrays, but I have
>a problem distinguishing between
>
>In [109]: chararray.array("qqqq")
>Out[109]: CharArray(['qqqq'])
>
>and
>
>In [110]: chararray.array(["qqqq"])  # Note the extra "[" "]"
>Out[110]: CharArray(['qqqq'])        # The same result as 109

The chararray API pre-dates our awareness, ultimate implementation, and final rejection of rank-0 arrays. In retrospect, your usage above makes sense. Whether we change things now or not is another matter. You are giving me interface angst... :)

You can create rank-0 arrays by specifying shape=() and itemsize=len(buffer). However, these do not repr correctly (unless you update from CVS).

>while in Numeric we have:
>
>In [113]: Numeric.array("qqqq")
>Out[113]: array([q, q, q, q],'c')
>
>In [114]: Numeric.array(["qqqq"])
>Out[114]: array([ [q, q, q, q]],'c')  # Differs from 113
>
>even for numarray objects, rank-0 seems to work well:
>
>In [107]: numarray.array(1)
>Out[107]: array(1)
>
>In [108]: numarray.array([1])
>Out[108]: array([1])  # Objects differ

This was not always so, but we made it work when we thought rank-0 had something to offer. After some discussion on the numpy-discussion list, rank-0 went out of vogue.

>So, it seems that chararray does not support rank-0 objects well.

That is true. CharArray never caught up because rank-0 became vestigial even for NumArray.

>Is this the expected behavior?

Yes. But rank-0 support for chararray is not far off, with the possible exception of breaking the public interface.

>If yes, we have no possibility to distinguish between object 109 and 110,
>and I'd like to distinguish between these two.

Why exactly do you need rank-0?

>What can be done to achieve this?

1. Add a little special casing to chararray._charArrayToStringList() to handle rank-0. I did this already in CVS.

2. Debate whether or not to change chararray.array() to work as you've shown above. Proceed from there.

>Thanks,
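For readers following along today: the rank-0 distinction Francesc asks about is exactly how modern NumPy (the eventual successor to Numeric and numarray, and an assumption here since it postdates this thread) resolves the ambiguity for string data — a bare string produces a 0-d array, while a one-element list produces a 1-d array. A minimal sketch:

```python
import numpy as np

# A bare string becomes a rank-0 (0-dimensional) array...
scalar = np.array("qqqq")
print(scalar.shape)   # ()

# ...while a one-element list becomes a rank-1 array,
# so the two inputs remain distinguishable by shape.
vector = np.array(["qqqq"])
print(vector.shape)   # (1,)
```

The shapes `()` and `(1,)` give precisely the in/out distinction that cases 109 and 110 fail to make in chararray.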
From: Tim H. <tim...@ie...> - 2003-02-17 21:05:15
The good news is Psymeric now supports complex numbers (Complex32 and Complex64) and inplace addition. Also, by doing some tuning, I got the overhead of Psymeric down to less than three times that of Numeric (versus 20 times in the version of Numarray that I have). Even without Psyco, the code only has overhead of five and a half times that of Numeric, so it seems that the Numarray folks should at least be able to get down to that level without throwing everything into C.

I have not been able to increase the asymptotic speed and I think I'm probably stuck on that front for the time being. For the most part Psymeric is close to Numeric for large arrays, which makes it about 50% faster than numarray for noncontiguous arrays and half as fast for contiguous arrays. These timings are for Float64: for Int8, Psymeric is ~3x slower than Numeric, for Int16 it's 50% slower, and for Int32 2x slower. Psymeric is very slow for Float32 and Complex32 (~10x slower than Numeric) because of some Psyco issues with array.arrays and floats, which I expect will be fixed at some point. And finally, for Complex64, Psymeric is comparable to Numeric for addition and subtraction, but almost half as fast for multiplication and almost a third as fast for division.

Barring some further improvements in Psyco or some new insights on my part, this is probably as far as I'll go with this. At this point, it would probably not be hard to make this into a work-alike for Numeric or Numarray (excluding the various extension modules: FFT and the like). The one relatively hard part still outstanding is ufunc.accumulate/reduce. However, the performance, while very impressive for an essentially pure Python solution, is not good enough to motivate me to use this in preference to Numeric.

If anyone is interested in looking at the code, I'd be happy to send it to them.

Regards,

-tim
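The per-call overhead versus asymptotic (per-element) speed split that Tim keeps separate can be estimated with a simple linear fit: time one operation at several array sizes, then read the intercept as per-call overhead and the slope as per-element cost. A hedged sketch using modern NumPy — the function names, sizes, and the `a + b` operation are illustrative, not taken from Psymeric:

```python
import timeit
import numpy as np

def time_add(n, repeats=200):
    """Best-of-five average wall time for one n-element addition, in seconds."""
    a, b = np.ones(n), np.ones(n)
    return min(timeit.repeat(lambda: a + b, number=repeats, repeat=5)) / repeats

sizes = [10, 100, 1_000, 10_000, 100_000]
times = [time_add(n) for n in sizes]

# Least-squares fit: t(n) ~= overhead + per_element * n
per_element, overhead = np.polyfit(sizes, times, 1)
print(f"per-call overhead: {overhead * 1e6:.2f} us")
print(f"per-element cost:  {per_element * 1e9:.3f} ns")
```

The intercept is the quantity the thread calls "overhead" and the slope is the "time per element"; comparing two array packages on the same fit is what produces tables like the one quoted later in this thread.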
From: Francesc A. <fa...@op...> - 2003-02-17 18:40:45
Hi,

I'm trying to map the Numeric character typecode ('c') to chararrays, but I have a problem distinguishing between

In [109]: chararray.array("qqqq")
Out[109]: CharArray(['qqqq'])

and

In [110]: chararray.array(["qqqq"])  # Note the extra "[" "]"
Out[110]: CharArray(['qqqq'])        # The same result as 109

while in Numeric we have:

In [113]: Numeric.array("qqqq")
Out[113]: array([q, q, q, q],'c')

In [114]: Numeric.array(["qqqq"])
Out[114]: array([ [q, q, q, q]],'c')  # Differs from 113

even for numarray objects, rank-0 seems to work well:

In [107]: numarray.array(1)
Out[107]: array(1)

In [108]: numarray.array([1])
Out[108]: array([1])  # Objects differ

So, it seems that chararray does not support rank-0 objects well. Is this the expected behavior? If yes, we have no possibility to distinguish between object 109 and 110, and I'd like to distinguish between these two. What can be done to achieve this?

Thanks,

--
Francesc Alted
From: MRS L E. <ms...@re...> - 2003-02-14 11:01:16
Dear sir,

My name is LOUISA C. ESTRADA, the wife of Mr. JOSEPH ESTRADA, the former President of the Philippines, located in South East Asia. My husband was recently impeached from office by a backed uprising of mass demonstrators and the Senate. My husband is presently in jail and facing trial on charges of corruption, embezzlement, and the mysterious charge of plunder, which might lead to a death sentence. The present government is forcing my husband out of Manila to avoid demonstrations by his supporters.

During my husband's regime as president of the Philippines, I realized some reasonable amount of money from various deals that I successfully executed. I have plans to invest this money for my children's future in real estate and industrial production. My husband is not aware of this because I wish to do it secretly for now. Before my husband was impeached, I secretly siphoned the sum of $30,000,000 USD (thirty million United States dollars) out of the Philippines and deposited the money with a security firm that transports valuable goods and consignments through diplomatic means. I also declared that the consignment was solid gold and that my foreign business partner owned it.

I am contacting you because I want you to go to the security company and claim the money on my behalf, since I have declared that the consignment belongs to my foreign business partner. You shall also be required to assist me in investment in your country. I hope to trust you as a God-fearing person who will not sit on this money when you claim it, but rather assist me properly. I expect you to declare what percentage of the total money you will take for your assistance. When I receive your positive response I will let you know where the security company is and the payment pin code to claim the money, which is very important. For now, let all our communication be by e-mail because my lines are right now connected to the Philippines Telecommunication Network services. Please also send me your telephone and fax number. I will ask my son to contact you to give you more details after I have received a response from you.

Thank you and God bless you and your family.

MRS LOUISA C. ESTRADA
From: Paul D. <pa...@pf...> - 2003-02-14 03:36:43
PEP-242 should be closed. The kinds module will not be added to the standard library.

There was no opposition to the proposal but only mild interest in using it, not enough to justify adding the module to the standard library. Instead, it will be made available as a separate distribution item at the Numerical Python site. At the next release of Numerical Python, it will no longer be a part of the Numeric distribution.
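The kind of precision introspection PEP-242's kinds module aimed at — picking the smallest type that guarantees a requested range — survives today in NumPy's `iinfo`/`finfo` machinery. A hedged sketch of the idea; this is a rough analogue written by the editor, not the kinds API itself:

```python
import numpy as np

def int_type_for(max_value):
    """Return the smallest signed integer dtype whose range covers max_value
    (a rough analogue of what PEP-242 called an integer 'kind' lookup)."""
    for dt in (np.int8, np.int16, np.int32, np.int64):
        if np.iinfo(dt).max >= max_value:
            return dt
    raise OverflowError("no integer type is wide enough")

print(int_type_for(100))      # int8 suffices (max 127)
print(int_type_for(100_000))  # needs at least int32
```

The design point of both APIs is the same: code states its numeric requirements and the library, not the programmer, selects a concrete type.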
From: Guido v. R. <gu...@py...> - 2003-02-13 21:14:31
The Python 11 Conference is being held July 7-11 in Portland, Oregon as part of OSCON 2003. http://conferences.oreillynet.com/os2003/

The deadline for proposals is February 15th! You only need to have your proposal in this week; you don't need to worry about trying to put together the complete presentation or tutorial materials at this time.

Proposal submissions page: http://conferences.oreillynet.com/cs/os2003/create/e_sess

Few proposals have been submitted so far, and we need many more to have a successful Python 11 conference. If you have submitted a proposal for one of the other Python conferences this year, such as PyCon, I encourage you to go ahead and submit the proposal to Python 11 as well. If you are presenting at the Python UK Conference or EuroPython but are unable to attend Python 11, you should consider having another team member do the presentation.

The theme of OSCON 2003 is "Embracing and Extending Proprietary Software". Papers and presentations on how to successfully transition away from proprietary software would also be good, but it is not necessary for your proposal to cover the theme; proposals just need to be related to Python.

COMPENSATION: Free registration for speakers (except lightning talks). Tutorial speakers also get: $500 honorarium; $50 per diem on day of tutorial; 1 night hotel; airfare.

O'REILLY ANNOUNCEMENT:

2003 O'Reilly Open Source Convention Call For Participation
Embracing and Extending Proprietary Software
http://conferences.oreilly.com/oscon/

O'Reilly & Associates invites programmers, developers, strategists, and technical staff to submit proposals to lead tutorial and conference sessions at the 2003 Open Source Software Convention, slated for July 7-11 in Portland, OR. Proposals are due February 15, 2003. For more information please visit our OSCON website: http://conferences.oreilly.com/oscon/

The theme this year is "Embracing and Extending Proprietary Software." Few companies use only one vendor's software on desktops, back office, and servers. Variety in operating systems and applications is becoming the norm, for sound financial and technical reasons. With variety comes the need for open, unencumbered standards for data exchange and service interoperability. You can address the theme from any angle you like--for example, you might talk about migrating away from commercial software such as Microsoft Windows, or instead place your emphasis on coexistence.

Convention Conferences: Perl Conference 7; The Python 11 Conference; PHP Conference 3
Convention Tracks: Apache; XML Applications; MySQL and PostgreSQL; Ruby

--Guido van Rossum (home page: http://www.python.org/~guido/)
From: Francesc A. <fa...@op...> - 2003-02-12 13:03:17
Hi,

Some days ago I also did some benchmarks on this issue, and I think it could be good to share my results.

I'm basically reproducing Tim's figures, although with an even bigger difference in favour of Numeric for small matrices (2x2). The benchmarks are made with a combination of Python and Pyrex (in order to test also some functions in the Numeric and numarray C APIs). The figures I'm getting are, roughly:

Matrix multiplication:

  In Python:
    matrixmultiply (double(2,2) x double(2,)) in Numeric:    70 us
    matrixmultiply (double(2,2) x double(2,)) in numarray: 4800 us

  In Pyrex:
    numarray multiply in Pyrex, using NA_InputArray:          620 us
    numarray multiply in Pyrex, using PyObject_AsWriteBuffer: 146 us

zeros:

  In Python:
    double(2,) in Numeric:    58 us
    double(2,) in numarray: 3100 us

  In Pyrex (using PyArray_FromDims):
    double(2,) with Numeric:   26 us
    double(2,) with numarray: 730 us

As you can see, in pure Python, numarray has a factor of 50 (for zeros) and up to 70 (for matrix multiply) times more overhead than Numeric. Increasing the matrix to 200x20, the overhead difference falls to a factor of 16 (for matrix multiply) and 50 (for zeros), always in favor of Numeric.

With Pyrex (i.e. making the C calls), the differences are not so big, but there is still a difference. In particular, when assuming a contiguous matrix and calling PyObject_AsWriteBuffer directly upon the object._data memory buffer, the factor falls to 2. Increasing the matrix to 200x20, the overhead for zeros (using PyArray_FromDims) is the same for numarray as for Numeric (around 700 us), while multiply in Pyrex can't beat the matrixmultiply in Numeric (Numeric is still 2 times faster).

Hope that helps. I can also send you my testbeds if you are interested.

--
Francesc Alted
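For reference, the operation being timed — matrixmultiply(double(2,2), double(2,)) — is the textbook matrix-vector product. A small sketch in modern NumPy (an assumption; the thread used Numeric/numarray) confirming that the explicit double loop and the library call compute the same thing, which is what makes the overhead comparison fair:

```python
import numpy as np

def mat_vec_loops(a, b):
    """Pure-Python matrix-vector multiply, the baseline for overhead comparisons."""
    rows, cols = len(a), len(a[0])
    c = [0.0] * rows
    for i in range(rows):
        for j in range(cols):
            c[i] += a[i][j] * b[j]
    return c

a = np.arange(4.0).reshape(2, 2)   # [[0, 1], [2, 3]]
b = np.array([1.0, 2.0])

assert np.allclose(mat_vec_loops(a, b), a @ b)
```

On a 2x2 problem nearly all the measured time is call overhead, not arithmetic — which is why the small-matrix figures above diverge so sharply while the 200x20 figures converge.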
From: Karthikesh R. <ka...@ja...> - 2003-02-12 09:31:04
Hi All,

I was using view.py, which comes with NumPy. Somehow view.py slows down the whole ipython shell, and most of the time the program crashes, dumping core. My Numeric version is 21.3.

Is there some new version of view.py? Is there something else needed? Or rather, is there some better viewer for viewing images in Python (specifically for image-processing work)?

Best regards,
karthik

-----------------------------------------------------------------------
Karthikesh Raju, email: ka...@ja...
Researcher, http://www.cis.hut.fi/karthik
Helsinki University of Technology, Tel: +358-9-451 5389
Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277
Department of Computer Sc.,
P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND
-----------------------------------------------------------------------
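For anyone reaching this archive with the same need today, the usual answer has become matplotlib, whose `imshow` displays 2-D arrays directly. A minimal sketch; the Agg backend, the random stand-in image, and the output filename are all editor's assumptions, not anything from the thread:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")          # non-interactive backend; renders without a display
import matplotlib.pyplot as plt

img = np.random.rand(64, 64)   # stand-in for an image-processing result
plt.imshow(img, cmap="gray")
plt.colorbar()
plt.savefig("view.png")
```

In an interactive shell, `plt.show()` with a GUI backend replaces the `savefig` call.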
From: Francesc A. <fa...@op...> - 2003-02-12 09:24:33
Hi,

Here is Greg's reply to my questions. It seems that Pyrex is not going to change on these two issues. Well, at least he considered the first one to be an "interesting" idea.

Cheers,

---------- Forwarded message ----------
Subject: Re: A couple of questions on Pyrex
Date: Wed, 12 Feb 2003 19:23:01 +1300 (NZDT)
From: Greg Ewing <gr...@co...>
To: fa...@op...

> numbuf = data[2:30:4][1]
>
> in order to get a copy (in a new memory location) of the memory buffer in
> the selected slice to work with it. Would that be interesting to
> implement?

It's an interesting idea, but I think it's getting somewhat beyond the scope of Pyrex. I don't think I'll be trying to implement anything like that in the foreseeable future. The Pyrex compiler is complicated enough already, and I don't want to add anything more that isn't really necessary.

> Is (or will be) there any way in Pyrex to automagically create different
> flavors of this function to deal with different datatypes?

Same here, and even more so -- I'm *definitely* not going to re-implement C++ templates! :-)

Greg Ewing, Computer Science Dept,  | A citizen of NewZealandCorp, a
University of Canterbury,           | wholly-owned subsidiary of USA Inc.
Christchurch, New Zealand
gr...@co...

--
Francesc Alted
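The first request — an independent copy of a strided slice — is trivial at the Python level in modern NumPy, which may be part of why Greg saw it as out of scope for the Pyrex compiler itself. A sketch (NumPy here is an assumption; the thread concerned Numeric/numarray buffers):

```python
import numpy as np

data = np.arange(40)
view = data[2:30:4]         # a strided view: shares memory with data
copy = data[2:30:4].copy()  # an independent buffer, safe to mutate

view[0] = -1                # writing the view writes through to data...
assert data[2] == -1
copy[0] = -99               # ...but writing the copy does not
assert data[2] == -1
```

The distinction between view and copy is exactly the "new memory location" Francesc was asking Pyrex to synthesize.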
From: Tim H. <tim...@ie...> - 2003-02-11 21:04:09
Perry Greenfield wrote:

>Tim Hochberg writes:
>
>>            Overhead (c)  Overhead (nc)  TimePerElement (c)  TimePerElement (nc)
>>  NumPy     10 us         10 us          85 ps               95 ps
>>  NumArray  200 us        530 us         45 ps               135 ps
>>  Psymeric  50 us         65 us          80 ps               80 ps
>>
>>The times shown above are for Float64s and are pretty approximate, and
>>they happen to be a particularly favorable array shape for Psymeric. I
>>have seen Psymeric as much as 50% slower than NumPy for large arrays of
>>certain shapes.
>>
>>The overhead for NumArray is surprisingly large. After doing this
>>experiment I'm certainly more sympathetic to Konrad wanting less
>>overhead for NumArray before he adopts it.
>
>Wow! Do you really mean picoseconds? I never suspected that
>either Numeric or numarray were that fast. ;-)

My bad, I meant ns. What's a little factor of 10^3 among friends.

>Anyway, this issue is timely [Err...]. As it turns out we started
>looking at ways of improving small array performance a couple weeks
>ago and are coming closer to trying out an approach that should
>reduce the overhead significantly.
>
>But I have some questions about your benchmarks. Could you show me
>the code that is used to generate the above timings? In particular
>I'm interested in the kinds of arrays that are being operated on.
>It turns out that the numarray overhead depends on more than
>just contiguity and it isn't obvious to me which case you are testing.

I'll send you Psymeric, including all the tests, by private email to avoid cluttering up the list. (Don't worry, it's not huge -- only 750 lines of Python at this point.) You can let me know if you find any horrible issues with it.

>For example, Todd's benchmarks indicate that numarray's overhead is
>about a factor of 5 larger than numpy when the input arrays are
>contiguous and of the same type. On the other hand, if the array
>is not contiguous or requires a type conversion, the overhead is
>much larger. (Also, these cases require blocking loops over large
>arrays; we have done nothing yet to optimize the block size or
>the speed of that loop.) If you are doing the benchmark on
>contiguous, same type arrays, I'd like to get a copy of the benchmark
>program to try to see where the disagreement arises.

Basically, I'm operating on two random, contiguous, 3x3, Float64 arrays. In the noncontiguous case the arrays are indexed using [::2,::2] and [1::2,::2], so these arrays are 2x2 and 1x2. Hmmm, that wasn't intentional; I'm measuring axis stretching as well. However, using [::2,::2] for both axes doesn't change things a whole lot. The core timing part looks like this:

    t0 = clock()
    if op == '+':
        c = a + b
    elif op == '-':
        c = a - b
    elif op == '*':
        c = a * b
    elif op == '/':
        c = a / b
    elif op == '==':
        c = a == b
    else:
        raise ValueError("unknown op %s" % op)
    t1 = clock()

This is done N times, the first M values are thrown away, and the remaining values are averaged. Currently N is 3 and M is 1, so not a lot of averaging is taking place.

>The very preliminary indications are that we should be able to make
>numarray overheads approximately 3 times higher for all ufunc cases.
>That's still slower, but not by a factor of 20 as shown above. How
>much work it would take to reduce it further is unclear (the main
>bottleneck at that point appears to be how long it takes to create
>new output arrays).

That's good. I think it's important to get people like Konrad on board and that will require dropping the overhead.

>We are still mainly in the analysis and design phase of how to
>improve performance for small arrays and block looping. We believe
>that this first step will not require moving very much of the
>existing Python code into C (but some will be). Hopefully we
>will have some working code in a couple weeks.

I hope it goes well.

-tim
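Tim's timing core runs essentially unchanged today, with `time.perf_counter` standing in for his `clock` and an operator table replacing the if/elif chain. A hedged sketch: the function name, N/M defaults, and the 3x3 test arrays follow his description, but everything else is the editor's arrangement:

```python
import operator
from time import perf_counter
import numpy as np

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
       '/': operator.truediv, '==': operator.eq}

def time_op(a, b, op, n=3, warmup=1):
    """Average wall time of `a <op> b`, discarding the first `warmup` runs."""
    if op not in OPS:
        raise ValueError("unknown op %s" % op)
    samples = []
    for _ in range(n):
        t0 = perf_counter()
        c = OPS[op](a, b)
        samples.append(perf_counter() - t0)
    samples = samples[warmup:]
    return sum(samples) / len(samples)

a, b = np.random.rand(3, 3), np.random.rand(3, 3)
print("add:", time_op(a, b, '+'))
```

With n=3 and warmup=1, only two samples are averaged — which is Tim's own caveat that "not a lot of averaging is taking place."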
From: Perry G. <pe...@st...> - 2003-02-11 20:15:22
Tim Hochberg writes:

>            Overhead (c)  Overhead (nc)  TimePerElement (c)  TimePerElement (nc)
>  NumPy     10 us         10 us          85 ps               95 ps
>  NumArray  200 us        530 us         45 ps               135 ps
>  Psymeric  50 us         65 us          80 ps               80 ps
>
> The times shown above are for Float64s and are pretty approximate, and
> they happen to be a particularly favorable array shape for Psymeric. I
> have seen Psymeric as much as 50% slower than NumPy for large arrays of
> certain shapes.
>
> The overhead for NumArray is surprisingly large. After doing this
> experiment I'm certainly more sympathetic to Konrad wanting less
> overhead for NumArray before he adopts it.

Wow! Do you really mean picoseconds? I never suspected that either Numeric or numarray were that fast. ;-)

Anyway, this issue is timely [Err...]. As it turns out, we started looking at ways of improving small array performance a couple weeks ago and are coming closer to trying out an approach that should reduce the overhead significantly.

But I have some questions about your benchmarks. Could you show me the code that is used to generate the above timings? In particular I'm interested in the kinds of arrays that are being operated on. It turns out that the numarray overhead depends on more than just contiguity and it isn't obvious to me which case you are testing.

For example, Todd's benchmarks indicate that numarray's overhead is about a factor of 5 larger than numpy when the input arrays are contiguous and of the same type. On the other hand, if the array is not contiguous or requires a type conversion, the overhead is much larger. (Also, these cases require blocking loops over large arrays; we have done nothing yet to optimize the block size or the speed of that loop.) If you are doing the benchmark on contiguous, same type arrays, I'd like to get a copy of the benchmark program to try to see where the disagreement arises.

The very preliminary indications are that we should be able to make numarray overheads approximately 3 times higher for all ufunc cases. That's still slower, but not by a factor of 20 as shown above. How much work it would take to reduce it further is unclear (the main bottleneck at that point appears to be how long it takes to create new output arrays).

We are still mainly in the analysis and design phase of how to improve performance for small arrays and block looping. We believe that this first step will not require moving very much of the existing Python code into C (but some will be). Hopefully we will have some working code in a couple weeks.

Thanks,
Perry
From: Francesc A. <fa...@op...> - 2003-02-11 19:33:46
On Tuesday 11 February 2003 18:54, Chris Barker wrote:
> >
> > def multMatVec(object a, object b, object c):
> >     cdef PyArrayObject carra, carrb, carrc
> >     cdef double *da, *db, *dc
> >     cdef int i, j
> >
> >     carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
> >     carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
> >     carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
> >     da = <double *>carra.data
> >     db = <double *>carrb.data
> >     dc = <double *>carrc.data
> >     dim1 = carra.dimensions[0]
> >     dim2 = carra.dimensions[1]
> >     for i from 0 <= i < dim1:
> >         dc[i] = 0.
> >         for j from 0 <= j < dim2:
> >             dc[i] = dc[i] + da[i*dim2+j] * db[j]
> >
> >     return carrc
> >
> > For me Pyrex is like having Python but with the speed of C. This is why
> > I'm so enthusiastic with it.
>
> That actually looks more like C than Python to me. As soon as I am doing
> pointer arithmetic, I don't feel like I'm writing Python. Would it be all
> that much more code in C?

Doing that in C implies writing the "glue" code. In the previous example, multMatVec is a function *directly* accessible from Python, without any additional declaration.

Moreover, you can do in Pyrex the same things you do in Python, so you could have written the last piece of code as:

def multMatVec(object a, object b, object c):
    for i in range(a.shape[0]):
        c[i] = 0.
        for j in range(a.shape[1]):
            c[i] = c[i] + a[i][j] * b[j]
    return c

but, of course, you get only Python speed. So, the moral is that C speed is only available in Pyrex if you use C-like types and constructions; it just doesn't come for free. I just find this way of coding to be more elegant than using SWIG or other approaches. But I'm most probably biased because Pyrex is my first (and only) serious tool for doing Python extensions.

> > speed. So, if you need speed, always use pointers to your data and use a
> > bit of pointer arithmetic to access the element you want (look at the
> > example).
>
> Is there really no way to get this to work?
>
> > Of course, you can also define C arrays if you know the boundaries at
> > compilation time and let the compiler do the computations to access your
> > desired element, but you will need first to copy the data from your
> > buffers to the C array, and perhaps this is a bit inconvenient in some
> > situations.
>
> Couldn't you access the data array of the NumArray directly? I do this
> all the time with Numeric.

Yes, you can, and in both examples shown here (Numeric and numarray) you are accessing the array data buffer directly, with no copies (whenever your original array is well-behaved, of course).

> > Why are you saying that slicing is not supported? I've checked it (as
> > Python expressions, of course) and it works well. Maybe you are referring
> > to cdef'd C-typed arrays in Pyrex? I think this could be a dangerous
> > thing that can slow down the pointer arithmetic because of the
> > additional checks required on the slice range.
>
> Well, there would need to be two value checks per slice. That would be
> significant for small slices, but not for large ones. I'd love to have
> it. It just doesn't feel like Python without slicing, and it doesn't
> feel like NumPy without multi-dimensional slicing.

Again, right now you can use slicing in Pyrex if you are dealing with Python objects, but from the moment you access the lower-level Numeric/numarray buffer and assign it to a Pyrex C pointer, you can't do that anymore. That's the price to pay for speed.

About implementing slicing in Pyrex C-pointer arithmetic, well, it could be worth asking Greg Ewing, the Pyrex author. I'll send him this particular question and forward his answer (if any) to the list.

> > There can be drawbacks, like the one stated by Perry related to how to
> > construct general ufuncs that can handle many different combinations of
> > arrays and types, although I don't understand that very well because the
> > Numeric and numarray crews already achieved that in C, so why can't it
> > be done with Pyrex? Mmm, perhaps there is some pre-processor involved?
>
> I was curious about this comment as well. I have only had success with
> writing any of my Numeric-based extensions for pre-determined types. If
> I had to support additional types (and/or discontiguous and/or rank-N
> arrays), I ended up with a whole pile of case and/or if statements. Also
> kind of slow and inefficient code.
>
> It seems the only way to do this right is with C++ and templates (e.g.
> Blitz++), but there are good reasons not to go that route.
>
> Would it really be any harder to use Pyrex than C for this kind of
> thing? Also, would it be possible to take a Pyrex-type approach and have
> it do something template-like: you write the generic code in Pyrex, and
> it generates all the type-specific C code for you.

Well, this is another good question for Greg. I'll try to ask him, although, as I don't have experience with that kind of issue, chances are that my question might turn out to be complete nonsense :).

Cheers,

--
Francesc Alted
From: Chris B. <Chr...@no...> - 2003-02-11 18:35:07
Francesc Alted wrote:
> First, define some enum types and headers:

Could all this be put into Pyrex? (when NumArray becomes more stable, anyway) It's well beyond me to understand it.

> I will show here a function to multiply a matrix by a vector (C double
> precision):
>
> def multMatVec(object a, object b, object c):
>     cdef PyArrayObject carra, carrb, carrc
>     cdef double *da, *db, *dc
>     cdef int i, j
>
>     carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
>     carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
>     carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
>     da = <double *>carra.data
>     db = <double *>carrb.data
>     dc = <double *>carrc.data
>     dim1 = carra.dimensions[0]
>     dim2 = carra.dimensions[1]
>     for i from 0 <= i < dim1:
>         dc[i] = 0.
>         for j from 0 <= j < dim2:
>             dc[i] = dc[i] + da[i*dim2+j] * db[j]
>
>     return carrc
>
> For me Pyrex is like having Python but with the speed of C. This is why I'm
> so enthusiastic with it.

That actually looks more like C than Python to me. As soon as I am doing pointer arithmetic, I don't feel like I'm writing Python. Would it be all that much more code in C?

> speed. So, if you need speed, always use pointers to your data and use a bit
> of pointer arithmetic to access the element you want (look at the example).

Is there really no way to get this to work?

> Of course, you can also define C arrays if you know the boundaries at
> compilation time and let the compiler do the computations to access your
> desired element, but you will need first to copy the data from your buffers
> to the C array, and perhaps this is a bit inconvenient in some situations.

Couldn't you access the data array of the NumArray directly? I do this all the time with Numeric.

> Why are you saying that slicing is not supported? I've checked it (as
> Python expressions, of course) and it works well. Maybe you are referring to
> cdef'd C-typed arrays in Pyrex? I think this could be a dangerous thing
> that can slow down the pointer arithmetic because of the additional
> checks required on the slice range.

Well, there would need to be two value checks per slice. That would be significant for small slices, but not for large ones. I'd love to have it. It just doesn't feel like Python without slicing, and it doesn't feel like NumPy without multi-dimensional slicing.

> There can be drawbacks, like the one stated by Perry related to how to
> construct general ufuncs that can handle many different combinations of
> arrays and types, although I don't understand that very well because the
> Numeric and numarray crews already achieved that in C, so why can't it be
> done with Pyrex? Mmm, perhaps there is some pre-processor involved?

I was curious about this comment as well. I have only had success writing my Numeric-based extensions for pre-determined types. If I had to support additional types (and/or discontiguous and/or rank-N arrays), I ended up with a whole pile of case and/or if statements. Also kind of slow and inefficient code.

It seems the only way to do this right is with C++ and templates (e.g. Blitz++), but there are good reasons not to go that route.

Would it really be any harder to use Pyrex than C for this kind of thing? Also, would it be possible to take a Pyrex-type approach and have it do something template-like: you write the generic code in Pyrex, and it generates all the type-specific C code for you.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice / (206) 526-6329 fax / (206) 526-6317 main reception
Chr...@no...
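The "template-like" approach Chris asks for can be prototyped with nothing more than string substitution: write the inner loop once with a type placeholder and emit one C function per element type. This is, broadly, what Numeric's own code-generation scripts did; the template text and type table below are the editor's hypothetical illustration, not code from either package:

```python
# One generic loop body; {name} and {ctype} are filled in per element type.
TEMPLATE = """\
static void add_{name}({ctype} *a, {ctype} *b, {ctype} *out, long n) {{
    long i;
    for (i = 0; i < n; i++) out[i] = a[i] + b[i];
}}
"""

TYPES = {"int8": "signed char", "int32": "int",
         "float32": "float", "float64": "double"}

def generate_ufunc_source():
    """Emit one type-specialized C loop per supported element type."""
    return "\n".join(TEMPLATE.format(name=name, ctype=ctype)
                     for name, ctype in TYPES.items())

print(generate_ufunc_source())
```

The generated functions would then be registered in a dispatch table keyed by type number, replacing the "whole pile of case and/or if statements" with a lookup.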
From: Francesc A. <fa...@op...> - 2003-02-11 07:23:18
|
A Divendres 07 Febrer 2003 19:33, Chris Barker va escriure: > > Is Pyrex aware of Numeric Arrays? Joachim Saul already answered that, it is. More exactly, Pyrex is not aware of any special object outside the Python standard types, but with a bit of cleverness and patience, you can map an= y object you want to Pyrex. The Numeric array object map just happens to be documented in the FAQ, bu= t I managed to access numarray objects as well. Here is the recipe: First, define some enum types and headers: # Structs and functions from numarray cdef extern from "numarray/numarray.h": ctypedef enum NumRequirements: NUM_CONTIGUOUS NUM_NOTSWAPPED NUM_ALIGNED NUM_WRITABLE NUM_C_ARRAY NUM_UNCONVERTED ctypedef enum NumarrayByteOrder: NUM_LITTLE_ENDIAN NUM_BIG_ENDIAN cdef enum: UNCONVERTED C_ARRAY ctypedef enum NumarrayType: tAny tBool=09 tInt8 tUInt8 tInt16 tUInt16 tInt32 tUInt32 tInt64 tUInt64 tFloat32 tFloat64 tComplex32 tComplex64 tObject tDefault tLong =20 # Declaration for the PyArrayObject =20 struct PyArray_Descr: int type_num, elsize char type =20 ctypedef class PyArrayObject [type PyArray_Type]: # Compatibility with Numeric cdef char *data cdef int nd cdef int *dimensions, *strides cdef object base cdef PyArray_Descr *descr cdef int flags # New attributes for numarray objects cdef object _data # object must meet buffer API */ cdef object _shadows # ill-behaved original array. */ cdef int nstrides # elements in strides array */ cdef long byteoffset # offset into buffer where array data begin= s */ cdef long bytestride # basic seperation of elements in bytes */ cdef long itemsize # length of 1 element in bytes */ cdef char byteorder # NUM_BIG_ENDIAN, NUM_LITTLE_ENDIAN */ cdef char _aligned # test override flag */ cdef char _contiguous # test override flag */ void import_array() =20 # The Numeric API requires this function to be called before # using any Numeric facilities in an extension module. 
    import_array()

Then, declare the API routines you want to use:

    cdef extern from "numarray/libnumarray.h":
      PyArrayObject NA_InputArray (object, NumarrayType, int)
      PyArrayObject NA_OutputArray (object, NumarrayType, int)
      PyArrayObject NA_IoArray (object, NumarrayType, int)
      PyArrayObject NA_Empty(int nd, int *d, NumarrayType type)
      object PyArray_FromDims(int nd, int *d, NumarrayType type)

Now define a couple of maps between the C enum types and the Python numarray type classes:

    # Conversion tables from/to classes to the numarray enum types
    toenum = {numarray.Int8:tInt8,       numarray.UInt8:tUInt8,
              numarray.Int16:tInt16,     numarray.UInt16:tUInt16,
              numarray.Int32:tInt32,     numarray.UInt32:tUInt32,
              numarray.Float32:tFloat32, numarray.Float64:tFloat64,
              }
    toclass = {}
    for (key, value) in toenum.items():
        toclass[value] = key

OK, you are on the way. We can finally define our user function; for example, I will show here a function to multiply a matrix by a vector (C double precision):

    def multMatVec(object a, object b, object c):
        cdef PyArrayObject carra, carrb, carrc
        cdef double *da, *db, *dc
        cdef int i, j

        carra = NA_InputArray(a, toenum[a._type], C_ARRAY)
        carrb = NA_InputArray(b, toenum[b._type], C_ARRAY)
        carrc = NA_InputArray(c, toenum[c._type], C_ARRAY)
        da = <double *>carra.data
        db = <double *>carrb.data
        dc = <double *>carrc.data
        dim1 = carra.dimensions[0]
        dim2 = carra.dimensions[1]
        for i from 0 <= i < dim1:
            dc[i] = 0.
            for j from 0 <= j < dim2:
                dc[i] = dc[i] + da[i*dim2+j] * db[j]

        return carrc

where NA_InputArray is a high-level numarray API call that ensures that the object retrieved is a well-behaved array, and not mis-aligned, discontiguous or whatever.

Maybe at first glance such a procedure would seem obscure, but it is not. I find it to be quite elegant. Look at the "for i from 0 <= i < dim1:" construction.
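As a cross-check on what multMatVec computes, here is the same matrix-vector product in plain Python over nested lists (this reference sketch is an illustration, not part of the recipe; it skips the type and well-behavedness checks that NA_InputArray performs):

```python
def mult_mat_vec(a, b):
    # a is an m x n matrix as a list of rows, b a length-n vector;
    # returns the length-m product vector, matching what the Pyrex
    # version writes into its output array.
    return [sum(row[j] * b[j] for j in range(len(b))) for row in a]
```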
We could have used the more Pythonic form "for i in range(dim1):", but by using the former, the Pyrex compiler is able to produce a loop in plain C, achieving C speed on this piece of code. Of course, you must be careful not to introduce Python objects inside the loop, or all the potential speed-up will vanish. But, with a bit of practice, this is easy to avoid.

For me, Pyrex is like having Python but with the speed of C. This is why I'm so enthusiastic about it.

> > I imagine it could use them just fine, using the generic Python sequence
> > get item stuff, but that would be a whole lot lower performance than if
> > it understood the Numeric API and could access the data array directly.
> > Also, how does it deal with multiple dimension indexing ( array[3,6,2] )
> > which the standard python sequence types do not support?

In general, you can access sequence objects like in Python (and I've just checked that extended slicing *is* supported; I don't know why Joachim was saying otherwise; perhaps he meant Pyrex C-arrays?), but at Python speed. So, if you need speed, always use pointers to your data and a bit of pointer arithmetic to access the element you want (look at the example). Of course, you can also define C arrays if you know the boundaries at compilation time and let the compiler do the computations to access your desired element, but you will first need to copy the data from your buffers to the C array, and perhaps this is a bit inconvenient in some situations.

> > As I think about this, I think your suggestion is fabulous. Pyrex (or a
> > Pyrex-like) language would be a fabulous way to write code for NumArray,
> > if it really made use of the NumArray API.
There can be drawbacks, like the one stated by Perry related to how to construct general Ufuncs that can handle many different combinations of arrays and types, although I don't understand that very well because the Numeric and numarray crews already managed to do that in C, so why couldn't it be possible with Pyrex? Mmm, perhaps there is some pre-processor involved?

Cheers,

--
Francesc Alted
|
From: Francesc A. <fa...@op...> - 2003-02-11 07:23:11
|
A Dissabte 08 Febrer 2003 11:54, Joachim Saul va escriure:
> Please check out the Pyrex doc. It's actually very easy right now,
> *if* you can live without "sequence operators" such as slicing,
> list comprehensions... but this is going to be supported, again
> according to the doc.

Why are you saying that slicing is not supported? I've checked it (in Python expressions, of course) and it works well. Maybe you are referring to cdef'd C-typed arrays in Pyrex? I think that could be a dangerous thing that might slow down the pointer arithmetic because of the additional required checks on the slice range.

> For example, I may call (C-like)
>
> arr = PyArray_FromDims(1, &n, PyArray_DOUBLE)
>
> but could have also used a corresponding Python construct like
>
> from Numeric import zeros
> arr = zeros(n, 'd')
>
> I expect the latter to be slower (not tested), but one can take
> Python code "as is" and "compile" it using Pyrex.

I was curious about that and tested it on my Pentium 4 @ 2 GHz laptop, for small n (just to look for overhead). The C-like call takes 26 us and the Python-like one takes 52 us.

Generally speaking, you can expect an overhead of 20 us (a bit more as you pass more parameters) when calling Python functions (or Python-like functions inside Pyrex) from Pyrex, compared to using a C API to call the corresponding C function. In fact, calling a C function (or a cdef Pyrex function) from Pyrex takes no more time than calling from C to C: on my laptop both score at 0.5 us.

The fact that calling C functions from Pyrex has no significant overhead (compared with calls from C to C), plus the fact that Pyrex offers a C integer loop, makes Pyrex very appealing for linear algebra optimizations, not only as a "glue" language.

Another advantage is that with Pyrex you can define classes with a mix of C-type and Python-type attributes.
This can be very handy to obtain a compact representation of objects (whenever you do not need to access the C-typed attributes from Python; but anyway, you can always use accessors if needed).

Cheers,

--
Francesc Alted
|
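The call-overhead figures quoted above (roughly 20 us per Python-level call on a 2 GHz Pentium 4) can be approximated with a short timeit sketch. The numbers are machine-dependent, and the function names here are made up for illustration:

```python
import timeit

def noop():
    # empty function: timing it measures pure call overhead
    pass

def measure_call_overhead(calls=100000):
    # Time `calls` Python-level function calls and return the
    # approximate per-call cost in microseconds.
    total = timeit.timeit(noop, number=calls)
    return total / calls * 1e6
```

On a modern interpreter the per-call cost is far below 20 us, but the ratio between Python-level and C-level calls that motivates Pyrex remains large.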
From: Magnus L. H. <ma...@he...> - 2003-02-11 03:37:20
|
I think perhaps I've asked this before -- but is there any reason why the average() function from MA can't be copied (without the mask stuff) to numarray? Maybe it's too trivial (unlike in the masked case)...? It just seems like a generally useful function to have... -- Magnus Lie Hetland "Nothing shocks me. I'm a scientist." http://hetland.org -- Indiana Jones |
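An unmasked average() along the lines Magnus describes is indeed short. A possible sketch in plain Python (this is not MA's actual implementation; axis handling beyond axis 0 is omitted, and the nested-list case stands in for a rank-2 array):

```python
def average(a, axis=0):
    # Minimal unmasked average along axis 0.
    # `a` is a flat sequence, or a sequence of equal-length rows.
    n = len(a)
    if n == 0:
        raise ZeroDivisionError("zero-length axis")
    first = a[0]
    if isinstance(first, (list, tuple)):
        # column-wise mean over rows
        return [sum(row[j] for row in a) / n for j in range(len(first))]
    return sum(a) / n
```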
From: Magnus L. H. <ma...@he...> - 2003-02-10 23:34:33
|
Todd Miller <jm...@st...>:
> [snip]
> It looks like a bug which resulted from Numeric compatibility additions.
> For backwards compatibility with Numeric, I added the "sequence"
> keyword as a synonym for the numarray "buffer" keyword. We're in the
> process of getting rid of (deprecating) "buffer". When it's gone (a
> couple of releases), we can remove the default parameter to sequence and
> the bug.

OK -- but even until then, wouldn't it be possible to add a simple check for whether any arguments have been supplied? (Not a big priority, I guess :)

> Todd

--
Magnus Lie Hetland                 "Nothing shocks me. I'm a scientist."
http://hetland.org                                   -- Indiana Jones
|
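The "simple check" Magnus suggests is usually done with a sentinel default, so the function can distinguish "no argument supplied" from "argument was None". A sketch of the idea (hypothetical names; this is not numarray's actual array() code):

```python
_missing = object()   # unique sentinel; can never equal user input

def array(sequence=_missing, shape=None, buffer=_missing):
    # Reject a call with no construction information at all,
    # instead of silently returning None.
    if sequence is _missing and shape is None and buffer is _missing:
        raise ValueError("array() needs a sequence, shape, or buffer")
    # ... real construction logic would go here ...
    return (sequence, shape, buffer)
```

With this in place, a bare array() raises immediately, while array(None) or array(shape=(3,)) still reach the construction logic.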
From: Todd M. <jm...@st...> - 2003-02-10 22:29:41
|
Magnus Lie Hetland wrote:

>Is this a bug, or is there a motivation behind it?
>
>>>> from numarray import array
>>>> array()
>>>>
>
>IOW: Why is array callable without any arguments when it doesn't
>return anything? E.g. if I call array(**kwds) with some dictionary,
>I'd expect an exception (since a default array isn't really possible)
>if kwds were empty... Or?
>
>(I'm using 0.4 -- for some reason I can't get the cvs version to
>compile on Solaris.)
>

It looks like a bug which resulted from Numeric compatibility additions. For backwards compatibility with Numeric, I added the "sequence" keyword as a synonym for the numarray "buffer" keyword. We're in the process of getting rid of (deprecating) "buffer". When it's gone (a couple of releases), we can remove the default parameter to sequence and the bug.

Todd
|
From: Magnus L. H. <ma...@he...> - 2003-02-10 21:45:56
|
Is this a bug, or is there a motivation behind it?

>>> from numarray import array
>>> array()
>>>

IOW: Why is array callable without any arguments when it doesn't return anything? E.g. if I call array(**kwds) with some dictionary, I'd expect an exception (since a default array isn't really possible) if kwds were empty... Or?

(I'm using 0.4 -- for some reason I can't get the cvs version to compile on Solaris.)

--
Magnus Lie Hetland                 "Nothing shocks me. I'm a scientist."
http://hetland.org                                   -- Indiana Jones
|
From: Magnus L. H. <ma...@he...> - 2003-02-10 21:42:26
|
Paul F Dubois <pa...@pf...>:
>
> The problem with naïve benchmarks is that they *are* naïve.

Indeed. My request was for a more dependable analysis.

> In real applications you have a lot of arrays running around, and so
> a full cache shows up with smaller array sizes. Because of this,
> measuring performance is a really difficult matter.

Indeed. I guess what I'm curious about is the motivation behind the array module... It seems to be mainly conserving memory -- or?

--
Magnus Lie Hetland                 "Nothing shocks me. I'm a scientist."
http://hetland.org                                   -- Indiana Jones
|
From: Paul F D. <pa...@pf...> - 2003-02-10 21:39:38
|
The problem with naïve benchmarks is that they *are* naïve. In real applications you have a lot of arrays running around, and so a full cache shows up with smaller array sizes. Because of this, measuring performance is a really difficult matter.
|
From: Magnus L. H. <ma...@he...> - 2003-02-10 21:02:42
|
Tim Hochberg <tim...@ie...>:
> [snip]

In my continued quest, I found this:

http://www.penguin.it/pipermail/python/2002-October/001917.html

It sums up (in Italian, though) the great memory advantage of arrays. (Might be a good idea to be explicit about this in the docs, perhaps... Hm.)

> The reason I'm using arrays in psymeric are twofold. One is memory
> usage.

Right.

> The other reason is that Psyco likes arrays
> (http://arigo.tunes.org/psyco-preview/psycoguide/node26.html).

I sort of thought that might be a reason... :)

> In fact it was this note " The speed of a complex algorithm using an
> array as buffer (like manipulating an image pixel-by-pixel) should
> be very high; closer to C than plain Python." that led me to start
> playing around with psymeric.

I see.

> Just for grins I disabled psyco and reran some tests on psymeric.
> Instead of comparable speed to NumPy, the speed drops to about 25x
> slower.

Yikes!

> I actually would have expected it to be worse, but the drop-off is
> still pretty steep.

Indeed... Hm... If only we could have Psyco for non-x86 platforms... Oh, well. I guess we will, some day. :)

> -tim

--
Magnus Lie Hetland                 "Nothing shocks me. I'm a scientist."
http://hetland.org                                   -- Indiana Jones
|
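The memory advantage mentioned above is easy to demonstrate directly with sys.getsizeof (which postdates this thread; the illustration, including the function name, is a sketch rather than anything from the linked post):

```python
import array
import sys

def compare_memory(n=100000):
    # A list stores n pointers, each to a separately boxed int object;
    # array.array('i') stores n raw C ints in one contiguous buffer.
    lst = list(range(n))
    arr = array.array('i', range(n))
    list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)
    arr_bytes = sys.getsizeof(arr)
    return list_bytes, arr_bytes
```

The list typically costs several times the memory of the array, which is the main point of the Italian post cited above.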
From: Magnus L. H. <ma...@he...> - 2003-02-10 20:37:48
|
Just curious: What is the main strength of the array module in the standard library? Is it footprint/memory usage? Is it speed? If so, at what sizes?

I ran some simple benchmarks (creating a list/array, iterating over them to sum up their elements, and extracting the slice foo[::2]) and got the following ratios (array_time/list_time) for various sizes:

Size      Creation        Sum             Slicing
100       1.13482142857   1.54649265905   1.53736654804
1000      1.62444133147   1.18439932835   1.56350184957
10000     1.61642712328   1.47768567821   1.45889354599
100000    1.72711084285   0.952593142445  1.05782341361
1000000   1.56617139425   0.735687066032  0.773219364465
10000000  1.57903195174   0.727253180418  0.726005428022

These benchmarks are pretty naïve, but it seems to me that unless you're working with quite large arrays, there is no great advantage to using arrays rather than lists... (I'm not including numarray or Numeric in the equation here -- I just raise the issue because of the use of arrays in Psymeric...)

Just curious...

--
Magnus Lie Hetland                 "Nothing shocks me. I'm a scientist."
http://hetland.org                                   -- Indiana Jones
|
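The message does not show the measurement code, so the following timeit-based sketch is a guess at the methodology rather than Magnus's actual script; it reproduces one of the three ratios (Sum) for a given size:

```python
import timeit

def sum_ratio(size, number=20):
    # array_time / list_time for summing `size` int elements.
    setup_list = "xs = list(range(%d))" % size
    setup_arr = "import array; xs = array.array('i', range(%d))" % size
    t_list = timeit.timeit("sum(xs)", setup=setup_list, number=number)
    t_arr = timeit.timeit("sum(xs)", setup=setup_arr, number=number)
    return t_arr / t_list
```

A ratio above 1.0 means the list is faster; as in the table above, small sizes tend to favor lists because each array element must be boxed into an int object on access.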
From: Tim H. <tim...@ie...> - 2003-02-10 16:52:07
|
Chris Barker wrote: >Tim Hochberg wrote: > > >>Psyco seems fairly stable these days. However it's one of those things >>that probably needs to get a larger cabal of users to shake the bugs out >>of it. I still only use it to play around with because all things that I >>need speed from I end up doing in Numeric anyway. >> >> > >Hmmm. It always just seemed too bleeding edge for me to want to drop it >in inplace of my current Python, but maybe I should try... > > I think Psyco was a reworked interpreter at some point, but it isn't any longer. Now it's just an extension module. You typically use it like this: def some_function_that_needs_to_be_fast(...): .... psyco.bind(some_function_that_needs_to_be_fast) Of course, it's still possible to bomb the interpreter with Psyco and it's a huge memory hog if you bind a lot of functions. On the other hand in the course of playing with psymeric I found one way to crash the interpreter with Psyco, one way with Numeric, and one way to cause Numarray to fail, although this did not crash the interpreter. So if I was keeping a tally of evil bugs, they'd all be tied right now.... >>For Psyco at least you don't need a multidimensional type. You can get >>good results with flat array, in particular array.array. The number I >>posted earlier showed comparable performance for Numeric and a >>multidimensional array type written all in python and psycoized. >> >> > >What about non-contiguous arrays? Also, you pointed out yourself that >you are still looking at a factor of two slowdown, it would be nice to >get rid of that. > > Non contiguous arrays are easy to build on top of contiguous arrays, psymeric works with noncontiguous arrays now. If you'd like, I can send you some code. The factor of two slowdown is an issue. A bigger issue is that only x86 platforms are supported. Also there is not support for things like byteswapped and nonalligned arrays. There also might be problems getting the exception handling right. 
If this approach were to be done "right" for heavy-duty number-cruncher types, it would require a more capable, C-based core buffer object, with most other things written in Python and psycoized. This begins to sound a lot like what you would get if you put a lot of psyco.bind calls into the Python parts of Numarray now. On the other hand, it's possible some interesting stuff will come out of the PyPy project that will make this kind of thing possible in pure Python. I'm watching that project with interest.

I did some more tuning of the Psymeric code to reduce overhead, and this is the speed situation now. This is complicated to compare, since the relative speeds depend on both the array type and shape, but one can get a general feel for things by looking at two numbers: the overhead, that is, the time it takes to operate on very small arrays, and the asymptotic time per element for large arrays. These numbers differ substantially for contiguous (c) and noncontiguous (nc) arrays, but their relative values are fairly constant across types. That gives four numbers:

          Overhead (c)  Overhead (nc)  TimePerElement (c)  TimePerElement (nc)
NumPy     10 us         10 us          85 ps               95 ps
NumArray  200 us        530 us         45 ps               135 ps
Psymeric  50 us         65 us          80 ps               80 ps

The times shown above are for Float64s and are pretty approximate, and they happen to be a particularly favorable array shape for Psymeric. I have seen Psymeric as much as 50% slower than NumPy for large arrays of certain shapes. The overhead for NumArray is surprisingly large. After doing this experiment I'm certainly more sympathetic to Konrad wanting less overhead for NumArray before he adopts it.

-tim
|
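The psyco.bind pattern shown earlier in the thread is usually written with a guarded import, so the same module still runs where Psyco (x86-only) is unavailable. A sketch of typical usage, not psymeric itself:

```python
def dot(a, b):
    # Plain-Python inner loop: exactly the kind of code
    # psyco.bind can specialize into machine code.
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

try:
    import psyco            # x86-only; absent on other platforms
    psyco.bind(dot)         # specialize just this hot function
except ImportError:
    pass                    # fall back to the plain interpreted version
```

Binding only the hot functions keeps Psyco's memory appetite in check, per Tim's "memory hog" caveat above.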
From: John E. <ja...@zh...> - 2003-02-08 23:10:14
|
Perry Greenfield wrote:

> Both psyco and pyrex have some great aspects. But I think
> it is worth a little reflection on what can and can't be
> expected of them. I'm basically ignorant of both; I know
> a little about them, but haven't used them. If anything I
> say is wrong, please correct me. I'm going to make some
> comments based on inferred characteristics of them that
> could well be wrong.

I'd like to suggest to anyone interested in these ideas that they take a look at the pypython/minimal-python mailing list:

http://codespeak.net/mailman/listinfo/pypy-dev

> Psyco is very cool and seems the answer to many dreams.
> But consider the cost. From what I can infer, it obtains
> its performance enhancements at least in part by constructing
> machine code on the fly from the Python code. In other
> words it is performing aspects of running on particular
> processors that is usually relegated to C compilers by
> Python.
>
> I'd guess that the price is the far greater difficulty of
> maintaining such capability across many processor types.
> It also likely increases the complexity of the implementation
> of Python, perhaps making it much harder to change and
> enhance. Even without it handling things that are needed
> for array processing, how likely is it that it will be
> accepted as the standard implementation for Python for
> these reasons alone?

The hope is that quite the opposite of just about every one of these points will be true: that once Python is reimplemented in Python, with psyco as a backend JIT-like compiler, it will decrease the complexity of the implementation, making it much easier to change and enhance.

I tend to be quite optimistic about the potential for pypython and psyco. I think the added work of the platform-dependent psyco modules will be offset by the rest of the system being written in Python.

--
John Eikenberry
[ja...@zh...
- http://zhar.net] ______________________________________________________________ "Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away." -- Antoine de Saint-Exupery |