From: Ralf J. <jue...@in...> - 2002-04-20 16:58:07
|
I'm currently tinkering with the following problem and would like to hear your suggestions:

Within a C module I define a new Python type 'IM' (representing an image). The indexing and slicing facilities of NumPy arrays would be tailor-made for manipulating the internal data of its instances. Thus, I could provide a method 'asarray', which creates a properly typed array object 'a' referring to the data of an IM instance 'im':

    a = im.asarray()

I could use PyArray_FromDimsAndData() to create the array instance. Unfortunately, this wouldn't work, since 'a' would not get notified about the death of 'im'. However, if I could prevent 'im' from being garbage collected before all array instances referring to its data are deleted, it should work.

NumPy's array type uses a mechanism to prevent garbage collection of array instances if there are other instances that share data with it. My idea was to use this mechanism, that is, to let the asarray method increment im's reference count and let a->base refer to im.

Do you think this is a reliable approach?

Thanks,
Ralf

--
Ralf Jüngling
Institut für Informatik - Lehrstuhl für Mustererkennung & Bildverarbeitung
Georges-Köhler-Allee, Gebäude 52          Tel: +49-(0)761-203-8215
79110 Freiburg                            Fax: +49-(0)761-203-8262 |
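The ownership idea in the message above (a view that holds a strong reference to the object owning the data) can be illustrated in pure Python. This is only a sketch of the pattern, not the Numeric C API: the `IM` and `ArrayView` classes and the `base` attribute are stand-ins invented for illustration, with `base` playing the role of `a->base`.

```python
import gc
import weakref


class IM:
    """Stand-in for the C-level image type; owns the pixel buffer."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.data = bytearray(width * height)

    def asarray(self):
        # Hand out a view that shares (does not copy) the pixel data.
        return ArrayView(self)


class ArrayView:
    """Stand-in for the array object; `base` plays the role of a->base."""

    def __init__(self, base):
        self.base = base  # strong reference: the owner outlives the view
        self.buffer = memoryview(base.data)


im = IM(4, 3)
owner_alive = weakref.ref(im)  # lets us observe when `im` is really freed

a = im.asarray()
del im                 # drop our own name for the image
a.buffer[0] = 255      # still safe: the view keeps the owner alive
assert owner_alive() is not None

del a                  # last view gone, so the owner can be freed
gc.collect()
assert owner_alive() is None
```

In the C version the same effect would come from incrementing the owner's reference count when the array is created and releasing it when the array is deallocated, which is what the message proposes.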
From: rob <ro...@py...> - 2002-04-19 15:59:05
|
There has been some discussion on the FreeBSD Ports list about an icc-compiled Python, which reportedly benchmarks much faster than the normal gcc-compiled version. I'm wondering if anyone here knows anything about it. The discussion can be accessed via www.geocrawler.org/ FreeBSD/ freebsd-ports. Rob. -- ----------------------------- The Numeric Python EM Project www.pythonemproject.com |
From: Paul F D. <pa...@pf...> - 2002-04-19 14:26:46
|
Pyfort 7.1 is available at sf.net/projects/pyfortran. Support for single Fortran characters was added. (Michiel de Hoon) Corrected behavior of scalars with C routines. (Michiel de Hoon) Pyfort is a tool for connecting Python to Fortran. Just to let you know, I'm working on a little tool to make it easier to set up simple projects so that you can build and install them with less effort. I hope to have that available soon. |
From: Perry G. <pe...@st...> - 2002-04-18 17:38:12
|
> Now, the next interesting question is how much of the standard graph
> algorithms can be implemented with ufuncs and array operations (which
> I guess is the key to performance) and not straight for-loops... After
> all, some of them are quite sequential.
>

I'm not sure about that (not being very familiar with graph algorithms). If you can give me some examples (perhaps off the mailing list) I could say whether they are easily cast into ufunc or library calls.

Perry |
From: Perry G. <pe...@st...> - 2002-04-18 17:35:36
|
Behalf Of rob
>
> I'm sorry I missed the original post, but the topic is important for
> me. I use the lightweight 3d volume renderer Animabob for most
> everything. The interface code is in all of the FDTD programs in my
> website. You just unwind a 3d array and scale it to +/- 128, turn it
> into characters, and you have the input file. I wish Animabob could
> somehow be turned into a Python package, as in Windows you need Cygwin
> to run it. I've tried other 3d packages like OpenDX, and they seem to
> be huge albatrosses.
>

It sounds like you are trying to do something different than Magnus, but if what you are looking to do is scale floating or int data to byte size and apply some character mapping, numarray (or Numeric) should be able to do that very well. If that is all you want done, you might find either to be overkill though (if you already wrote a C extension to do so).

Perry |
From: rob <ro...@py...> - 2002-04-18 16:17:39
|
I'm sorry I missed the original post, but the topic is important for me. I use the lightweight 3d volume renderer Animabob for most everything. The interface code is in all of the FDTD programs in my website. You just unwind a 3d array and scale it to +/- 128, turn it into characters, and you have the input file. I wish Animabob could somehow be turned into a Python package, as in Windows you need Cygwin to run it. I've tried other 3d packages like OpenDX, and they seem to be huge albatrosses. -- ----------------------------- The Numeric Python EM Project www.pythonemproject.com |
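The "unwind, scale, turn into characters" recipe described above might look roughly like the following. This is a hedged sketch in plain Python: the function name `volume_to_bytes` is made up, the exact file layout Animabob expects is not stated in the thread, and the scaling targets +/- 127 (an assumption) so that every value fits in a signed byte.

```python
from array import array


def volume_to_bytes(volume):
    """Flatten a nested [z][y][x] list and scale it into signed bytes."""
    flat = [v for plane in volume for row in plane for v in row]
    peak = max(abs(v) for v in flat) or 1.0   # avoid dividing by zero
    # Map the largest magnitude to 127 so every value fits in a signed byte.
    scaled = [int(round(127 * v / peak)) for v in flat]
    return array("b", scaled).tobytes()


field = [[[0.0, 0.5], [-1.0, 2.0]],
         [[1.0, -2.0], [0.25, 0.0]]]
payload = volume_to_bytes(field)   # 8 bytes, one per sample
```

With Numeric or numarray the flattening and scaling would be whole-array operations rather than list comprehensions, which is the point made in the reply to this message.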
From: Magnus L. H. <ma...@he...> - 2002-04-18 15:47:56
|
Perry Greenfield <pe...@st...>:
> [snip]
> > In relation to what? Using dictionaries etc? Using the array module?
>
> No, in relation to operations on a 10K array. Basically, if an operation
> on a 10K array spends half its time on set up, operations on a
> 10 element array may only be twice as fast. I'm not making any claims
> about speed in relation to any other data structure (other than Numeric)

Aaah! Sorry to be so dense :)

But the speedup in Numeric between different sizes isn't as important to me as the speedup compared to other solutions (such as a dict-based one) of course... If a 10 element array is only twice as fast as a 10K array that's no problem if it's still faster than an alternative solution (though I'm sure it might not be...)

The same goes for 10K element graphs -- the interesting point has to be whether it's faster than various alternatives (which I'm sure it is).

> > [snip]
> > > Before I go further, I need to find out if the preceding has made
> > > you gasp in horror or if the timescale is too slow for you to
> > > accept.
> >
> > Hm. If you need 10000 elements before numarray pays off, I'm starting
> > to wonder if I can use it for anything at all. :I
> >
> I didn't make clear that this threshold may improve in the future
> (decrease).

Right. Good. And -- on small graphs performance probably won't be much of a problem anyway. :)

> The corresponding threshold for Numeric is probably
> around 1000 to 2000 elements. (Likewise, operations on 10 element
> Numeric arrays are only about twice as fast as for 1K arrays)
> We may be able to eventually improve numarray performance to something
> in that neighborhood (if we are lucky) but I would be surprised to
> do much better (though if we use caching techniques, perhaps repeated
> cases of arrays of identical shape, strides, type, etc. may run
> much faster on subsequent operations). As usual, performance issues
> can be complicated. You have to keep in mind that Numeric and numarray
> provide much richer indexing and conversion handling features than
> something like the array module, and that comes at some price in
> performance for small arrays.

Of course. I guess an alternative (for the graph situation) could be to wrap the graphs with a common interface with various implementations, so that a solution more optimised for small graphs could be used (in a factory function) if the graph is small... (Not really an issue for me at the moment, but should be easy to do, I guess.)

[snip]
> > I'm not sure. A wide range, I should imagine. But with only 100 nodes,
> > I'll get 10000 entries in the adjacency matrix, so perhaps it's
> > worthwhile anyway?
> >
> That's right, 100 nodes is where performance is being competitive,
> and if you feel you are worried about cases larger than that, then
> it isn't a problem.

Seems probable. For smaller problems I wouldn't be thinking in terms of numarray anyway, I think. (Just using plain Python dicts or something similar.)

[snip]
> > > On the other hand, since numarray has much better support for index
> > > arrays, i.e., an array of indices that may be used to index another
> > > array of values, index array(s), value array pair may itself serve
> > > as a storage model for sparse arrays.
> >
> > That's an interesting idea, although I don't quite see how it would
> > help in the case of adjacency matrices. (You'd still need at least one
> > n**2 size matrix for n nodes, wouldn't you -- i.e. the index array...
> > Right?)
> >
> Right.

I might as well use a full adjacency matrix, then...

So, the conclusion for now is that numarray may well be suited for working with relatively large (100+ nodes), relatively dense graphs. Now, the next interesting question is how much of the standard graph algorithms can be implemented with ufuncs and array operations (which I guess is the key to performance) and not straight for-loops... After all, some of them are quite sequential.

--
Magnus Lie Hetland                                  The Anygui Project
http://hetland.org                                  http://anygui.org |
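The factory idea floated above (one graph interface, with the implementation chosen by size) could be sketched along these lines. Everything here is hypothetical: the class names, `make_graph`, and the cut-over value of 100 nodes (borrowed from the "100+ nodes" remark) are illustrative choices, and the dense variant uses a plain Python list where an array package would hold the matrix.

```python
class DictGraph:
    """Adjacency sets in a dict: cheap for small or sparse graphs."""

    def __init__(self, n):
        self.adj = {u: set() for u in range(n)}

    def add_edge(self, u, v):
        self.adj[u].add(v)

    def neighbours(self, u):
        return sorted(self.adj[u])


class MatrixGraph:
    """Dense n*n adjacency matrix: the layout an array package would use."""

    def __init__(self, n):
        self.n = n
        self.cells = [0] * (n * n)   # flat row-major storage

    def add_edge(self, u, v):
        self.cells[u * self.n + v] = 1

    def neighbours(self, u):
        row = self.cells[u * self.n:(u + 1) * self.n]
        return [v for v, flag in enumerate(row) if flag]


DENSE_THRESHOLD = 100   # assumed cut-over, taken from the "100+ nodes" remark


def make_graph(n):
    """Factory: pick the representation from the number of nodes."""
    return MatrixGraph(n) if n >= DENSE_THRESHOLD else DictGraph(n)
```

Callers only use `add_edge` and `neighbours`, so the threshold can be tuned by timing experiments without touching the algorithms built on top.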
From: Perry G. <pe...@st...> - 2002-04-18 15:21:56
|
> Behalf Of Magnus Lie Hetland
> Perry Greenfield <pe...@st...>:
> [snip]
> > First of all, it may make sense, but I should say a few words about
> > what scale sizes make sense.
> [snip]
> > So if you are working with much smaller arrays than 10K, you won't
> > see total execution time decrease much
>
> In relation to what? Using dictionaries etc? Using the array module?

No, in relation to operations on a 10K array. Basically, if an operation on a 10K array spends half its time on set up, operations on a 10 element array may only be twice as fast. I'm not making any claims about speed in relation to any other data structure (other than Numeric).

> [snip]
> > Before I go further, I need to find out if the preceding has made
> > you gasp in horror or if the timescale is too slow for you to
> > accept.
>
> Hm. If you need 10000 elements before numarray pays off, I'm starting
> to wonder if I can use it for anything at all. :I
>

I didn't make clear that this threshold may improve in the future (decrease). The corresponding threshold for Numeric is probably around 1000 to 2000 elements. (Likewise, operations on 10 element Numeric arrays are only about twice as fast as for 1K arrays.) We may be able to eventually improve numarray performance to something in that neighborhood (if we are lucky) but I would be surprised to do much better (though if we use caching techniques, perhaps repeated cases of arrays of identical shape, strides, type, etc. may run much faster on subsequent operations). As usual, performance issues can be complicated. You have to keep in mind that Numeric and numarray provide much richer indexing and conversion handling features than something like the array module, and that comes at some price in performance for small arrays.

> > (This particular issue also makes me wonder if numarray would
> > ever be a suitable substitute for the existing array module).
>
> Indeed.
>
> > What size graphs are you most concerned about as far as speed goes?
>
> I'm not sure. A wide range, I should imagine. But with only 100 nodes,
> I'll get 10000 entries in the adjacency matrix, so perhaps it's
> worthwhile anyway?
>

That's right, 100 nodes is where performance is being competitive, and if you feel you are worried about cases larger than that, then it isn't a problem. But if you are operating mostly on small graphs, then it may not be appropriate. The corresponding threshold for Numeric would be on the order of 30 nodes.

> > On the other hand, since numarray has much better support for index
> > arrays, i.e., an array of indices that may be used to index another
> > array of values, index array(s), value array pair may itself serve
> > as a storage model for sparse arrays.
>
> That's an interesting idea, although I don't quite see how it would
> help in the case of adjacency matrices. (You'd still need at least one
> n**2 size matrix for n nodes, wouldn't you -- i.e. the index array...
> Right?)
>

Right. |
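The setup-cost argument above can be made concrete with a toy cost model. This is a sketch under stated assumptions, not a measurement: it assumes the time of one array operation is a fixed setup cost plus a constant cost per element, and that setup costs as much as processing 10,000 elements (the lower end of the estimate given in the thread).

```python
def op_time(n, setup, per_element=1.0):
    """Cost model: fixed setup cost plus a per-element cost."""
    return setup + per_element * n


# Assumption: setup costs as much as processing 10,000 elements.
SETUP = 10_000

t_small = op_time(10, SETUP)          # 10-element array
t_10k = op_time(10_000, SETUP)        # 10K-element array
t_large = op_time(1_000_000, SETUP)   # much larger array

speedup_small = t_10k / t_small       # how much faster the tiny array is
per_elem_10k = t_10k / 10_000         # cost per element at 10K
per_elem_large = t_large / 1_000_000  # cost per element for a large array
```

Under these assumptions the 10-element operation is just under twice as fast as the 10K one, and the per-element cost at 10K is about double that of a very large array, which is the behaviour described in the message.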
From: Magnus L. H. <ma...@he...> - 2002-04-18 14:54:21
|
Perry Greenfield <pe...@st...>:
[snip]
> First of all, it may make sense, but I should say a few words about
> what scale sizes make sense.
[snip]
> So if you are working with much smaller arrays than 10K, you won't
> see total execution time decrease much

In relation to what? Using dictionaries etc? Using the array module?

[snip]
> Before I go further, I need to find out if the preceding has made
> you gasp in horror or if the timescale is too slow for you to
> accept.

Hm. If you need 10000 elements before numarray pays off, I'm starting to wonder if I can use it for anything at all. :I

> (This particular issue also makes me wonder if numarray would
> ever be a suitable substitute for the existing array module).

Indeed.

> What size graphs are you most concerned about as far as speed goes?

I'm not sure. A wide range, I should imagine. But with only 100 nodes, I'll get 10000 entries in the adjacency matrix, so perhaps it's worthwhile anyway?

> > And -- is there any chance of getting sparse matrices in numarray?
>
> Since talk is cheap, yes :-). But I doubt it would be in the "core"
> and some thought would have to be given to how best to represent them.
> In one sense, since the underlying storage is different than numarray
> assumes for all its arrays, sparse arrays don't really share the
> same underlying C machinery very well. While it certainly would be
> possible to devise a class with the same interface as numarray objects,
> the implementation may have to be completely different.

Yes, I realise that.

> On the other hand, since numarray has much better support for index
> arrays, i.e., an array of indices that may be used to index another
> array of values, index array(s), value array pair may itself serve
> as a storage model for sparse arrays.

That's an interesting idea, although I don't quite see how it would help in the case of adjacency matrices. (You'd still need at least one n**2 size matrix for n nodes, wouldn't you -- i.e. the index array... Right?)

> One still needs to implement ufuncs and other functions (including
> simple things like indexing) using different machinery. It is
> something that would be nice to have, but I can't say when we would
> get around to it and don't want to raise hopes about how quickly it
> would appear.

No - no problem. Basically, I'm looking for a platform to implement graph algorithms that doesn't necessitate too many installed packages etc. numarray seemed promising since it's a candidate for inclusion in the standard library.

I guess I'll just have to do some timing experiments...

> Perry

--
Magnus Lie Hetland                                  The Anygui Project
http://hetland.org                                  http://anygui.org |
From: <vi...@id...> - 2002-04-17 22:24:41
|
# I'm running python 2.0 on Solaris and Numeric 21.0
# I have an m by n array -- called a and have
# j an n long list of integers in range(m), such as
j = argmax(a,0)
# If I set
z = zip(j,range(len(j)))
# and try the statement
res = take(a,z)
# python appears to hang, but if I do
res = array(map(lambda x,a=a: a[x[0],x[1]], z))
# It works.
# Is there a simpler way of doing what I want, and why does take hang?
# is it, perhaps, allocating some n by n work array (this would
# probably make things thrash like crazy)?

--
Victor S. Miller   | " ... Meanwhile, those of us who can compute can hardly
vi...@id...        | be expected to keep writing papers saying 'I can do the
CCR, Princeton, NJ | following useless calculation in 2 seconds', and indeed
08540 USA          | what editor would publish them?" -- Oliver Atkin |
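One way to read the question above: `take` selects along an axis, so each integer in `z` would pick out a whole row of `a`, and the result is far larger than the n values wanted (this is my understanding of `take`, offered as a likely explanation rather than a confirmed diagnosis). A common workaround is to turn each (row, column) pair into a single index into the flattened array. The sketch below shows that index arithmetic in plain Python; `pick_per_column` is a made-up helper name, and with Numeric the list comprehensions would become array expressions on the flattened array.

```python
def pick_per_column(a, j):
    """Return a[j[c]][c] for every column c, via flat row-major indices."""
    n_cols = len(a[0])
    flat = [v for row in a for v in row]                    # row-major "ravel"
    flat_idx = [j[c] * n_cols + c for c in range(n_cols)]   # row * n + column
    return [flat[i] for i in flat_idx]                      # a 1-D "take"


a = [[1, 9, 3],
     [7, 2, 8],
     [4, 5, 6]]
# Row index of the largest entry in each column (argmax along axis 0).
j = [max(range(len(a)), key=lambda r: a[r][c]) for c in range(len(a[0]))]
col_max = pick_per_column(a, j)
```

Here `j` is `[1, 0, 1]` and `col_max` is `[7, 9, 8]`: one value per column, with only n lookups.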
From: Perry G. <pe...@st...> - 2002-04-17 19:06:25
|
Hi Magnus,

On Behalf Of Magnus Lie Hetland
>
> I'm looking at various ways of implementing graphs in Python (beyond
> simple dict-based stuff -- more performance is needed). kjbuckets
> looks like a nice alternative, as does the Boost Graph Library (not
> sure how easy it is to use with Boost.Python) but if numarray is to
> become a part of the standard library, it could be beneficial to use
> that...
>
> For dense graphs, it makes sense to use an adjacency matrix directly
> in numarray, I should think. (I haven't implemented many graph
> algorithms with ufuncs yet, but it seems doable...) For sparse graphs
> I guess some sort of sparse array implementation would be useful,
> although the archives indicate that creating such a thing isn't a core
> part of the numarray project.
>

First of all, it may make sense, but I should say a few words about what scale sizes make sense. Currently numarray is implemented mostly in Python (excepting the very low level, very simple C functions that do the computational and indexing loops). This means it currently has a pretty sizable overhead to set up an array operation (I'm guessing an order of magnitude slower than Numeric). Once set up, it generally is pretty fast. So it is pretty good for very large data sets. Very lousy for very small ones.

We haven't measured efficiency lately (we are deferring optimization until we have all the major functionality present first), but I wouldn't be at all surprised to find that the set up time can be equal to the time to actually process ~10,000-20,000 elements (i.e., the time spent per element for a 10K array is roughly twice that for much larger arrays). So if you are working with much smaller arrays than 10K, you won't see total execution time decrease much (it was already spending half its time in setup, which doesn't change).

We would like to reduce this size threshold in the future, either by optimizing the Python code, or moving some of it into C. This optimization wouldn't be for at least a couple more months; we have more urgent features to deal with. I doubt that we will ever surpass the current Numeric in its performance on small arrays (though who knows, perhaps we can come close).

> What do you think -- is it reasonable to use numarray for graph
> algorithms? Perhaps an additional module with standard graph
> algorithms would be interesting? (I'm sure I could contribute some if
> there is any interest...)
>

Before I go further, I need to find out if the preceding has made you gasp in horror or if the timescale is too slow for you to accept. (This particular issue also makes me wonder if numarray would ever be a suitable substitute for the existing array module). What size graphs are you most concerned about as far as speed goes?

> And -- is there any chance of getting sparse matrices in numarray?
>

Since talk is cheap, yes :-). But I doubt it would be in the "core" and some thought would have to be given to how best to represent them. In one sense, since the underlying storage is different than numarray assumes for all its arrays, sparse arrays don't really share the same underlying C machinery very well. While it certainly would be possible to devise a class with the same interface as numarray objects, the implementation may have to be completely different.

On the other hand, since numarray has much better support for index arrays, i.e., an array of indices that may be used to index another array of values, index array(s), value array pair may itself serve as a storage model for sparse arrays. One still needs to implement ufuncs and other functions (including simple things like indexing) using different machinery. It is something that would be nice to have, but I can't say when we would get around to it and don't want to raise hopes about how quickly it would appear.

Perry |
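The "index array(s), value array pair" storage model mentioned above is close to what is usually called coordinate format. Below is a minimal pure-Python sketch of that idea; the `CoordSparse` class and its method names are invented for illustration and are not part of numarray. In this sketch storage grows with the number of stored entries rather than with the full matrix size.

```python
class CoordSparse:
    """Sparse matrix kept as parallel index arrays plus a value array."""

    def __init__(self, shape):
        self.shape = shape
        self.rows = []   # row index of each stored entry
        self.cols = []   # column index of each stored entry
        self.vals = []   # the stored (non-zero) values

    def set(self, r, c, value):
        self.rows.append(r)
        self.cols.append(c)
        self.vals.append(value)

    def matvec(self, x):
        """Multiply by a dense vector, touching only stored entries."""
        y = [0] * self.shape[0]
        for r, c, v in zip(self.rows, self.cols, self.vals):
            y[r] += v * x[c]
        return y

    def todense(self):
        dense = [[0] * self.shape[1] for _ in range(self.shape[0])]
        for r, c, v in zip(self.rows, self.cols, self.vals):
            dense[r][c] += v
        return dense


# Adjacency of the undirected path graph 0 - 1 - 2 - 3.
path = CoordSparse((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3)]:
    path.set(u, v, 1)
    path.set(v, u, 1)
degrees = path.matvec([1, 1, 1, 1])   # row sums give the node degrees
```

As the message notes, anything beyond storage (ufuncs, indexing, slicing) would still need its own implementation on top of such a structure.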
From: Magnus L. H. <ma...@he...> - 2002-04-17 14:31:51
|
I'm looking at various ways of implementing graphs in Python (beyond simple dict-based stuff -- more performance is needed). kjbuckets looks like a nice alternative, as does the Boost Graph Library (not sure how easy it is to use with Boost.Python) but if numarray is to become a part of the standard library, it could be beneficial to use that... For dense graphs, it makes sense to use an adjacency matrix directly in numarray, I should think. (I haven't implemented many graph algorithms with ufuncs yet, but it seems doable...) For sparse graphs I guess some sort of sparse array implementation would be useful, although the archives indicate that creating such a thing isn't a core part of the numarray project. What do you think -- is it reasonable to use numarray for graph algorithms? Perhaps an additional module with standard graph algorithms would be interesting? (I'm sure I could contribute some if there is any interest...) And -- is there any chance of getting sparse matrices in numarray? -- Magnus Lie Hetland The Anygui Project http://hetland.org http://anygui.org |
From: Paul F D. <pa...@pf...> - 2002-04-17 14:14:12
|
You need to link with the Python library. I suggest you learn to use distutils and then it will load for you correctly on both platforms. The file "setup.py" in the Numeric source distribution is a good if complicated example. Some of the setup.py files in the Packages area are simpler and easier to understand.

-----Original Message-----
From: num...@li... [mailto:num...@li...] On Behalf Of mekkaoui
Sent: Tuesday, April 16, 2002 11:09 AM
To: num...@li...
Subject: [Numpy-discussion] Extension under windows

Dear Numerical Python Users,

I have written an extension using GSL (Gnu Scientific Library) and Numerical Python. This extension works fine under Linux and I would like to do the same under Windows. For that I use Cygwin. When I try to create the module

    $ gcc -shared Example.o -o Example.pyd

I receive this message:

    Example.o<.text+0x58>:Example.c: undefined reference to 'PyArg_ParseTuple'
    Example.o<.text+0x15e>:Example.c: undefined reference to 'Py_BuildValue'
    Example.o<.text+0x1b1>:Example.c: undefined reference to 'Py_InitModule4'
    Example.o<.text+0x1c1>:Example.c: undefined reference to 'PyImport_ImportModule'
    Example.o<.text+0x1db>:Example.c: undefined reference to 'PyModule_GetDict'
    Example.o<.text+0x1f4>:Example.c: undefined reference to 'PyDict_GetItemString'
    Example.o<.text+0x206>:Example.c: undefined reference to 'PyCObject_Type'
    Example.o<.text+0x214>:Example.c: undefined reference to 'PyCObject_AsVoidPtr'

Perhaps this command is wrong. Perhaps someone could explain, or point me to a document that explains the procedure clearly?

Thanks in advance for your help,
Omar |
From: Perry G. <pe...@st...> - 2002-04-17 00:51:39
|
After Scott's last display of his powers of persuasion, I lack for a meaningful response. It seems appropriate to declare this thread closed. Besides, I've got to go change some diapers ;-) Perry |
From: Scott G. <xs...@ya...> - 2002-04-16 23:37:02
|
--- Perry Greenfield <pe...@st...> wrote:
>
> If one had an NDArray that happened to contain a type that numarray
> supported, yes it is possible (in fact RecArray does that sort of thing).
>
> If your point is that in doing so one must use the private attributes
> such as _strides, yes that is true.
>

My point was simply:

  = One *can* convert from (NDArray + typecode) to a full NumArray
  = You *do* already convert lists, tuples, ... to NumArrays in ufuncs
  = So you *could* convert (NDArrays + typecode) to NumArrays in ufuncs in the same place that checks to see if it is a list, tuple, ...

Therefore:

  = You possibly *could* standardize the attributes in an NDArray (buffer, typecode, shape, stride, offset, ...)
  = If you *did* standardize the attributes, then others *could* build UserDefinedNDArrays however they see fit and they would work with NumArrays

However I get the sense that the numarray module is your baby, and you don't want to change him too much. That's very understandable, you're a proud parent. Truth be told, he's a good looking kid, and I look forward to hanging out with him when he's all grown up. We just have a little different view on parenting, and I was hoping my kid would have an easier time playing with yours.

Now that I've beaten that silly metaphor to death... :-)

Cheers,
-Scott

ps: It occurs to me, with the strong sense of encapsulation you desire, that I could have presented this better as requesting that you specify a set of standard *methods* instead of attributes. Something like:

    def __array_getbuffer__(self):
    def __array_getoffset__(self):
    def __array_getshape__(self):
    def __array_getstrides__(self):
    def __array_getitemsize__(self):
    def __array_gettypecode__(self):
    def __array_getendian__(self):
    # Who knows what the real list would consist of...
    # We never got to discuss what a really general
    # purpose description of an NDArray would require...

Then anything which implemented those standard *methods* would be a viable NDArray. From my point of view it amounts to about the same thing, but I think it's a better design and that you might like this idea more.

However I'm getting out of breath on this topic, and I have other things I need to do (I'm sure this is true for you too), so if you don't see any merit in this idea, I won't push for it any further.

Cheers again. |
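To make the method-based proposal above more tangible, here is a small sketch of a class that implements the listed accessors and a duck-typing check that a consumer could run. The method names come from the message; everything else (the `ScottArray` class, the row-major stride calculation, `is_ndarray_like`) is an assumption made for illustration, and none of it is an existing numarray interface.

```python
import sys


class ScottArray:
    """A user-defined array exposing the proposed accessor methods."""

    def __init__(self, buffer, shape, typecode, itemsize):
        self._buffer = buffer
        self._shape = shape
        self._typecode = typecode
        self._itemsize = itemsize

    def __array_getbuffer__(self):
        return self._buffer

    def __array_getoffset__(self):
        return 0

    def __array_getshape__(self):
        return self._shape

    def __array_getstrides__(self):
        # Row-major (C order) strides in bytes, derived from the shape.
        strides, step = [], self._itemsize
        for extent in reversed(self._shape):
            strides.insert(0, step)
            step *= extent
        return tuple(strides)

    def __array_getitemsize__(self):
        return self._itemsize

    def __array_gettypecode__(self):
        return self._typecode

    def __array_getendian__(self):
        return sys.byteorder


REQUIRED = ("__array_getbuffer__", "__array_getoffset__",
            "__array_getshape__", "__array_getstrides__",
            "__array_getitemsize__", "__array_gettypecode__",
            "__array_getendian__")


def is_ndarray_like(obj):
    """Duck-type check: does obj implement every accessor in REQUIRED?"""
    return all(callable(getattr(obj, name, None)) for name in REQUIRED)


arr = ScottArray(bytearray(2 * 3 * 4), (2, 3), "Int32", 4)
```

A consumer would call `is_ndarray_like(arr)` in the same place it already checks for lists and tuples, then read the buffer, shape and strides through the accessors instead of private attributes.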
From: mekkaoui <oma...@ec...> - 2002-04-16 18:03:55
|
Dear Numerical Python Users,

I have written an extension using GSL (Gnu Scientific Library) and Numerical Python. This extension works fine under Linux and I would like to do the same under Windows. For that I use Cygwin. When I try to create the module

    $ gcc -shared Example.o -o Example.pyd

I receive this message:

    Example.o<.text+0x58>:Example.c: undefined reference to 'PyArg_ParseTuple'
    Example.o<.text+0x15e>:Example.c: undefined reference to 'Py_BuildValue'
    Example.o<.text+0x1b1>:Example.c: undefined reference to 'Py_InitModule4'
    Example.o<.text+0x1c1>:Example.c: undefined reference to 'PyImport_ImportModule'
    Example.o<.text+0x1db>:Example.c: undefined reference to 'PyModule_GetDict'
    Example.o<.text+0x1f4>:Example.c: undefined reference to 'PyDict_GetItemString'
    Example.o<.text+0x206>:Example.c: undefined reference to 'PyCObject_Type'
    Example.o<.text+0x214>:Example.c: undefined reference to 'PyCObject_AsVoidPtr'

Perhaps this command is wrong. Perhaps someone could explain, or point me to a document that explains the procedure clearly?

Thanks in advance for your help,
Omar |
From: Perry G. <pe...@st...> - 2002-04-16 15:14:16
|
> > > Important Question: If an NDArray had a typecode (and it was a known
> > > string), is it possible to promote it to one of the standard NumArray
> > > types?
> > >
> >
> > I think we want to avoid NDArray having any type attribute (Some types
> > have subtypes and then the issue gets really messy). We leave it
> > to the subclass to address how types will be handled.
> >
>
> Ok that's what you're currently doing, but let me rephrase the question. :-)
>
> Given a "leaf type" -- something that is really well specified and very
> similar on all modern platforms:
>
>     "Int32" - not just an arbitrary "Int"
>     "Float64" - not just an arbitrary "Float"
>
> Do you think you could write a general purpose _function_ that converted an
> "NDArray" to a full featured "NumArray"? I know this would be in Python,
> but let's pretend it's a C++ prototype to make the types clear:
>
>     NumArray NDArray_to_NumArray(NDArray nda, String typecode, Endian end) {
>         if (WellKnownLeafTypecodeString(typecode)) {
>
>             /* fill in the blanks here */
>
>             return NumArray(result);
>         }
>
>         throw "conversion really is impossible";
>     }
>

I'm not sure I understand exactly what you are trying to do here, but I'll try to address the question as best I can.

If one had an NDArray that happened to contain a type that numarray supported, yes it is possible (in fact RecArray does that sort of thing).

If your point is that in doing so one must use the private attributes such as _strides, yes that is true. These attributes are private in the sense that users of instances of these objects should never have cause to access them. But it does not mean that classes that subclass NDArray, or any of its subclasses, should not access them. They are not private in the sense of the class family (one reason we didn't use __strides, since that mechanism is not usable (easily anyway) for subclasses). In that sense, the attributes form an interface within the class family. Some class extenders may need to access them, sure.

Perry |
From: Scott G. <xs...@ya...> - 2002-04-15 22:32:24
|
Hi Perry.

Well, I don't think I've made any progress convincing you that standardizing what it means to be an interoperable "NDArray" would be good for me or others in the community, but I do appreciate you letting me try.

I'll take your suggestion and make my C-API understand a superset of array types. I'll wait to see how the tonumarray() thing pans out. That might meet all of my practical concerns even if I don't think it is as elegant a solution as defining a strong interface.

I'll just respond to the one point below. If I had to sum up my argument for why I think separate array implementations could (should) be compatible, it is buried in the answer to this question.

> >
> > Important Question: If an NDArray had a typecode (and it was a known
> > string), is it possible to promote it to one of the standard NumArray
> > types?
> >
>
> I think we want to avoid NDArray having any type attribute (Some types
> have subtypes and then the issue gets really messy). We leave it
> to the subclass to address how types will be handled.
>

Ok that's what you're currently doing, but let me rephrase the question. :-)

Given a "leaf type" -- something that is really well specified and very similar on all modern platforms:

    "Int32" - not just an arbitrary "Int"
    "Float64" - not just an arbitrary "Float"

Do you think you could write a general purpose _function_ that converted an "NDArray" to a full featured "NumArray"? I know this would be in Python, but let's pretend it's a C++ prototype to make the types clear:

    NumArray NDArray_to_NumArray(NDArray nda, String typecode, Endian end) {
        if (WellKnownLeafTypecodeString(typecode)) {

            /* fill in the blanks here */

            return NumArray(result);
        }

        throw "conversion really is impossible";
    }

Cheers and thanks again for your time,
-Scott Gilbert |
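The `tonumarray()` idea referred to above (an array package checks its inputs for a conversion method and calls it if present) can be sketched in a few lines of Python. This is a hypothetical illustration only: the method name was explicitly described in the thread as a placeholder, and `ComplexInt16Array`, `as_standard` and `add` are invented here, with plain lists of Python complex numbers standing in for real arrays.

```python
class ComplexInt16Array:
    """Hypothetical container of (real, imag) integer pairs from hardware."""

    def __init__(self, pairs):
        self.pairs = list(pairs)

    def tonumarray(self):
        # Convert to a standard representation: here, Python complex floats.
        return [complex(re, im) for re, im in self.pairs]


def as_standard(obj):
    """What an array package could do on input: honour the hook if present."""
    hook = getattr(obj, "tonumarray", None)
    if callable(hook):
        return hook()
    return [complex(v) for v in obj]   # plain sequences convert directly


def add(x, y):
    """A toy 'ufunc' that accepts any mix of convertible inputs."""
    return [p + q for p, q in zip(as_standard(x), as_standard(y))]


samples = ComplexInt16Array([(1, 2), (3, -4)])
total = add(samples, [1.5, 2])
```

The foreign array type only has to supply one method, and the conversion happens implicitly wherever the package already normalises its arguments.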
From: Perry G. <pe...@st...> - 2002-04-15 21:14:16
|
Hi Scott, I'm not going to respond to all points but mainly concentrate on the last section. > > > Important Question: If an NDArray had a typecode (and it was a known > string), is it possible to promote it to one of the standard NumArray > types? > I think we want to avoid NDArray having any type attribute (Some types have subtypes and then the issue gets really messy). We leave it to the subclass to address how types will be handled. > Here goes (somewhat hypothetical, but close to the boat I'm currently in): > > Jon is our FPGA guy who makes screaming fast core files, but our FPGAs > don't do floating point. So I have to provide his driver with > ComplexInt16 > data. > > Jon and I write an extension module that calls his driver and reads data. > We also write a C routine (call it "munge") that takes both ComplexInt16 > data, and ComplexFloat64 data. We try it out for testing, and pass in my > arrays in both places. We could have used Numarray for the > ComplexFloat64, > but that meant we had to use two array packages, and use two C-APIs in our > extension. All we needed was a pointer to an array of doubles, > so we stuck > with mine. > > Ok, that part of development is done. Now we present it to the > application > developers. Their happy and we're rolling. Successful application. > > Another group find out about this and they want to use it. They're using > numarray for a large part of their application. In fact, their > calculating > the ComplexFloat64 half the data that they want to pass to my "munge" > routine using numarray, and they still need to use my ComplexInt32 data to > read the FPGA. > > They're going to be disappointed to find out my extension can't read > numarray data, and that they have to convert back and forth between the > two. And as the list of routines grow, they have to keep track of whether > it is a numarray-routine, or a scottarray-routine. 
> It's not so bad for one simple "munge" function, but there are going to be hundreds of functions...
>
> I don't expect you to have much sympathy for my having to convert data back and forth between my array types and yours, but it is an avoidable problem.
>
> For the most part, we both agree on what parts an NDArray should have. If we could only agree what to name them, and that we'd stick to those names, that would be a large part of it for me.

I'm not sure I understand the problem in all the details I need to. I'll restate it as best as I understand it and you can tell me if I understood incorrectly.

You have extension modules that get complex int data from hardware. Other processing may be done to the complex int data in that format so it doesn't make sense to convert it to a more standard format when reading it in. You have C extensions that carry out certain tasks on complex data (in either complex int format or complex floats). You have users that would like to use your routine with numarray. (I haven't seen any specific mention of the need for ufuncs on complex ints so I'll assume you just need complex int arrays as containers for C programs to use.)

[If you did need to perform ufuncs on complex ints, then extending numarray locally to handle them would be one possibility, but a little involved at the moment (a little easier later when we reimplement complex). Then again, maybe not: the complex stuff is currently subclassed from numarray and not that hard to adapt to ints I think, but it isn't that well done now.]

I guess my initial reaction is that you should develop a front-end C-API that handles obtaining data buffers from different sources. You get to define what kinds of things it supports, and it confines changes to the list of types you support, and any dependencies on our or anyone else's API, to a small section of code.
From what I'm hearing, you don't need it to provide much (pointer to arrays and associated information). If we are real bozos and change the interface, it doesn't hurt you much (not that we intend to be bozos or change the C-API willy nilly :-)

To elaborate, you define your equivalent of our getNumInfo routine. I don't think I've seen anything that requires explicit dependencies on Python attributes. Sure, you could use the same attribute names and use Python calls to get those just as our getNumInfo routine does, but I think that is bad practice. You may find some other representation for arrays out there that doesn't fit this model and you may want to work with those also and you won't be able to get them to adopt our scheme.

You say that you don't want your users to have to convert between the two data representations. If they are using your C extensions that is understandable, and avoidable since you've written your programs to deal with the various types. On the other hand, unless you extend numarray, numarray clearly cannot deal with the complex ints so conversion is necessary. But understandably, you would like to eliminate the need for explicit conversions.

I think there is an easy way of dealing with this. We haven't implemented this capability yet but we've been talking about having numarray check input values to see if they have a method "tonumarray" [not that we would choose that particular method name, I'm just illustrating the point]. If that method did exist, it would be called to create a numarray from the object. Thus you could add such a method to your class and when it is used in numarray ufuncs or in binary operations with numarray objects, your complex ints are automatically converted to numarray objects (presumably a complex float of some precision). Adding this capability to numarray should be pretty easy.

True, the solution that I proposed doesn't protect you from making any changes ever.
But we believe we are at a stage in the project where it is dangerous to lock ourselves into lower level details such as the internal description of the array. We still have things to implement and that may cause us to realize that some changes are needed.

Our C-API stuff is relatively new. It may see changes in the near future, but likely not many related to what you need. And we intend to shield the C-API from changes in the Python attributes. We could change the name or contents of _byteswap and it would not change anything in the C-API. I see premature coupling of low level implementation details as a bad thing, not a good thing. Any changes that are made to the API require changes only to the corresponding routine in your C-API, and all your C applications are shielded from any changes (save rebuilding).

If I've misunderstood your examples, please let me know.

Perry |
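The conversion hook described in this message can be sketched in a few lines of Python. This is only an illustration of the idea, not numarray's implementation (the message itself says the capability was not implemented yet and that the method name was a placeholder). The `ComplexInt16Array` class, the `as_native` helper, and the use of plain lists of Python complex numbers as the "native" form are all assumptions.

```python
class ComplexInt16Array:
    """Hypothetical third-party container holding (real, imag) int pairs."""

    def __init__(self, pairs):
        self.pairs = list(pairs)

    def tonumarray(self):
        # Convert to a form the host package understands; plain Python
        # complex numbers stand in for a complex float array here.
        return [complex(re, im) for re, im in self.pairs]


def as_native(obj):
    """Return obj in native form, calling the conversion hook if present."""
    hook = getattr(obj, "tonumarray", None)
    if callable(hook):
        return hook()
    return obj


def add(a, b):
    """Element-wise add accepting native lists or hook-providing objects."""
    a, b = as_native(a), as_native(b)
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return [x + y for x, y in zip(a, b)]


result = add(ComplexInt16Array([(1, 2), (3, 4)]), [0.5, 0.5])
```

The host routine never needs to know the foreign class; it only looks for the agreed method name, which is the low coupling the message argues for.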
From: Todd M. <jm...@st...> - 2002-04-15 18:18:20
|
Numarray 0.3.1 and 0.3.2
------------------------

Numarray is an array processing package designed to efficiently manipulate large multi-dimensional arrays. Numarray is modelled after Numeric and features C code generated from Python template scripts, the capacity to operate directly on arrays in files, and improved type promotions.

Numarray-0.3.1 incorporates a number of bug fixes and enhancements to the C-API, including a minimal Numeric emulation layer which makes it easy to port simple Numeric C-extensions to numarray. The emulation layer is incomplete, so not all Numeric extensions will work, but simple ones *do* with a minimal amount of effort. See Doc/numpy_compat for an example of convolution done using the emulation layer.

New for Numarray-0.3.1 is the Numarray manual in PDF and HTML formats; other formats are available for users of the source distribution.

Numarray-0.3.2 is a source only release to support Alpha/Tru64. It is essentially Numarray-0.3.1 + one portability bug fix.

WHERE
-----

Numarray-0.3.1 Windows executable installers and source code tar ball are here:

http://sourceforge.net/project/showfiles.php?group_id=1369

Numarray is hosted by Source Forge in the same project which hosts Numeric:

http://sourceforge.net/projects/numpy/

The web page for Numarray information is at:

http://stsdas.stsci.edu/numarray/index.html

Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at:

http://sourceforge.net/tracker/?group_id=1369

REQUIREMENTS
------------

numarray-0.3.1 requires Python 2.0 or greater.

AUTHORS, LICENSE
----------------

Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, Phil Hodge at the Space Telescope Science Institute. Thanks go to Jochen Kupper of the University of North Carolina for his work on Numarray and for porting the Numarray manual to TeX format.
Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details. -- Todd Miller jm...@st... |
From: Scott G. <xs...@ya...> - 2002-04-15 04:09:24
|
--- Perry Greenfield <pe...@st...> wrote: *** Just skim through my first few responses. About half way through writing this letter, a few things hit me. I still want to propose some changes, but I don't think you'll find them as intrusive... > > > > > Then anyone who implemented these could work with the same C API for > > getting the pointer to memory, shape array, stride array, and item > > size. > > > Then you are talking about standardizing a C-API. But I'm still > confused. If you write a class that implements these attributes, > is it your C-API that uses them, or do you mean our C-API uses > them? > I'm not really talking about standardizing a C-API. I'm talking about standardizing what that C-API would have to do. You would have your C-API as part of numarray proper. And, for the short term, I would have my own C-API as part of what I need to get done. Both C-API's would use the same attributes. Why do I want my own C-API today? Because numarray isn't done yet, and I can't create arrays of the types I need. I'll need a C-API to get at my types. It would be great if the same C-API could get at yours too. > > If you have your own C-API, then the attributes are not > relevant as an interface. If you intend to use our C-API to access > your objects, then they are. > Either C-API could access anything that looks like an NDArray. > > > > > Because truthfully arrays are little more than a pointer to memory. > > > > That's like asking "why in the world would we presume memcpy() or > > qsort() would know what to do with your memory?" > > > > Then you misunderstand Numarray. Numarrays are far more than just > a pointer to memory. You can get a pointer to memory from them, > but they entail much more than that. Numarray presumes that certain > things are possible with NumArray objects (like standard math > operations). If you want something that doesn't make such an > assumption, you should be using NDArray instead. 
NDArray makes > no presumptions about the contents of the memory other than > they are arranged in memory in array fashion. > I think I understand where you're coming from now. (BTW, I think some of our confusion comes from when I'm talking about "Numarray" or "numarray" the package versus "NumArray" and "NDArray" the classes.) *** Ok, I think there is light at the end of this tunnel... I guess what I've been arguing for all along is something a lot like an NDArray where I can specify the typecode (and possibly other things like 'endian' etc...), and that only NDArrays have a minimal set of standardized attributes. With this I can create extensions that will work with anything that looks like an NDArray. Your NDArrays from the numarray package, and my NDArrays of crazy types. I'm still left in the position of having to upcast an NDArray to a full blown NumArray if I ever want to use my NDArrays in a routine meant solely for NumArrays. However this conversion isn't difficult, and I think can do that when needed. Important Question: If an NDArray had a typecode (and it was a known string), is it possible to promote it to one of the standard NumArray types? Lesser Question: If an NDArray had a known typecode, is it desirable for numarray routines to promote the NDArray to a NumArray in the same way that the routines promote a Python list or tuple to a NumArray on the fly? Ok, my new proposal (again, treat it like a suggestion): - Do you think it would be possible to standardize the set of attributes that it requires to be an NDArray? NDArrays are simple and unlikely to change. I think _those_ really are just pointers to memory with array accounting information. We could agree on what exactly constitutes an NDArray. - Could this standard set of attributes optionally include the names for the typecode, endian, (and maybe some other) attributes? That doesn't mean that your NDArrays would have to have the typecode, endian or whatever information. 
It just means that when any class does add a typecode, it adds it as a specially named attribute. I realize that a large part of what I want is interoperability between separate implementations of NDArrays. Anything that has (_data, _shape, _itemsize, _type) is something I could work with in an extension. Some other fields are optional (_strides, _byteoffset) because they have sensible defaults that can be calculated from above in the common case. So the only difference between what you currently have and most of what I'm proposing is that the names of NDArray attributes become standardized. > > If you are presenting numarray with a type it already knows about, > why aren't you subclassing it? > Since I know I'll have to create types that numarray doesn't know about, I know I'm going to have to write a new array class (it's already written). It would be silly of my new array class to not implement the standard types just because numarray _does_ know about them. I now realize that I don't have to give my class to numarray directly. That didn't hit me before. I could promote/upcast it when necessary. The upcast-in and downcast-out thing will add up to extra work and messier code, but it is a workaround. > > If you present numarray an object > with a type it doesn't know about, then that is pointless. > Types and numarray are inextricably intertwined, and shall > remain so. > Understood. I don't want to ruin your NumArrays. > > ********************************************************** > > What I want to see is a specific example. I'm not going to > pay much attention to generalities because I'm still unclear > about how you intend to do what you say you will do. Perhaps > I'm slow, but I still don't get it. > Nope, clearly it was me that was being slow. There is still that bit about NDArrays that I'm trying to justify, so my example is below. > > (or alternatively, > create a numarray object that uses the same buffer yours does). > You're right. 
This hadn't occurred to me until just a little bit ago.

> E.g., "I want complex ints and I will develop a class that will use this to do the following things [it doesn't have to be exhaustive or complete, but include just enough to illustrate the point]. If the attributes were standardized then I would do this and that, and use it with your stuff like this showing you the code (and the behavior I expect)."

Here goes (somewhat hypothetical, but close to the boat I'm currently in):

Jon is our FPGA guy who makes screaming fast core files, but our FPGAs don't do floating point. So I have to provide his driver with ComplexInt16 data.

Jon and I write an extension module that calls his driver and reads data. We also write a C routine (call it "munge") that takes both ComplexInt16 data, and ComplexFloat64 data. We try it out for testing, and pass in my arrays in both places. We could have used Numarray for the ComplexFloat64, but that meant we had to use two array packages, and use two C-APIs in our extension. All we needed was a pointer to an array of doubles, so we stuck with mine.

Ok, that part of development is done. Now we present it to the application developers. They're happy and we're rolling. Successful application.

Another group finds out about this and they want to use it. They're using numarray for a large part of their application. In fact, they're calculating the ComplexFloat64 half of the data that they want to pass to my "munge" routine using numarray, and they still need to use my ComplexInt32 data to read the FPGA.

They're going to be disappointed to find out my extension can't read numarray data, and that they have to convert back and forth between the two. And as the list of routines grows, they have to keep track of whether it is a numarray-routine, or a scottarray-routine.

It's not so bad for one simple "munge" function, but there are going to be hundreds of functions...
I don't expect you to have much sympathy for my having to convert data back and forth between my array types and yours, but it is an avoidable problem.

For the most part, we both agree on what parts an NDArray should have. If we could only agree what to name them, and that we'd stick to those names, that would be a large part of it for me.

> Given this I can either show you an alternate solution or I can realize why you are right and we can discuss where to go from there. Otherwise you are wasting your time.

Cheers,

-Scott |
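To make concrete what an extension could do with nothing more than the attributes named in this message (`_data`, `_shape`, `_itemsize`, `_type`, plus the optional `_strides` and `_byteoffset`), here is a small Python sketch that locates and decodes a single element. The `Plain` class, the `read_item` function, and the `FORMATS` table mapping type names to `struct` codes are hypothetical; only the attribute names come from the discussion.

```python
import struct

# Assumed mapping from type names to little-endian struct codes.
FORMATS = {"Int16": "<h", "Int32": "<i", "Float64": "<d"}


class Plain:
    """Hypothetical object exposing only the four required attributes."""

    def __init__(self, data, shape, itemsize, type_name):
        self._data = data
        self._shape = shape
        self._itemsize = itemsize
        self._type = type_name


def read_item(obj, index):
    """Fetch one element by computing its byte position in obj._data."""
    shape, itemsize = obj._shape, obj._itemsize
    strides = getattr(obj, "_strides", None)
    if strides is None:
        # Common case: contiguous row-major data, so derive the strides.
        strides, step = [], itemsize
        for extent in reversed(shape):
            strides.insert(0, step)
            step *= extent
    position = getattr(obj, "_byteoffset", 0)
    for i, extent, stride in zip(index, shape, strides):
        if not 0 <= i < extent:
            raise IndexError("index out of range")
        position += i * stride
    return struct.unpack_from(FORMATS[obj._type], obj._data, position)[0]


data = struct.pack("<6i", 10, 11, 12, 20, 21, 22)  # a 2x3 Int32 array
arr = Plain(data, (2, 3), 4, "Int32")
value = read_item(arr, (1, 2))
```

Any class that carries the same attribute names could be passed to `read_item` without sharing a base class, which is the interoperability being asked for.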
From: Perry G. <pe...@st...> - 2002-04-14 18:54:05
|
Hi Scott, Just to be to the point, I'm still missing what I've been asking for, to wit a concrete example that illustrates your point. I'll try to address a few of your points that appear to try to answer that and clarify what I mean by concrete example. > > Here's what I'm proposing, and it's only a suggestion. > > > *** I think the requirements for being a general purpose "NDArray" > can be specified with only the following attributes: > > __array_buffer__ - as buffer object > __array_shape__ - as tuple of long > __array_itemsize__ - as int > > Optionally > __array_stride__ - as tuple of long (get from shape if None) > __array_offset__ - as int (would default to 0 if not present) > > Then anyone who implemented these could work with the same C API for > getting the pointer to memory, shape array, stride array, and item size. > Then you are talking about standardizing a C-API. But I'm still confused. If you write a class that implements these attributes, is it your C-API that uses them, or do you mean our C-API uses them? If you have your own C-API, then the attributes are not relevant as an interface. If you intend to use our C-API to access your objects, then they are. But if you want to use our C-API, that still doesn't explain why the alternatives aren't acceptable (namely subclassing). > > Because truthfully arrays are little more than a pointer to memory. > > That's like asking "why in the world would we presume memcpy() or > qsort() would know what to do with your memory?" > Then you misunderstand Numarray. Numarrays are far more than just a pointer to memory. You can get a pointer to memory from them, but they entail much more than that. Numarray presumes that certain things are possible with NumArray objects (like standard math operations). If you want something that doesn't make such an assumption, you should be using NDArray instead. NDArray makes no presumptions about the contents of the memory other than they are arranged in memory in array fashion. 
> > You haven't provided any example (let alone a compelling one) of why we should accept any object that provides those attributes.
>
> Well, the UFuncs certainly should reject any object that they don't know how to handle. I'm currently only addressing what it takes to be an NDArray/NumArray object. OTOH, if I can present something to the UFuncs that looks like a known array type, why wouldn't UFuncs want to work with it?

If you are presenting numarray with a type it already knows about, why aren't you subclassing it? If you present numarray an object with a type it doesn't know about, then that is pointless. Types and numarray are inextricably intertwined, and shall remain so.

> - Allows me personally to distribute a separate (and simpler) implementation of NDArrays/NumArrays right now and have the same data objects work with yours when you're all done. If I give the UFuncs a pointer to memory, and the attributes above, why shouldn't it work correctly?
>
> Am I doing any better? I am trying.

Not really. More on that later.

> Is there a way, today, without modifying numarray, for me to use numarray as a holder for these esoteric data types? Is that way difficult? Could it be easier?

No to the first, it isn't intended to serve that purpose. If you just need something to blindly hold values without doing anything with them use NDArray (and you can add whatever customization you wish regarding what methods or operators are available).

> I'm not asking numarray to know about my types in its core baseline. I'm wondering what it takes to implement new types at all.

It's possible to extend (but not in any way that makes it automatically usable with anyone else's extension). Currently that sort of extension would not be hard for someone that knows how things work. We haven't documented how to do so, and won't for a while. It's not a high priority for us now.
**********************************************************

What I want to see is a specific example. I'm not going to pay much attention to generalities because I'm still unclear about how you intend to do what you say you will do. Perhaps I'm slow, but I still don't get it.

On the one hand, you ask us to have numarray accept objects with the same 'interface'. Well, if they are not of an existing supported type, that's pointless since numarray won't work properly with them. If it is an existing type, you haven't explained why you can't use numarray directly (or alternatively, create a numarray object that uses the same buffer yours does). I still haven't seen a specific example that illustrates why you cannot use subclassing or an instance of a numarray object instead.

If you need to add a new type that's possible but you'll have to spend some time figuring out how to do that for your own extended version. If you just want to use arrays to hold values (of new types), then use NDArray. It doesn't care about types.

But please give a specific case. E.g., "I want complex ints and I will develop a class that will use this to do the following things [it doesn't have to be exhaustive or complete, but include just enough to illustrate the point]. If the attributes were standardized then I would do this and that, and use it with your stuff like this showing you the code (and the behavior I expect)."

Given this I can either show you an alternate solution or I can realize why you are right and we can discuss where to go from there. Otherwise you are wasting your time.

Perry |
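The alternative mentioned in this message, a second array object that uses the same buffer as the first, can be illustrated with a short Python sketch. It uses the standard library's `array.array` and `memoryview` as stand-ins; it does not use numarray, and the variable names are made up for the example.

```python
import array

# Stands in for a custom array class that owns the memory.
mine = array.array("d", [1.0, 2.0, 3.0])

# A second object over the same memory: no data is copied.
view = memoryview(mine)
view[1] = 20.0  # a write through the view is visible to the owner

# The same memory can also be looked at as raw bytes.
as_bytes = view.cast("B")
```

Because both objects refer to one block of memory, no conversion back and forth is needed; the cost is that the two must agree on the layout of that block.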
From: Scott G. <xs...@ya...> - 2002-04-14 11:19:12
|
Perry,

I've been trying to be persuasive, but I think all I've managed to do is to be verbose and annoy you. Please accept my apologies. I really am sorry this is going as poorly as it is. I'm doing a lousy job of getting my point across, and I'd like to turn around the tone this has taken. Email always comes off as more antagonistic than intended.

Finally, my appeal to the fact that you are proposing a standard was heavy handed. I guess I was trying to use that to force you to consider my position. It clearly backfired...

I'll try to be more to the point. Here's what I'm proposing, and it's only a suggestion.

*** I think the requirements for being a general purpose "NDArray" can be specified with only the following attributes:

    __array_buffer__   - as buffer object
    __array_shape__    - as tuple of long
    __array_itemsize__ - as int

    Optionally
    __array_stride__   - as tuple of long (get from shape if None)
    __array_offset__   - as int (would default to 0 if not present)

Then anyone who implemented these could work with the same C API for getting the pointer to memory, shape array, stride array, and item size.

The set of operations on a pure "NDArray" is probably pretty minimal (reshape, transpose/rotate, index arrays?). So in order to create a full featured "NumArray", a few more attributes are required:

    __array_itemtype__ - as string?

    Optionally
    __array_endian__   - as 1 char string? (default to the native endian)

This brings the total up to 4 required attributes, and 3 optional ones for a very general purpose array data structure. (I can think of other optional ones, but skip that for now.)

> All in all you are talking about checking quite a few attributes to make sure the object has the interface. And even if it does, *why* in the world would we presume that the C functions used by numarray would work properly with the object you provide.

Because truthfully arrays are little more than a pointer to memory.
That's like asking "why in the world would we presume memcpy() or qsort() would know what to do with your memory?"

> You haven't provided any example (let alone a compelling one) of why we should accept any object that provides those attributes.

Well, the UFuncs certainly should reject any object that they don't know how to handle. I'm currently only addressing what it takes to be an NDArray/NumArray object. OTOH, if I can present something to the UFuncs that looks like a known array type, why wouldn't UFuncs want to work with it?

Ok, so what does this buy you? Well, it probably doesn't buy you personally very much. Your needs are already being met by the current implementation.

Ok, so what does this cost you? A few translations:

    _data     -> __array_buffer__
    _shape    -> __array_shape__
    _strides  -> __array_stride__
    _itemsize -> __array_itemsize__
    _offset   -> __array_offset__
    _type     -> __array_type__
    _byteswap -> __array_endian__

This isn't a style criticism. I'm not just asking you to change your names, I'm asking to promote the names to be a "standard interface" much like these things are in many places in Python.

It also requires some small changes to getNDInfo() and getNumInfo() so that they can calculate the derived fields (contiguous, aligned, etc...).

It also requires some changes to your scripts so that they check for the interface rather than the inheritance.

What are the benefits to anyone else?

- Describes how anyone could implement something that looks and acts like NDArrays or NumArrays. There are probably a lot of reasons to want to do this. I have some reasons that I don't think you value too much. I think others would have reasons which I can't imagine too.

- Allows one standard API for getting at the basics of NDArrays/NumArrays

- Allows anyone to easily implement other data types for NumArrays. The typecode won't match any of your builtin types, but maybe other third parties could agree on other typecodes for their crazy needs and share modules.
- Allows me personally to distribute a separate (and simpler) implementation of NDArrays/NumArrays right now and have the same data objects work with yours when you're all done. If I give the UFuncs a pointer to memory, and the attributes above, why shouldn't it work correctly?

> We're not going to budge until you show us what the hell you are talking about.

Am I doing any better? I am trying.

> You are right on complex ints (that we won't consider them). One could take numarray and add them if one wanted and have a more extended version. But we won't do it, and we wouldn't support it as being in what we maintain. It's one of those trade offs.

Is there a way, today, without modifying numarray, for me to use numarray as a holder for these esoteric data types? Is that way difficult? Could it be easier?

I'm not asking numarray to know about my types in its core baseline. I'm wondering what it takes to implement new types at all.

> Your example shows nothing about what your real needs for the object are.

My real needs are all over the place. Some of which you've shown me are solvable with the current implementation of numarray. Some of which you've not addressed or said you won't address. To be explicit:

Here are (at least most of) my _needs_ for array objects:

- support a wide variety of data types (user defined)
- have efficient storage
- support the pickle interface for serialization
- allow alternate sources of underlying memory
- have an easy interface for accessing the pieces necessary to create C extensions (buffer, shape, stride, ...)
- completed and reliable in the near term

Here are (at least some of) my _wants_ for array objects:

- cooperate on some level with other standard array modules (once the standard is set)
- have same API for accessing the pieces (buffer, shape, stride, ...) as all standard array modules will.
- implementation in pure Python so that building extension modules is not required until the fast operations present in those modules are required.
- implemented from a standard that is as good as it can be

Here are (at least some of) my _whims_ for array objects:

- has "windowing" functionality to work efficiently with really large files (on any modern platform).
- alternate implementations for things such as "slicing behaviour" (copy on write, reference).

Loosely following your design, I've already written a module that meets my "needs", I was hoping that we could cooperate towards filling in some of my "wants" (cooperating array modules), and I've brought up my "whims" because I thought they were interesting possibilities for discussion.

I was going to respond to some of your other remarks, but I've probably wasted enough of your time. If you don't respond to this message, I'll take that as a sign that we just aren't going to see eye to eye on any of this, and I won't bother you any more. (I'll be half surprised if you even get this message. From the tone of your last one, I wouldn't be shocked to find out you've already added me to your killfile. :-)

No hard feelings,

-Scott Gilbert |
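A rough Python sketch of how a consumer might read the attribute set proposed in this message, filling in the optional values when they are absent. The attribute names are the ones from the proposal; the `array_info` and `default_strides` functions and the `ScottArray` class are hypothetical, and the row-major default for strides is an assumption about what "get from shape if None" would mean.

```python
def default_strides(shape, itemsize):
    """Byte strides for a C-contiguous (row-major) layout."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def array_info(obj):
    """Collect the proposed attributes, supplying the optional ones."""
    try:
        buf = obj.__array_buffer__
        shape = tuple(obj.__array_shape__)
        itemsize = obj.__array_itemsize__
    except AttributeError:
        raise TypeError("object does not provide the array attributes")
    strides = getattr(obj, "__array_stride__", None)
    if strides is None:
        strides = default_strides(shape, itemsize)
    offset = getattr(obj, "__array_offset__", 0)
    return buf, shape, tuple(strides), offset, itemsize


class ScottArray:
    """Hypothetical minimal implementer: a 3x4 array of 8-byte items."""
    __array_buffer__ = bytearray(3 * 4 * 8)
    __array_shape__ = (3, 4)
    __array_itemsize__ = 8


info = array_info(ScottArray())
```

The check is for the presence of the attributes rather than for a particular base class, which matches the "interface rather than inheritance" request made earlier in the message.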
From: Paul F D. <pa...@pf...> - 2002-04-14 02:34:19
|
I haven't been following this discussion (I have a product release on Monday). But I am getting a lot of mail stacking up for numpy-developers which will not go through unless you are one of the registered developers mailing from your registered mail account. All others, please do not use numpy-developers. This is a private channel for the official developers only. I gather from my brief reading that someone is looking for a standard to use now. That standard is Numeric. If you go with that now then when the time comes to switch to Numarray, you'll be in the same boat as the whole community and therefore liable to be able to profit from any conversion tools required. You can reduce your problems to a minimum by sticking with the Python interface where possible. If you have some special need that Numeric is not meeting please realize that what exists is a consensus product after a long evolution and it is not likely to change much to meet your particular needs. There are some areas where what is right for one set of people is wrong for the others. |
From: Perry G. <pe...@st...> - 2002-04-14 01:42:06
|
> Ok, here's my list:
>
> Philosophical
>
> You have a proposal in to the Python guys to make Numarray into the standard _implementation_. I think standards like this should specify an _interface_, not an implementation.

Sure (though there is often more to a standard than just an interface, but certainly an implementation is generally not the standard). I'm not sure why you think we imply the implementation is the standard. We are waiting to rewrite the PEP when we are closer to having the implementation ready, but we've been very open about the design and have asked for input on it for a long time now.

> Simplicity
>
> I can give my users a single XArray.py file, and they can be off and running with something that works right then and there, and it could in many ways be compatible with Numarray (with some slight modifications) when they decide they want the extra functionality of extension modules that you or anyone else who follows your standard provides. But they don't have to compile anything until they really need to.
>
> Your implementation leaves me with all or nothing. I'll have to build and use numarray, or I've got an in house only solution.

Hard to comment on this.

> Expediency
>
> I want to see a usable standard arise quickly. If you maintain the stance that we should all use the Numarray implementation, instead of just defining a good Numarray interface, everyone has to wait for you to finish things enough to get them accepted by the Python group. Your implementation is complicated, and I suspect they will have many things that they will want you to change before they accept it into their baseline. (If you think my list of suggestions is annoying, wait until you see theirs!)

I have the strong sense you misunderstand how the process works. Guido will be driven in large part by the acceptance or non-acceptance of the Numeric community. If they don't buy into it, it won't be part of the standard.
If it won't be used by many, it won't be part of the standard. Yes, he will review the design and interface to see if there should be a long term commitment by the Python maintainers to have it in the standard library. We have sent him the design documents, and we do keep him informed. He has given us feedback about it. But for the most part, the judgement is going to be by the Numeric community. > If a simple interface protocol is presented, and a simple pure Python > module that implements it. The PEP acceptance process might move along > quickly, but you could take your time with implementing your code. > > Pragmatic > > You guys aren't finished yet, and I need to give my users an array > module ASAP. As such a new project, there are likely to be many bugs > floating around in there. I think that when you are done, you will > probably have a very good library. Moreover, I'm grateful that you are > making it open source. That's very generous of you, and the fact that > you are tolerating this discussion is definitely appreciated. > > Still, I can't put off my projects, and I can't task you to > work faster. > > > However, I do think we could agree in a very short term that your design > for the interface is a good one. I also think that we (or just > me if you > like) could make a much smaller PEP that would be more readily accepted. > Then everyone in this community could proceed at their own pace > - knowing > that if we followed the simple standard we would have inter operability > with each other. > I think we still don't understand what you need yet. More elaboration on that later. > Social > > Normally I wouldn't expect you to care about any of my special issues. > You have your own problems to solve. As I said above, it's generous of > you to even offer your source code. > > However, you are (or at least were) trying to push for this to become a > standard. 
> As such, considering how to be more general and apply to a
> wider class of problems should be on your agenda. If it's not, then you
> shouldn't be creating the standard.
>

Pleeease. Just because a library developer doesn't happen to meet your
needs doesn't mean it can't be part of the standard library. There are
plenty of modules in the standard library that could have been made more
general in some way, but there they are. The criterion is whether it
solves problems for a large community of users, not whether it is
infinitely extensible and so on. Software development is full of
trade-offs and that includes limits to generalization. Sure we can discuss
whether things could be made more general or not. But because you want it
more general doesn't mean we just say "Sure, you define everything!"

> If you don't care about numarray becoming standard, I would like to try
> my hand at submitting the slightly modified version of your design. I
> won't be compatible with your stuff, but hopefully others will follow
> suit.
>

You are free to propose your own standard at any time. No one will stop
you from doing so.

> Functionality
>
> Data Types
>
> I have needs for other types of data that you probably have little use
> for. If I can't coerce you to make a minor change in specification, I
> really don't think I could coerce you to support brand new data types
> (complex ints is the one I've beaten to death, because I could use that
>

You are right on complex ints (that we won't consider them). One could
take numarray and add them if one wanted and have a more extended version.
But we won't do it, and we wouldn't support them as part of what we
maintain. It's one of those trade-offs.

> one in the short term). What happens when someone at my company wants
> quaternions? I suspect that you won't have direct support for those.
> I know that numarray is supposed to be extensible, but the following
> raises an exception:
>
>     from numarray import *
>
>     class QuaternionType(NumericType):
>         def __init__(self):
>             NumericType.__init__(self, "Quaternion", 4*8, 0)
>
>     Quaternion = QuaternionType()  # BOOM!
>
>     q = array(shape=(10, 10), type=Quaternion)
>
> Maybe I'm just doing something wrong, but it looks like your code
> wants "Quaternion" to be in your (private?) typeConverters dictionary.
>

Yep, and there's a good reason for that. Just spend a few minutes thinking
about the role types play with array packages and how they have
traditionally been implemented. Generally speaking, it is presumed that
any two numeric types may be used in a binary operator. So you, Scott,
define your special type, Quaternions. You will need to provide the module
all the machinery for knowing what to do with all the other numeric types
available. You may not care, but it is a requirement that numarray (and
Numeric) know what to do. If that doesn't fit in with your needs, then you
shouldn't be trying to use it.

The problem is worse than that. You supply a Quaternion type extension to
numarray, and Bob supplies a super long int type (64 bytes!) also. Both of
you have gone to the trouble of giving numarray the means of handling all
other default numarray types. But you don't know how to handle each other.
How do you solve that problem? I don't know. If you do, let us know. Given
the requirements, adding new numeric types is not going to allow
independent extensions to work with each other. That's fairly limiting,
but that's the price that is paid for the feature.

> Ok, try two:
>
>     from numarray import *
>
>     q = NDArray(shape=(10, 10), itemsize=4*8)
>
>     if q[5][5] is None:
>         print "No boom, but what can I do with it?"
>
> Maybe this is just a documentation problem.
> On the other hand, I can do the following pretty readily:
>
>     import array
>
>     class Quat2D:
>         def __init__(self, *shape):
>             assert len(shape) == 2
>             self._buffer = array.array('d', [0])*shape[0]*shape[1]*4
>             self._shape, self._stride = tuple(shape), (4*shape[1], 4)
>             self._itemsize = 4*8
>
>         def __getitem__(self, sub):
>             assert isinstance(sub, tuple) and len(sub) == 2
>             offset = sub[0]*self._stride[0] + sub[1]*self._stride[1]
>             return tuple([self._buffer[offset + i] for i in range(4)])
>
>         def __setitem__(self, sub, val):
>             assert isinstance(sub, tuple) and len(sub) == 2
>             offset = sub[0]*self._stride[0] + sub[1]*self._stride[1]
>             for i in range(4):
>                 self._buffer[offset + i] = val[i]
>             return val
>
>     q = Quat2D(10, 10)
>     q[5, 5] = (1, 2, 3, 4)
>     print q[5, 5]
>
> This isn't very general, but it is short, and it makes a good example.
>

I'm not sure what it proves. If all you need is an array to store some
kind of type, be able to index and slice it, and not provide numeric
operations, by all means use the existing array module; it does that fine.
It's more work to subclass NDArray, but it can do it too, and gives you
more capabilities (you won't be able to use index arrays or broadcasting
in the array module for example). The extra functionality comes at some
price. Sure, it isn't as simple to extend. It's your choice if it is worth
it or not. If you want to add your large quaternion arrays efficiently,
then the array module is worthless. Your example shows nothing about what
your real needs for the object are.

> If they get half of their data from calculations using Numarray, and
> half from whatever I provide them, and then try to mix the results in
> an extension module that has to know about separate implementations,
> life is more complicated than it should be.
>

It's how you intend to 'mix' these that I have no clue about.

> Operations
>
> I'm going to have to write my own C extension modules for some high
> performance operations.
> All I need to get this done is a void* pointer,
> the shape, stride, itemsize, itemtype, and maybe some other things to
> get off and running. You have a growing framework, and you have already
> indicated that you think of your hidden variables as private. I don't
> think I or my users should have to understand the whole UFunc framework
> and API just to create an extension that manipulates a pointer to an
> array of doubles.
>

Sigh. No one said you had to understand the ufunc framework to do so. We
are working on a C API that just gives you a simple pointer (it's actually
available now, but we aren't going to tout it until we have better
documentation).

> Arrays are simpler than UFuncs. I consider them to be pretty separable
> parts of your design. If you keep it this way, and it becomes the
> standard, it seems that I and everyone else will have to understand
> both parts in order to create an extension module.
>

Wrong.

> Flexibility
>
> Numarray is going to make a choice of how to implement slicing. My guess
> is that it will be one of "copy contiguous", "copy on write", "copy by
> reference". I don't know what the correct choice is, but I know that
> someone else will need something different based on context. Things like
> UFuncs and other extension modules that do fast C level calculations
> typically don't need to concern themselves with slicing behaviour.
>

And they don't.

> Design
>
> Your implementation would be similar to having the 'pickle' module
> require you to derive from a 'Pickleable' base class - instead of simply
> providing __getstate__ and __setstate__ methods.
>
> It's an artificial constraint, and those are usually bad.
>

You say. You are quite welcome to do your own implementation that doesn't
have this 'artificial' constraint. After all your text I *still* don't
understand how you intend to use the 'interface' of the private
attributes.
You haven't provided any example (let alone a compelling one) of why we
should accept any object that provides those attributes. Shouldn't the
object also provide all the public methods? Shouldn't it also provide
indexing and so forth? All in all you are talking about checking quite a
few attributes to make sure the object has the interface. And even if it
does, *why* in the world would we presume that the C functions used by
numarray would work properly with the object you provide? I really don't
have a clue as to what you are getting at here, and without some real
concrete example illustrating this point, I don't think there is any point
to continuing this discussion.

> >
> > All good in principle, but I haven't yet seen a reason to change
> > numarray. As far as I can tell, it provides all you need exactly
> > as it is. If you could give an example that demonstrated otherwise...
> >
>
> Maybe you're right. I suspect you as the author will come up with the
> quick example that shows how to implement my bizarre quaternion example
> above. I'm not sure if this makes either of us right or wrong, but if
> you're not buying any of this, then it's probably time for me to chalk
> this up to a difference in opinion and move on.
>
> Truthfully this is taking me pretty far from my original tack. Originally
> I had simply hoped to hack a couple of things into arraymodule.c, and here
> I am now trying to get a simpler standard in place. I'll try one last time
> to convince you with the following two statements:
>
> - Changing such that you only require the interface is a subtle,
>   but noticeable, improvement to your otherwise very good design.
>
> - It's not a difficult change.
>
> If that doesn't compel you, at least I can walk away knowing I tried. For
> the volumes I've written, this will probably be my last pesky message if
> you really don't want to budge on this issue.
>

We're not going to budge until you show us what the hell you are talking
about.
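
To make the disagreement above concrete, here is a minimal sketch of what "requiring only the interface" could mean: accepting any object that carries a fixed set of attributes, rather than any instance of a particular base class. The attribute names (`_shape`, `_strides`, `_itemsize`, `_buffer`) and both helper functions are assumptions made up for illustration; they are not numarray's actual private attributes or API. The `Quat2D` class is a cut-down stand-in for the one quoted earlier in the thread.

```python
# Hypothetical attribute names; numarray's real private attributes may differ.
REQUIRED_ATTRS = ("_shape", "_strides", "_itemsize", "_buffer")


def missing_array_attrs(obj):
    """Return the names in REQUIRED_ATTRS that obj does not provide."""
    return [name for name in REQUIRED_ATTRS if not hasattr(obj, name)]


def looks_like_array(obj):
    """Duck-typed check: True when every required attribute is present."""
    return not missing_array_attrs(obj)


class Quat2D:
    """Cut-down stand-in for the quaternion array class quoted in the thread."""

    def __init__(self, rows, cols):
        self._shape = (rows, cols)
        self._itemsize = 4 * 8               # four 8-byte doubles per element
        self._strides = (cols * 4, 4)        # row-major, counted in doubles
        self._buffer = [0.0] * (rows * cols * 4)


print(looks_like_array(Quat2D(10, 10)))      # no base class involved
print(missing_array_attrs(object()))         # a plain object lacks all four
```

A check like this only confirms that the attributes exist. It says nothing about whether the buffer layout, strides, or item type are what a C routine expects, which is the objection raised in the reply above.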
>
> The alternative of coming up with a different specifier for records/structs
> is probably a mistake now that the struct module already has its (terse)
> format specification. Once that is taken into consideration, following all
> the leads of the struct module makes sense to me.
>

Again, you are free to do your own, or fork our numarray and do it the way
you want. Or do your own from scratch. Or whatever.

> [...]
> Also, just mmapping the whole file puts all of the memory use at the
> discretion of the OS. I might have a gig or two to work with, but if mmap
> takes them all, other threads will have to contend for memory. The system
> (application) as a whole might very well run better if I can retain some
> control over this.
>
> I'm not married to the windowing suggestion. I think it's something to
> consider, but it might not be a common enough case to try and make a
> standard mechanism for. If there isn't a way to do it without a kluge,
> then I'll drop it. Likewise if a simple strategy can't meet anyone's real
> needs.
>

You can forget our doing it. It's out of the question for us.

> >
> > If the 32 bit address is your problem, you are far, far better off
> > using a 64-bit processor and operating system than trying to kludge up
> > a windowing memory mechanism.
> >
>
> We don't always get to specify what platform we want to run on. Our
> customer has other needs, and sometimes hardware support for exotic devices
> dictates what we'll be using. Frequently it is on 64 bit Alphas, but
> sometimes the requirement is x86 Linux, or 32 bit Solaris.
>
> Finally, our most frustrating piece of legacy software was written in
> Fortran assuming you could stuff a pointer into an INT*4 and now requires
> the -taso flag to the compiler for all new code (which turns a sexy 64 bit
> Alpha into a 32 bit kluge...).
>

You may have customers with unreasonable demands. We don't have to let
them cause an incredible complication in the underlying machinery.
(And we won't). And we won't make it work on Windows 3.1 either. We have
to draw the line somewhere. Your customers will pay dearly (and you will
benefit :-).

> Also, much of our data comes on tapes. It's not easy to memory map those.
>

Your point being?

>
> [...]

This doesn't seem to be going anywhere. If you can give us a better idea
of how your interface needs would be used, at least we could respond to
the specific issues. But we don't understand, and although we are
considering some changes, I'm not going to fold in your requests until we
do understand.

You may not be happy with the progress we are making either. Sorry, I
can't help that. If you need something sooner, you'll need to do something
else. Come up with your own system and try to get it into Python. Take
numarray and do it the way you think it ought to be done and at the rate
you think it should be done. You're welcome to. Take the array module and
use that as a basis.

We'd like numarray to be part of the standard. We'd like it to be the
standard package in the Numeric community. But if neither happened, we'd
still be working on it. We need it for our own work. Numeric doesn't give
us the capabilities that we need. We are using it for our software
development and it is being used to reduce HST data now. We are continuing
on this regardless.

Perry