From: Scott G. <xs...@ya...> - 2002-04-13 10:08:25
--- Perry Greenfield <pe...@st...> wrote:
> Scott Gilbert writes:
> [...]
> >     very_cool = numarray.add(n, s)
>
> But why not (I may have some details wrong, I'm doing this from
> memory, and I haven't worked on it myself in a bit):
> [...]
>     maybe_not_quite_so_cool_but_just_as_functional = n + s
> [...]
> From everything I've seen so far, I don't see why you can't just
> create a NumArray object directly.  You can subclass it (and use
> multiple inheritance if you need to subclass a different object as
> well) and add whatever customized behavior you want.  You can create
> new kinds of objects as buffers just so long as you satisfy the
> buffer interface.

Your point about the optional buffer parameter to the NumArray constructor is well taken. I had seen that when looking through the code, but it slipped my mind for that example. I could very well be wrong about some of these other reasons too...

I have a number of reasons listed below for wanting the standard that Python adopts to specify only the interface and not the implementation. You may not find all of these persuasive, and I apologize in advance if any looks like a criticism. (In my limited years as a professional software developer, I've found that the majority of people can be very defensive and protective of their code. I've been trying to tread lightly, but I don't know if I'm succeeding.) However, if any of these reasons is persuasive, keep in mind that the actual changes I'm proposing are pretty minimal in scope, and that I'd be willing to submit patches so as to reduce any inconvenience to you. (Not that you have any reason to believe I can code my way out of a box... :-)

Ok, here's my list:

Philosophical

You have a proposal in to the Python guys to make Numarray the standard _implementation_. I think standards like this should specify an _interface_, not an implementation.

Simplicity

I can give my users a single XArray.py file, and they can be off and running with something that works right then and there, and it could in many ways be compatible with Numarray (with some slight modifications) when they decide they want the extra functionality of extension modules that you or anyone else who follows your standard provides. But they don't have to compile anything until they really need to. Your implementation leaves me with all or nothing: I'll have to build and use numarray, or I've got an in-house-only solution.

Expediency

I want to see a usable standard arise quickly. If you maintain the stance that we should all use the Numarray implementation, instead of just defining a good Numarray interface, everyone has to wait for you to finish things enough to get them accepted by the Python group. Your implementation is complicated, and I suspect they will have many things that they will want you to change before they accept it into their baseline. (If you think my list of suggestions is annoying, wait until you see theirs!) If a simple interface protocol is presented, along with a simple pure Python module that implements it, the PEP acceptance process might move along quickly, and you could take your time with implementing your code.

Pragmatic

You guys aren't finished yet, and I need to give my users an array module ASAP. As new as the project is, there are likely to be many bugs floating around in there. I think that when you are done, you will probably have a very good library. Moreover, I'm grateful that you are making it open source. That's very generous of you, and the fact that you are tolerating this discussion is definitely appreciated. Still, I can't put off my projects, and I can't task you to work faster. However, I do think we could agree in a very short term that your design for the interface is a good one. I also think that we (or just me if you like) could make a much smaller PEP that would be more readily accepted. Then everyone in this community could proceed at their own pace, knowing that if we followed the simple standard we would have interoperability with each other.

Social

Normally I wouldn't expect you to care about any of my special issues. You have your own problems to solve. As I said above, it's generous of you to even offer your source code. However, you are (or at least were) trying to push for this to become a standard. As such, considering how to be more general and apply to a wider class of problems should be on your agenda. If it's not, then you shouldn't be creating the standard. If you don't care about numarray becoming standard, I would like to try my hand at submitting the slightly modified version of your design. It won't be compatible with your stuff, but hopefully others will follow suit.

Functionality

Data Types

I have needs for other types of data that you probably have little use for. If I can't coerce you to make a minor change in specification, I really don't think I could coerce you to support brand new data types (complex ints is the one I've beaten to death, because I could use that one in the short term). What happens when someone at my company wants quaternions? I suspect that you won't have direct support for those. I know that numarray is supposed to be extensible, but the following raises an exception:

    from numarray import *

    class QuaternionType(NumericType):
        def __init__(self):
            NumericType.__init__(self, "Quaternion", 4*8, 0)

    Quaternion = QuaternionType()   # BOOM!

    q = array(shape=(10, 10), type=Quaternion)

Maybe I'm just doing something wrong, but it looks like your code wants "Quaternion" to be in your (private?) typeConverters dictionary. Ok, try two:

    from numarray import *

    q = NDArray(shape=(10, 10), itemsize=4*8)
    if q[5][5] is None:
        print "No boom, but what can I do with it?"

Maybe this is just a documentation problem. On the other hand, I can do the following pretty readily:

    import array

    class Quat2D:
        def __init__(self, *shape):
            assert len(shape) == 2
            self._buffer = array.array('d', [0])*shape[0]*shape[1]*4
            self._shape, self._stride = tuple(shape), (4*shape[0], 4)
            self._itemsize = 4*8

        def __getitem__(self, sub):
            assert isinstance(sub, tuple) and len(sub) == 2
            offset = sub[0]*self._stride[0] + sub[1]*self._stride[1]
            return tuple([self._buffer[offset + i] for i in range(4)])

        def __setitem__(self, sub, val):
            assert isinstance(sub, tuple) and len(sub) == 2
            offset = sub[0]*self._stride[0] + sub[1]*self._stride[1]
            for i in range(4):
                self._buffer[offset + i] = val[i]
            return val

    q = Quat2D(10, 10)
    q[5, 5] = (1, 2, 3, 4)
    print q[5, 5]

This isn't very general, but it is short, and it makes a good example. If my users get half of their data from calculations using Numarray, and half from whatever I provide them, and then try to mix the results in an extension module that has to know about separate implementations, life is more complicated than it should be.

Operations

I'm going to have to write my own C extension modules for some high-performance operations. All I need to get this done is a void* pointer, the shape, stride, itemsize, itemtype, and maybe some other things to get off and running. You have a growing framework, and you have already indicated that you think of your hidden variables as private. I don't think I or my users should have to understand the whole UFunc framework and API just to create an extension that manipulates a pointer to an array of doubles. Arrays are simpler than UFuncs. I consider them to be pretty separable parts of your design. If you keep it this way, and it becomes the standard, it seems that I and everyone else will have to understand both parts in order to create an extension module.

Flexibility

Numarray is going to make a choice of how to implement slicing. My guess is that it will be one of "copy contiguous", "copy on write", or "copy by reference". I don't know what the correct choice is, but I know that someone else will need something different based on context. Things like UFuncs and other extension modules that do fast C-level calculations typically don't need to concern themselves with slicing behaviour.

Design

Your implementation would be similar to having the 'pickle' module require you to derive from a 'Pickleable' base class, instead of simply providing __getstate__ and __setstate__ methods. It's an artificial constraint, and those are usually bad.

> All good in principle, but I haven't yet seen a reason to change
> numarray.  As far as I can tell, it provides all you need exactly
> as it is.  If you could give an example that demonstrated otherwise...

Maybe you're right. I suspect you as the author will come up with the quick example that shows how to implement my bizarre quaternion example above. I'm not sure if this makes either of us right or wrong, but if you're not buying any of this, then it's probably time for me to chalk this up to a difference in opinion and move on. Truthfully, this is taking me pretty far from my original tack. Originally I had simply hoped to hack a couple of things into arraymodule.c, and here I am now trying to get a simpler standard in place. I'll try one last time to convince you with the following two statements:

- Changing things so that you only require the interface is a subtle, but noticeable, improvement to your otherwise very good design.
- It's not a difficult change.

If that doesn't compel you, at least I can walk away knowing I tried. For the volumes I've written, this will probably be my last pesky message if you really don't want to budge on this issue.

> To tell you the truth, I'm not crazy about how the struct module
> handles types or attributes.  It's generally far too cryptic for
> my tastes.  Other than providing backward compatibility, we aren't
> interested in it emulating struct.

I consider it a lot like regular expressions. I cringe when I see someone else's, but I don't have much difficulty putting them together. The alternative of coming up with a different specifier for records/structs is probably a mistake now that the struct module already has its (terse) format specification. Once that is taken into consideration, following all the leads of the struct module makes sense to me.

> I could well misunderstand, but I thought that if you mmap a file
> in unix in write mode, you do not use up the virtual memory as
> limited by the physical memory and the paging file.  Your only
> limit becomes the virtual address space available to the processor.

Regarding efficiency, it depends on the implementations, which vary greatly, and there are other subtleties. I've already written a book above, so I won't tire you with details. I will say that closing a large memory-mapped file on top of NFS can be dreadful. It probably takes the same amount of total time or less, but from an interactive analysis point of view it's pretty unpleasant, on Tru64 at least. Also, just mmapping the whole file puts all of the memory use at the discretion of the OS. I might have a gig or two to work with, but if mmap takes them all, other threads will have to contend for memory. The system (application) as a whole might very well run better if I can retain some control over this.

I'm not married to the windowing suggestion. I think it's something to consider, but it might not be a common enough case to try and make a standard mechanism for. If there isn't a way to do it without a kluge, then I'll drop it. Likewise if a simple strategy can't meet anyone's real needs.

> If the 32 bit address is your problem, you are far, far better off
> using a 64-bit processor and operating system than trying to kludge up
> a windowing memory mechanism.

We don't always get to specify what platform we want to run on. Our customer has other needs, and sometimes hardware support for exotic devices dictates what we'll be using. Frequently it is on 64-bit Alphas, but sometimes the requirement is x86 Linux, or 32-bit Solaris. Finally, our most frustrating piece of legacy software was written in Fortran assuming you could stuff a pointer into an INT*4, and it now requires the -taso flag to the compiler for all new code (which turns a sexy 64-bit Alpha into a 32-bit kluge...). Also, much of our data comes on tapes. It's not easy to memory map those.

> I could see a way of doing it for ufuncs, but the numeric world
> (and I would think the DSP world as well) needs far more than
> element-by-element array functionality.  Providing a usable C API
> for that kind of memory model would be a nightmare.  But I'm not
> sure if this or the page file is your limitation.

I would suggest that any extension module which is not interested in this feature simply raise a NotImplemented exception of some sort. UFuncs could fall into this camp without any criticism from me. All it would have to do is check if the 'window_get' attribute is a callable, and punt an exception.

My proposal wasn't necessarily to map in a single element at a time. If the C extension was willing to work with these beasts at all, it would check to see if the offset it wanted was between window_min and window_max. If it wasn't, then it would call ob.window_get(offset), and the Python object could update window_min and window_max however it sees fit - for instance, by remapping 10 or 20 megabytes on both sides. This particular implementation would allow us to do correlations of a small (mega sample) chunk of data against a HUGE (giga sample) file. This might be the wrong interface, and I'm willing to listen to a better suggestion. It might also be too special a need to detract from a simpler overall design.

Also, there are other uses for things like this. It could possibly be used to implement sparse arrays. It's probably not the best implementation of that, but it could hide a dict of set data points, and present it to an extension module as a complete array.

Cheers,
    -Scott Gilbert
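A minimal sketch of the consumer side of the windowing protocol floated in the message above. The attribute names (window_min, window_max) and the window_get(offset) callable are taken from the message itself; the helper is illustrative and not part of any released numarray API:

    # Hedged sketch: translate an absolute byte offset into a position
    # inside the currently mapped window, asking the object to remap
    # first if the offset falls outside the live window.
    def window_offset(obj, offset):
        if not (obj.window_min <= offset < obj.window_max):
            obj.window_get(offset)   # object remaps around `offset`
        return offset - obj.window_min

    # A UFunc-like consumer would process long contiguous runs and call
    # window_offset() only when it walks off the end of the window.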
From: <kr...@po...> - 2002-04-13 07:24:36
(All of the below is with regard to Numeric 20.2.0.)

For a consulting client, I wrote an extension module that does the equivalent of sum(take(a, b)), but without the temporary result in between. I was surprised that when I tried to .resize() the result of this routine, I got a segmentation fault and a core dump. It was crashing at this line in arrayobject.c:

    if (memcmp(self->descr->zero, all_zero, elsize) == 0) {

self->descr, in this case, was the type description for arrays of type "double". It seems that self->descr->zero was 0, as in a null pointer, not a pointer to a location containing (double)0, and this was causing it to crash.

It looks like the .zero fields of the type descriptions (which live in arraytypes.c and _numpy.so) are initialized to be null pointers, and only when the initmultiarray() function in multiarraymodule.c is run are these pointers set to point to actual zeroes somewhere in allocated memory. I guess Numeric.py imports multiarray.so, which calls initmultiarray(), so the solution for me was to make sure I import Numeric before importing my module (or at least before resizing arrays produced by my module).

But, to my mind, this segfault is a bug --- importing a module that follows all the rules shouldn't put Python in a state that's so dangerously inconsistent that innocent things like .resize() can crash it. Maybe the same .so file that includes the actual data items should be responsible for initializing them --- especially since import_array() imports _numpy without importing multiarray. (I assume there's a reason it wasn't done this way in the first place.)

What do other people think?

--
/* By Kragen Sitaker, http://pobox.com/~kragen/puzzle4.html */
char b[2][10000],*s,*t=b,*d,*e=b+1,**p;main(int c,char**v){int n=atoi(v[1]);
strcpy(b,v[2]);while(n--){for(s=t,d=e;*s;s++){for(p=v+3;*p;p++)if(**p==*s){
strcpy(d,*p+2);d+=strlen(d);goto x;}*d++=*s;x:}s=t;t=e;e=s;*d++=0;}puts(t);}
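A sketch of the import-order workaround described in the message; the extension module name and its function are hypothetical stand-ins for the client's code:

    # Workaround sketch for Numeric 20.2.0, per the report above: import
    # Numeric first, which imports multiarray and runs initmultiarray(),
    # filling in the descr->zero pointers that .resize() dereferences.
    import Numeric              # must precede the extension import
    import my_take_sum_ext      # hypothetical third-party extension

    a = Numeric.array([1.0, 2.0, 3.0])
    b = Numeric.array([0, 2])
    r = my_take_sum_ext.sum_take(a, b)   # hypothetical sum(take(a, b))
    # arrays produced by the extension can now be .resize()d safely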
From: Perry G. <pe...@st...> - 2002-04-13 00:43:32
Scott Gilbert writes:
>     import array
>
>     class ScottArray:
>         def __init__(self):
>             self.ndarray_buffer = array.array('d', [0]*100)
>             self.ndarray_shape = (10, 10)
>             self.ndarray_stride = (80, 8)
>             self.ndarray_itemsize = 8
>             self.ndarray_itemtype = 'Float64'
>
>     import numarray
>
>     n = numarray.numarray((10, 10), type='Float64')
>     s = ScottArray()
>
>     very_cool = numarray.add(n, s)

But why not (I may have some details wrong, I'm doing this from memory, and I haven't worked on it myself in a bit):

    import array
    import numarray
    import memory   # comes with numarray

    class ScottArray(NumArray):
        def __init__(self):
            # create necessary buffer obj
            buf = memory.writeable_buffer(array.array('d', [0]*100))
            NumArray.__init__(self, shape=(10, 10), type=numarray.Float64,
                              buffer=buf)
            # _strides not settable from constructor yet, but currently
            # if you needed to set it:
            #     self._strides = (80, 8)
            # But for this case it would be computed automatically from
            # the supplied shape

    n = numarray.numarray((10, 10), type='Float64')
    s = ScottArray()

    maybe_not_quite_so_cool_but_just_as_functional = n + s

> This example is kind of silly.  I mean, why wouldn't I just use
> numarray for all of my array needs?  Well, that's where my world is a
> little different than yours, I think.  Instead of using
> 'array.array()' above, there are times where I'll need to use
> 'whizbang.array()' to get a different PyBufferProcs-supporting
> object.  Or where I'll need to work with a crazy type in one part of
> the code, but I'd like to pass it to an extension that combines your
> types and mine.
>
> In these cases where I need "special memory" or "special types" I
> could try and get you guys to accept a patch, but this would just
> pollute your project and probably annoy you in general.  A better
> solution is to create a general standard mechanism for implementing
> NDArray types, and let me make my own.

From everything I've seen so far, I don't see why you can't just create a NumArray object directly. You can subclass it (and use multiple inheritance if you need to subclass a different object as well) and add whatever customized behavior you want. You can create new kinds of objects as buffers just so long as you satisfy the buffer interface.

> In the above example, we could have completely different NDArray
> implementations working interoperably inside of one UFunc.  It seems
> to me that all it really takes to be an NDArray can be specified by a
> list of attributes like the one above.  (Probably need a few more
> attributes to be really general: 'ndarray_endian', etc...)  In the
> end, NDArrays are just pointers to a buffer, and descriptors for
> indexing.

Again, why not just create an NDArray object with the appropriate buffer object and attributes (subclassing if necessary)?

> I don't believe this would have any significant effect on the
> performance of numarray.  (The efficient fast C code still gets a
> pointer to work with.)  Moreover, I'd be very willing to contribute
> patches to make this happen.
>
> If you agree, and we can flesh out what this "attribute interface"
> should be, then I can start distributing my own array module to the
> engineers where I work without too much fear that they'll be screwed
> once numarray is stable and they want to mix and match.
>
> Code always lives a lot longer than I want it to, and if I give them
> something now which doesn't work with your end product, I'll have
> done them a disservice.

All good in principle, but I haven't yet seen a reason to change numarray. As far as I can tell, it provides all you need exactly as it is. If you could give an example that demonstrated otherwise...

> It's all open for discussion, but I would propose that ndarray_endian
> be one of:
>
>     '>' - big endian
>     '<' - little endian
>
> This is how the standard Python struct module specifies endian, and
> I've been trying to stay consistent with the baseline when possible.

To tell you the truth, I'm not crazy about how the struct module handles types or attributes. It's generally far too cryptic for my tastes. Other than providing backward compatibility, we aren't interested in it emulating struct.

> > The above scheme is needed for our purposes because many of our
> > data files contain multiple data arrays and we need a means of
> > creating a numarray object for each one.  Most of this machinery
> > has already been implemented, but we haven't released it since our
> > I/O package (for astronomical FITS files) is not yet at the point
> > of being able to use it.

I could well misunderstand, but I thought that if you mmap a file in unix in write mode, you do not use up the virtual memory as limited by the physical memory and the paging file. Your only limit becomes the virtual address space available to the processor. If the 32 bit address is your problem, you are far, far better off using a 64-bit processor and operating system than trying to kludge up a windowing memory mechanism. I could see a way of doing it for ufuncs, but the numeric world (and I would think the DSP world as well) needs far more than element-by-element array functionality. Providing a usable C API for that kind of memory model would be a nightmare. But I'm not sure if this or the page file is your limitation.

Perry
From: Scott G. <xs...@ya...> - 2002-04-12 04:45:56
--- Perry Greenfield <pe...@st...> wrote:
> I guess we are not sure we understand what you mean by interface.
> In particular, we don't understand why sharing the same object
> attributes (the private ones you list above) is a benefit to the
> code you are writing if you aren't also using the low level
> implementation.  The above attributes are private and nothing
> external to the Class should depend on or even know about them.
> Could you elaborate on what you mean by interface and the
> relationship between your arrays and numarrays?

There are several places in your code that check to see if you are working with a valid type for NDArrays. Currently this check consists of asking the following questions:

    'Is it a tuple or list?'
    'Is it a scalar of some sort?'
    'Does it derive from our NDArray class?'

If any of these questions answers true, it does the right thing and moves on. If none of these is true, it raises an exception. I suppose this is fine if you are only concerned about working with your own implementation of an array type, but I hope you'll consider the following as a minor change that opens up the possibility for other compatible array implementations to work interoperably. Instead, have the code ask the following questions:

    'Is it a tuple or list?'
    'Is it a scalar of some sort?'
    'Does it support the attributes necessary to be like an NDArray object?'

This change is very similar to how you can pass in any Python object to the "pickle.dump()" function, and if it supports the "write()" method it will be called:

    >>> class WhoKnows:
    ...     def write(self, x):
    ...         print x
    ...
    >>> import pickle
    >>> w = WhoKnows()
    >>> pickle.dump('some data', w)
    S'some data'
    p1
    .

Until reading your response above, I didn't realize that you consider your single-underscore attributes to be totally private. In general, I try to use a single underscore to mean protected (meaning you can use them if you REALLY know what you are doing), hence my confusion. With that in mind, pretend that I suggested the following instead. The specification of an NDArray is that it has the following attributes:

    ndarray_buffer   - a PyObject which has PyBufferProcs
    ndarray_shape    - a tuple specifying the shape of the array
    ndarray_stride   - a tuple specifying the index multipliers
    ndarray_itemsize - an int/long stating the size of items
    ndarray_itemtype - some representation of type

This would be a very minor change to your functions like inputarray(), getNDInfo(), getNDArray(), but it would allow your UFuncs to work with other implementations of arrays. As an example similar to the pickle example above:

    import array

    class ScottArray:
        def __init__(self):
            self.ndarray_buffer = array.array('d', [0]*100)
            self.ndarray_shape = (10, 10)
            self.ndarray_stride = (80, 8)
            self.ndarray_itemsize = 8
            self.ndarray_itemtype = 'Float64'

    import numarray

    n = numarray.numarray((10, 10), type='Float64')
    s = ScottArray()

    very_cool = numarray.add(n, s)

This example is kind of silly. I mean, why wouldn't I just use numarray for all of my array needs? Well, that's where my world is a little different than yours, I think. Instead of using 'array.array()' above, there are times where I'll need to use 'whizbang.array()' to get a different PyBufferProcs-supporting object. Or where I'll need to work with a crazy type in one part of the code, but I'd like to pass it to an extension that combines your types and mine.

In these cases where I need "special memory" or "special types" I could try and get you guys to accept a patch, but this would just pollute your project and probably annoy you in general. A better solution is to create a general standard mechanism for implementing NDArray types, and let me make my own.

In the above example, we could have completely different NDArray implementations working interoperably inside of one UFunc. It seems to me that all it really takes to be an NDArray can be specified by a list of attributes like the one above. (Probably need a few more attributes to be really general: 'ndarray_endian', etc...) In the end, NDArrays are just pointers to a buffer, and descriptors for indexing.

I don't believe this would have any significant effect on the performance of numarray. (The efficient fast C code still gets a pointer to work with.) Moreover, I'd be very willing to contribute patches to make this happen.

If you agree, and we can flesh out what this "attribute interface" should be, then I can start distributing my own array module to the engineers where I work without too much fear that they'll be screwed once numarray is stable and they want to mix and match. Code always lives a lot longer than I want it to, and if I give them something now which doesn't work with your end product, I'll have done them a disservice.

BTW: Allowing other types to fill in as NDArrays also allows other types to implement things like slicing as they see fit (slice and copy contiguous, slice and copy on write, slice and copy by reference, etc...).

> We are hoping to get numarray into the distribution [it won't be the
> end of the world for us if it doesn't happen].  I'll warn you that
> the PEP is out of date.  We are likely to update it only after we
> feel we are close to having the implementation ready for
> consideration for including into the standard distribution.  I would
> refer to the actual implementation and the design notes for the time
> being.

Yeah, I recognize that the PEP is gathering dust at the moment. I'm not having too much trouble following through the source and design docs. It took me a few days to "get it", but that's probably because I'm slower than your average bear. :-)

Regarding the PEP, what I would like to see happen is that, if we agree that the "attribute interface" stuff above is the right way to go about things, I would (or we would) submit a milder interim PEP specifying what those attributes are, how they are to be interpreted, and a simple Python module implementing a general NDArray class for consumption. Hopefully this PEP would specify a canonical list of type names as well. Then we could make updates to the other PEP if necessary.

> Some of the name changes are worth considering (like replacing
> ._byteswap with an endian indicator, though I find _endian completely
> opaque as to what it would mean--1 means what? little or big?).
> (BTW, we already have _itemsize).  _contiguous and _aligned are
> things we have been considering changing, but I would have to think
> about it carefully to determine if they really are redundant.

It's all open for discussion, but I would propose that ndarray_endian be one of:

    '>' - big endian
    '<' - little endian

This is how the standard Python struct module specifies endian, and I've been trying to stay consistent with the baseline when possible.

> It looks like you are trying to deal with records with these
> "structs".  We deal with records (efficiently) in a completely
> different way.  Take a look at the recarray module.

Will definitely do. I've called them structs simply because they borrow their format string from the struct module that ships with Python. I'm not hung up on the name, and I wouldn't object to an alias. Too early for me to tell if there is even a difference in the underlying memory, but maybe we'll end up with 'structs' for my notion of things, and 'records' for yours.

> We deal with memory mapping a completely different way.  It's a bit
> late for me to go into it in great detail, but we wrap the standard
> library mmap module with a module that lets us manage memory mapped
> files.  This module basically memory maps an entire file and then in
> effect mallocs segments of that file as buffer objects.  This
> allocation of subsets is needed to ensure that overlapping memory
> map buffers don't happen.  One can basically reserve part of the
> memory mapped file as a buffer.  Once that is done, nothing else can
> use that part of the file for another buffer.  We do not intend to
> handle memory maps as a way of sequentially mapping parts of the
> file to provide windowed views as your code segment above suggests.
> If you want a buffer that is the whole (large) file, you just get a
> mapped buffer to the whole thing.  (Why wouldn't you?)

I think the idea of taking a 500 megabyte (or 5 gigabyte) file, and windowing 1 meg of actual memory at a time, is pretty attractive. Sometimes we do very large correlations, and there just isn't enough memory to mmap the whole file (much less two files for correlation). Any library that doesn't want to support this business could just raise a NotImplemented error on encountering them.

Maybe I shouldn't be calling this "memory mapping". Even though it could be implemented on top of mmap, truthfully I just want to support a "windowing" interface. If we could specify the windowing attributes and indicate the standard usage, that would be great. Maybe:

    ndarray_window(self, offset)
    ndarray_winmin
    ndarray_winmax

> The above scheme is needed for our purposes because many of our data
> files contain multiple data arrays and we need a means of creating a
> numarray object for each one.  Most of this machinery has already
> been implemented, but we haven't released it since our I/O package
> (for astronomical FITS files) is not yet at the point of being able
> to use it.

There is a group at my company that is using FITS for some stuff. I don't know enough about it to comment though...

Cheers,
    -Scott
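A minimal sketch of the duck-typed check this message argues for, in the spirit of its pickle example. The helper name and the exact attribute list are illustrative, not a released numarray API:

    # Hedged sketch: accept any object exposing the proposed attributes,
    # instead of requiring inheritance from NDArray.
    def looks_like_ndarray(obj):
        required = ('ndarray_buffer', 'ndarray_shape', 'ndarray_stride',
                    'ndarray_itemsize', 'ndarray_itemtype')
        for name in required:
            if not hasattr(obj, name):
                return 0
        return 1

    # A function like inputarray() could then branch on
    # looks_like_ndarray(obj) rather than isinstance(obj, NDArray).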
From: Travis O. <oli...@ee...> - 2002-04-11 22:44:34
> Looking at the code to PyArray_Free, I agree with Chris.  Called to
> free a 2D array, I think that PyArray_Free leaks all of the row
> storage because ap->nd == 2, not 3:
>
>     /* {%c++%} */
>     extern int PyArray_Free(PyObject *op, char *ptr) {
>         PyArrayObject *ap = (PyArrayObject *)op;
>         int i, n;
>
>         if (ap->nd > 2) return -1;
>         if (ap->nd == 3) {
>             n = ap->dimensions[0];
>             for (i=0; i<n; i++) {
>                 free(((char **)ptr)[i]);
>             }
>         }
>         if (ap->nd >= 2) {
>             free(ptr);
>         }
>         Py_DECREF(ap);
>         return 0;
>     }
>     /* {%c++%} */

This has been broken since the beginning. I believe the documentation says as much. I've never used it because I always think of 2-D arrays as a block of data, not as rows of pointers. It should be fixed, but no one's ever been interested enough to do it.

-Travis Oliphant
From: Perry G. <pe...@st...> - 2002-04-11 21:56:49
> [mailto:num...@li...] On Behalf Of Scott Gilbert
> Subject: [Numpy-discussion] Introduction
>
> Hello All.
>
> I'm interested in this project, and am curious to what level you are
> willing to accept outside contribution.  I just tried to subscribe to
> the developers list, but I didn't realize that required admin
> approval.  Hopefully it doesn't look like I was shaking the door
> without knocking first.
>
> Is this list active?  Is this the correct place to talk about
> Numarray?

Sure.

> Following your design for the Array stuff, I've been able to
> implement a pretty usable array class that supports the bazillion
> array types I need (Bit, Complex Integer, etc...).  This gets me past
> my core requirements without polluting your world, but unfortunately
> my new XArray type doesn't play so well with your UFuncs.  I think my
> users will definitely want to use your UFuncs when the time comes, so
> I want to remedy this situation.
>
> The first change I would like to make is to rework your code that
> verifies that an object is a "usable" array.  I think NumArray should
> only check for the interface required, not the actual type hierarchy.
> By this I mean that the minimum required to be a supported array type
> is that it support the correct attributes, not that it actually
> inherit from NDArray:
>
> (quoting from your paper) something like:
>
>     _data
>     _shape
>     _strides
>     _byteoffset
>     _aligned
>     _contiguous
>     _type
>     _byteswap
>
> Most of these are just integer fields, or tuples of integers.
> Ignoring _type for the moment, it appears that the interface required
> to be a NumArray is much less strict than actually requiring it to
> derive from NumArray.  If you allow me to change a few functions
> (inputarray() in numarray.py is one small example), I could use my
> independent XArray class almost as is, and moreover I can implement
> new array objects (possibly as extension types) for crazy things like
> working with page aligned memory, memory mapping, etc...

I guess we are not sure we understand what you mean by interface. In particular, we don't understand why sharing the same object attributes (the private ones you list above) is a benefit to the code you are writing if you aren't also using the low level implementation. The above attributes are private and nothing external to the Class should depend on or even know about them. Could you elaborate on what you mean by interface and the relationship between your arrays and numarrays?

> Well, that's almost enough.  The _type field poses a small problem of
> sorts.  It looks like you don't require a _type to be derived from
> NumericType, and this is a good thing since it allows me (and others)
> to implement NumArray compatible arrays without actually requiring
> NumArray to be present.

What do you mean by NumArray compatible?

[some issues snipped since we need to understand the interface issue first]

> I don't know if you're trying to get all of NumArray into the Python
> distribution or not, but I suspect a good interim step would be to
> have a PEP that specifies what it means to be a NumArray or NDArray
> in minimal terms.  Perhaps supplying an Array-only module in Python
> that implements this interface.  Again, I'd be willing to help with
> all of this.

We are hoping to get numarray into the distribution [it won't be the end of the world for us if it doesn't happen]. I'll warn you that the PEP is out of date. We are likely to update it only after we feel we are close to having the implementation ready for consideration for including into the standard distribution. I would refer to the actual implementation and the design notes for the time being.

> -------------------------
>
> Ok, other suggestions...
>
> Here is the list of things that your design document indicates are
> required to be a NumArray:
>
>     _data
>     _shape
>     _strides
>     _byteoffset
>     _aligned
>     _contiguous
>     _type
>     _byteswap
>
> I believe that one could calculate the values for _aligned and
> _contiguous from the other fields, so they shouldn't really be part
> of the interface required.  I suspect it is useful for the C
> implementation of UFuncs to have this information in the NDInfo
> struct though, so while I would drop them from the attribute
> interface, I would delegate the task of calculating these values to
> getNDInfo() and/or getNumInfo().
>
> I also notice that you chose _byteswap to indicate byteswapping is
> needed.  I think a better choice would be to specify the endian-ness
> of the data (with an _endian attr), and have getNDInfo() and
> getNumInfo() calculate the _byteswap value for the NDInfo struct.
>
> In my implementation, I came up with a slightly different list:
>
>     self._endian
>     self._offset
>     self._shape
>     self._stride
>     self._itemtype
>     self._itemsize
>     self._itemformat
>     self._buffer

Some of the name changes are worth considering (like replacing ._byteswap with an endian indicator, though I find _endian completely opaque as to what it would mean--1 means what? little or big?). (BTW, we already have _itemsize.) _contiguous and _aligned are things we have been considering changing, but I would have to think about it carefully to determine if they really are redundant.

> The only minimal differences are that _itemsize allows me to work
> with arrays of bytes without having any clue what the underlying type
> is (in some cases, _itemtype is "Unknown").  Secondly, I implemented
> a "Struct" _itemtype, and _itemformat is useful for this case.  (It's
> the same format string that the struct module in Python uses.)

It looks like you are trying to deal with records with these "structs". We deal with records (efficiently) in a completely different way. Take a look at the recarray module.

> Also, I specified 0 for _itemsize when the actual items aren't byte
> addressable.  In my module, this only occurred with the Bit type.  I
> figured specifying 0 like this could keep a UFunc that isn't Bit
> aware from stepping on memory that it isn't allowed to.

Again, we aren't sure how this works with numarray.

> -------------------------
>
> Next thought: Memory Mapping
>
> I really like the idea of having Python objects that map huge files a
> piece at a time without using all of available memory.  I've seen
> this in NumArray's charter as part of the reason for breaking away
> from Numeric, and I'm curious how you intend to address it.
>
> Right now, the only requirement for _data seems to be that it
> implement the PyBufferProcs.  For memory mapping something else is
> needed...
>
> I haven't implemented this, so take it as just my rambling thoughts:
>
> With the addition of 3 new, optional attributes to the NumArray
> object interface, I think this could be efficiently accomplished:
>
>     _mapproc
>     _mapmin
>     _mapmax
>
> If _mapproc is present and not None, then it points to a function
> whose responsibility it is to set _mapmin and _mapmax appropriately.
> _mapproc takes one argument, which is the desired byte offset into
> the virtual array.  This is probably easier to describe with code:
>
>     def _mapproc(self, offset):
>         unmap_the_old_range()
>         mmap_a_new_range_that_includes_byteoffset()
>         self._mapmin = minimum_of_new_range()
>         self._mapmax = maximum_of_new_range()
>
> In this way, when the delta between _mapmin and _mapmax is large
> enough, the UFuncs could act over a large contiguous portion of the
> _data array at a time before another remapping is necessary.  If the
> byteoffset that a UFunc needs to work with is outside of _mapmin and
> _mapmax, it must call _mapproc to remedy the situation.
>
> This puts a lot of work into UFuncs that choose to support this.  I
> suppose that is tough to avoid though.

We deal with memory mapping a completely different way. It's a bit late for me to go into it in great detail, but we wrap the standard library mmap module with a module that lets us manage memory mapped files. This module basically memory maps an entire file and then in effect mallocs segments of that file as buffer objects. This allocation of subsets is needed to ensure that overlapping memory map buffers don't happen. One can basically reserve part of the memory mapped file as a buffer. Once that is done, nothing else can use that part of the file for another buffer. We do not intend to handle memory maps as a way of sequentially mapping parts of the file to provide windowed views as your code segment above suggests. If you want a buffer that is the whole (large) file, you just get a mapped buffer to the whole thing. (Why wouldn't you?)

The above scheme is needed for our purposes because many of our data files contain multiple data arrays and we need a means of creating a numarray object for each one. Most of this machinery has already been implemented, but we haven't released it since our I/O package (for astronomical FITS files) is not yet at the point of being able to use it.

> Also, there are threading issues to think about here.  I don't know
> if UFuncs are going to release the Global Interpreter Lock, but if
> they do it's possible that multiple threads could have the same
> PyObject and try to _mapproc different offsets at different times.

To tell you the truth, we haven't dealt with the threading issue much. We think about it occasionally, but have deferred dealing with it until we have finished other aspects first. We do want to make it thread safe though.

Perry Greenfield
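A rough sketch of the reserve-a-segment scheme Perry describes, assuming a Unix mmap; the class and method names are illustrative, not the actual numarray memory-map wrapper:

    # Hedged sketch: map a whole file once, then hand out
    # non-overlapping slices of it as buffer objects.
    import mmap, os

    class MappedFile:
        def __init__(self, path):
            fd = os.open(path, os.O_RDWR)
            size = os.fstat(fd)[6]            # st_size
            self._map = mmap.mmap(fd, size)
            self._reserved = []               # (start, stop) pairs

        def reserve(self, start, stop):
            # Refuse overlapping reservations, as described above.
            for s, e in self._reserved:
                if start < e and stop > s:
                    raise ValueError, "overlaps an existing buffer"
            self._reserved.append((start, stop))
            return buffer(self._map, start, stop - start)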
From: Todd M. <jm...@st...> - 2002-04-11 21:35:34
cl...@sp... wrote:

> cl...@sp... writes:
> > Hello,
> > I'm trying to track down a segv when I do the B[:] operation on an
> > array, "B", that I've built as a view on external data.  During...
> [snip]
>
> To clarify my own somewhat non-sensical post: When I started
> composing my message, I was trying to figure out a bug in my own code
> that caused a crash while doing slice_array.  I've since fixed that
> bug.  However, in the process of figuring out what I was doing wrong
> I was browsing the Numeric source code.  While examining
> PyArray_Free(..) in arrayobject.c, I saw that it returns -1 whenever
> the number of dimensions is greater than 2, yet it has code that
> tests for when the number of dimensions equals 3.
>
> So ultimately, my post is just an alert that I think there might be
> some code that needs to be cleaned up.
>
> Thanks,
>     lacking-caffeine-ly yours
>     -chris

Looking at the code to PyArray_Free, I agree with Chris. Called to free a 2D array, I think that PyArray_Free leaks all of the row storage because ap->nd == 2, not 3:

    /* {%c++%} */
    extern int PyArray_Free(PyObject *op, char *ptr) {
        PyArrayObject *ap = (PyArrayObject *)op;
        int i, n;

        if (ap->nd > 2) return -1;
        if (ap->nd == 3) {
            n = ap->dimensions[0];
            for (i=0; i<n; i++) {
                free(((char **)ptr)[i]);
            }
        }
        if (ap->nd >= 2) {
            free(ptr);
        }
        Py_DECREF(ap);
        return 0;
    }
    /* {%c++%} */

Other opinions?

Todd

--
Todd Miller    jm...@st...
STSCI / SSG    (410) 338 4576
From: Perry G. <pe...@st...> - 2002-04-11 16:01:32
Hi Scott,

I've printed out your message and will try to read and understand it today. It may be a couple days before we can respond, so don't take a lack of an immediate response as disinterest.

Thanks,
Perry
From: Scott G. <xs...@ya...> - 2002-04-11 11:31:53
Hello All.

I'm interested in this project, and am curious to what level you are willing to accept outside contribution. I just tried to subscribe to the developers list, but I didn't realize that required admin approval. Hopefully it doesn't look like I was shaking the door without knocking first.

Is this list active? Is this the correct place to talk about Numarray?

A little about me: My name is Scott Gilbert, and I work as a software developer for a company called Rincon Research in Tucson, Arizona. We do a lot of digital signal processing/analysis among other things. In the last year or so, we've started to use Python in various capacities, and we're hoping to use it for more things. We need a good array module for various things. Some are similar to what it looks like Numarray is targeted at (fft, convolutions, etc...), and others are pretty different (providing buffers for reading data from specialized hardware, etc...).

About a week ago, I noticed that Guido over in Python developer land was willing to accept patches to the standard array module. As such, I thought I would take that opportunity to try and wedge some desirements and requirements I have into that baseline. Bummer for me, but they weren't exactly excited about bloating out arraymodule.c to meet my needs, and in retrospect that does make good sense. A number of people suggested that this might be a better place to try and get what I need. So here I am, poking around and wondering if I can play in your sandbox. If you're willing to let me contribute, my specific itches that I need to scratch are below. Otherwise - bummer, and I hope you all catch crabs... :-)

-----------------------------------

It's taken me a couple of days to understand what's going on in the source. I've read through the design docs, and the PEP, but it wasn't until I tried to re-implement it that it really clicked. My re-implementation of the array portion of what you're doing is attached. There are still some holes to fill in, but it's fairly complete and supports a whole bunch of things which yours does not (some of which you might even find useful: Pickling, Bit type). I'm pretty proud of it for only 400 lines of Python (most of which is the bazillion type declarations). It's probably riddled with bugs, as it's less than a day old...

After initially thinking that you guys were getting too clever, I've come to realize it's a pretty good design overall. Still, I have some changes I would like to make if you'll let me. (Both to the design and the implementation.)

-------------------------

Following your design for the Array stuff, I've been able to implement a pretty usable array class that supports the bazillion array types I need (Bit, Complex Integer, etc...). This gets me past my core requirements without polluting your world, but unfortunately my new XArray type doesn't play so well with your UFuncs. I think my users will definitely want to use your UFuncs when the time comes, so I want to remedy this situation.

The first change I would like to make is to rework your code that verifies that an object is a "usable" array. I think NumArray should only check for the interface required, not the actual type hierarchy. By this I mean that the minimum required to be a supported array type is that it support the correct attributes, not that it actually inherit from NDArray:

(quoting from your paper) something like:

    _data
    _shape
    _strides
    _byteoffset
    _aligned
    _contiguous
    _type
    _byteswap

Most of these are just integer fields, or tuples of integers. Ignoring _type for the moment, it appears that the interface required to be a NumArray is much less strict than actually requiring it to derive from NumArray. If you allow me to change a few functions (inputarray() in numarray.py is one small example), I could use my independent XArray class almost as is, and moreover I can implement new array objects (possibly as extension types) for crazy things like working with page aligned memory, memory mapping, etc...

Well, that's almost enough. The _type field poses a small problem of sorts. It looks like you don't require a _type to be derived from NumericType, and this is a good thing since it allows me (and others) to implement NumArray compatible arrays without actually requiring NumArray to be present. However, it would be nice if you declared a more comprehensive list of typenames - even if they aren't all implemented in NumArray proper. Who knows, maybe the SciPy guys have a use for complex integers or bit arrays. If you make a reasonable canonical list, our data could be passed back and forth even if NumArray doesn't know what to do with it. See my attached module for the types of things I'm thinking of. I'm not so concerned about the "Native Types" that are in there, but I do think committing to a list of named standard types matters. (I suspect there are others that are interested in standard C types even if the size changes between machines...)

If you were to specify a minimal interface like this in the short term, I could begin propagating my array module to my users. I could get my work done now, knowing that I'll be compatible with NumArray proper once it matures. I'd be willing to participate in making these changes if necessary.

Looking at the big picture, I think it's desirable that there really only be one official standard for ND arrays in the Python world. That way, the various independent groups can all share their independent work. You guys are the heir apparent, so to speak, from the Python guys' point of view. I don't know if you're trying to get all of NumArray into the Python distribution or not, but I suspect a good interim step would be to have a PEP that specifies what it means to be a NumArray or NDArray in minimal terms. Perhaps supplying an Array-only module in Python that implements this interface. Again, I'd be willing to help with all of this.

-------------------------

Ok, other suggestions...

Here is the list of things that your design document indicates are required to be a NumArray:

    _data
    _shape
    _strides
    _byteoffset
    _aligned
    _contiguous
    _type
    _byteswap

I believe that one could calculate the values for _aligned and _contiguous from the other fields, so they shouldn't really be part of the interface required (see the sketch after this message). I suspect it is useful for the C implementation of UFuncs to have this information in the NDInfo struct though, so while I would drop them from the attribute interface, I would delegate the task of calculating these values to getNDInfo() and/or getNumInfo().

I also notice that you chose _byteswap to indicate byteswapping is needed. I think a better choice would be to specify the endian-ness of the data (with an _endian attr), and have getNDInfo() and getNumInfo() calculate the _byteswap value for the NDInfo struct.

In my implementation, I came up with a slightly different list:

    self._endian
    self._offset
    self._shape
    self._stride
    self._itemtype
    self._itemsize
    self._itemformat
    self._buffer

The only minimal differences are that _itemsize allows me to work with arrays of bytes without having any clue what the underlying type is (in some cases, _itemtype is "Unknown"). Secondly, I implemented a "Struct" _itemtype, and _itemformat is useful for this case. (It's the same format string that the struct module in Python uses.)

Also, I specified 0 for _itemsize when the actual items aren't byte addressable. In my module, this only occurred with the Bit type. I figured specifying 0 like this could keep a UFunc that isn't Bit aware from stepping on memory that it isn't allowed to.

-------------------------

Next thought: Memory Mapping

I really like the idea of having Python objects that map huge files a piece at a time without using all of available memory. I've seen this in NumArray's charter as part of the reason for breaking away from Numeric, and I'm curious how you intend to address it.

Right now, the only requirement for _data seems to be that it implement the PyBufferProcs. For memory mapping something else is needed...

I haven't implemented this, so take it as just my rambling thoughts:

With the addition of 3 new, optional attributes to the NumArray object interface, I think this could be efficiently accomplished:

    _mapproc
    _mapmin
    _mapmax

If _mapproc is present and not None, then it points to a function whose responsibility it is to set _mapmin and _mapmax appropriately. _mapproc takes one argument, which is the desired byte offset into the virtual array. This is probably easier to describe with code:

    def _mapproc(self, offset):
        unmap_the_old_range()
        mmap_a_new_range_that_includes_byteoffset()
        self._mapmin = minimum_of_new_range()
        self._mapmax = maximum_of_new_range()

In this way, when the delta between _mapmin and _mapmax is large enough, the UFuncs could act over a large contiguous portion of the _data array at a time before another remapping is necessary. If the byteoffset that a UFunc needs to work with is outside of _mapmin and _mapmax, it must call _mapproc to remedy the situation.

This puts a lot of work into UFuncs that choose to support this. I suppose that is tough to avoid though.

Also, there are threading issues to think about here. I don't know if UFuncs are going to release the Global Interpreter Lock, but if they do it's possible that multiple threads could have the same PyObject and try to _mapproc different offsets at different times. It is possible to implement a mutex for the NumArray without requiring anything special from the PyObject that implements it...

-----------------------------

Ok. That's probably way too much content for an introductory email. I do have more thoughts on this stuff though. They'll just have to wait for another time.

Nice to meet you all,
    -Scott Gilbert
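A sketch of the claim above that _contiguous can be derived from the other attributes. It assumes a C-ordered layout with strides in bytes; the helper name is illustrative:

    # Hedged sketch: an array is contiguous when each stride equals the
    # number of bytes spanned by all of the faster-varying dimensions.
    def is_contiguous(shape, stride, itemsize):
        expected = itemsize
        for i in range(len(shape) - 1, -1, -1):
            if stride[i] != expected:
                return 0
            expected = expected * shape[i]
        return 1

    # e.g. shape (10, 10) of 8-byte items with stride (80, 8) gives 1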
From: Tim C. <tc...@op...> - 2002-04-08 20:27:30
Andrew McNamara wrote:
>
> The behavior I'm seeing with zero length Numeric arrays is not what I
> would have expected:
>
> >>> from Numeric import *
> >>> array([5]) != array([])
> zeros((0,), 'l')
> >>> array([]) == array([])
> zeros((0,), 'l')
> >>> allclose(array([5]), array([]))
> 1

The Numpy docs point out that == and != are implemented via the logical ufuncs, and that: "The ``logical'' ufuncs also perform their operations on arrays in elementwise fashion, just like the ``mathematical'' ones."

I think this explains the results you are seeing: if you do an element-wise comparison of a length-one array with a zero-length array, the Numpy recycling rule means that you should always get a zero-length result. Note that zeros((0,), 'l') is not zero, it is zero zeros. So although the results are surprising (at least to me, and to you), I think they are logically correct.

But, if that is the case, why does this hold (which I suspect reflects what you originally expected)?

>>> from Numeric import *
>>> array([5,6]) != array([])
1
>>> array([5,6]) == array([])
0

Tim C
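Given the recycling behaviour described above, an equality check that compares shapes before elements sidesteps the zero-length trap; a hedged sketch of such a helper (not a Numeric API):

    # Sketch: equality predicate that can't be fooled by zero-length
    # comparison results -- compare shapes first, then every element.
    from Numeric import alltrue, equal, ravel

    def arrays_equal(a, b):
        if a.shape != b.shape:
            return 0
        # alltrue() of a zero-length array is 1, so two empty arrays
        # of the same shape compare equal here.
        return alltrue(equal(ravel(a), ravel(b)))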
From: Andrew M. <an...@ob...> - 2002-04-08 06:31:38
The behavior I'm seeing with zero length Numeric arrays is not what I would have expected:

>>> from Numeric import *
>>> array([5]) != array([])
zeros((0,), 'l')
>>> array([]) == array([])
zeros((0,), 'l')
>>> allclose(array([5]), array([]))
1

This is with Numeric-20.3 (and Numeric-20.2.1) - is this behavior correct, or have I stumbled across a bug?

If both sides of the comparison are arrays with a length greater than zero, the comparisons work as expected:

>>> array([5]) != array([6])
array([1])
>>> array([5, 5]) != array([6])
array([1, 1])
>>> array([5]) != array([5])
array([0])

The problem came up when I was writing unittests for some Numpy code: under some circumstances, the code under test is expected to return a zero length array. I was somewhat surprised when I couldn't make the test fail! 8-)

--
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
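For the unittest situation described above, asserting on the shape directly avoids comparing against an empty array with ==; a minimal sketch, with the test class and the stand-in result being illustrative:

    # Sketch: assert that the code under test returned a zero-length
    # array by checking its shape, not by elementwise comparison.
    import unittest
    from Numeric import array

    class ZeroLengthTest(unittest.TestCase):
        def test_returns_empty(self):
            result = array([])   # stand-in for the code under test
            self.assertEqual(result.shape, (0,))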
From: Jochen <jo...@un...> - 2002-04-06 04:55:18
The following message is a courtesy copy of an article that has been posted to comp.lang.python.announce as well.

I have made a numerical integration package available at

,----
| http://python.jochen-kuepper.de/integrate
`----

This is a copy of the integrate module of scipy by Travis Oliphant plus some small changes and rearrangements to make it work standalone (well, it needs Numeric). All credits go to the scipy folks, esp. Travis; all errors should be blamed on me.

Greetings,
Jochen

PS: In the long run this module will be phased out in favor of scipy, but for now it might be useful for someone...

--
Einigkeit und Recht und Freiheit           http://www.Jochen-Kuepper.de
Liberté, Égalité, Fraternité                       GnuPG key: 44BCCD8E
Sex, drugs and rock-n-roll
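A hedged usage sketch, assuming the standalone package keeps the scipy integrate API it was copied from; the module name and the quad() signature are unverified assumptions:

    # Sketch only: assumes the package imports as `integrate` and
    # exposes scipy-style quad(func, a, b) -> (value, abserror).
    from integrate import quad
    from math import sin, pi

    value, error = quad(sin, 0.0, pi)   # should be close to 2.0
    print value, error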
From: David A. <Da...@Ac...> - 2002-04-05 21:37:41
|
Guido van Rossum wrote:
> > I would propose the following for multi-dimensional arrays:
> >
> > a = array.array('d', 20000, 20000)
> >
> > or:
> >
> > a = array.xarray('d', 20000, 20000)
>
> I just realized that multi-dimensional __getitem__ shouldn't be a big
> deal. The question is, given the above declaration, what a[0] should
> return: the same as a[0, 0] or a copy of a[0, 0:20000] or a reference
> to a[0, 0:20000].

Or a ValueError? In the face of ambiguity, refuse the temptation to guess. IIRC, this issue caused lots of problems in the numpy world. cc'ing Paul in case he wants to jump in to fill in my rusty memory.

Why does submitting a patch to arraymodule seem an easier path than modifying numarray or numpy to support what's needed? I believe that the goals of numarray aren't that different from what Scott is trying to do (memory management APIs, etc.). I'd like to see fewer multi-dimensional array objects, not more...

--david ascher |
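For reference, Numeric itself answers Guido's question by making a[0] a reference (a view of the first row) rather than a copy; a quick illustration:

    from Numeric import zeros

    a = zeros((3, 4), 'd')
    row = a[0]            # a view of row 0, shape (4,), not a copy
    row[1] = 7.0
    print a[0, 1]         # prints 7.0 -- the change shows through `a`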
From: Daniel D. K. <ke...@fe...> - 2002-04-05 16:22:36
|
Howdy: Shouldn't line 296 in MLab.py of Numeric 21.0, which currently reads:

    val = squeeze(dot(transpose(m)*conjugate(y)) / fact)

read:

    val = squeeze(dot(transpose(m),conjugate(y)) / fact)

Thanks,
D.Kelson
Carnegie Observatories
http://www.ociw.edu/~kelson |
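The two spellings behave very differently: with `*` the matrices are multiplied elementwise and dot() ends up called with a single argument, which fails outright; with the comma, dot() performs the intended matrix product. A quick demonstration on made-up small inputs:

    from Numeric import array, transpose, conjugate, dot

    m = array([[1.0, 2.0], [3.0, 4.0]])
    y = array([[1.0, 0.0], [0.0, 1.0]])

    good = dot(transpose(m), conjugate(y))    # intended matrix product

    try:
        bad = dot(transpose(m) * conjugate(y))   # elementwise *, then a
                                                 # one-argument dot()
    except TypeError:
        print "dot() takes two arguments -- hence the proposed fix"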
From: Pearu P. <pe...@ce...> - 2002-04-04 21:01:45
|
On Thu, 4 Apr 2002, Ray Drew wrote:

> Python 2.2, Numpy version='20.3'
>
> Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> IDLE 0.8 -- press F1 for help
> >>> from RandomArray import *
> >>> normal(3., 1., (5,))
> array([-3.78572679, -3.63714516, -3.01228334, -4.80211985, -2.57420304])
>
> Why am I getting negative values with Python 2.2? This happens consistently.
> Any help would be appreciated.

This is a bug in Numpy 20.3 and should be fixed in Numpy 21.0.

Pearu |
From: Ray D. <ray...@ya...> - 2002-04-04 10:26:16
|
Hi, Can anyone explain the following?

Python 2.1.1, Numpy version='20.2.0'

Python 2.1.1 (#20, Jul 20 2001, 01:19:29) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> from RandomArray import *
>>> normal(3., 1., (5,))
array([ 2.19091588,  2.44682837,  2.51790264,  4.26374364,  4.56880629])

Python 2.2, Numpy version='20.3'

Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> from RandomArray import *
>>> normal(3., 1., (5,))
array([-3.78572679, -3.63714516, -3.01228334, -4.80211985, -2.57420304])

Why am I getting negative values with Python 2.2? This happens consistently. Any help would be appreciated.

Thanks,
Ray |
From: Nils W. <nw...@me...> - 2002-04-03 08:32:20
|
Hi, I am looking for a suitable factorization of complex symmetric matrices. Where can I find a proper routine?

Nils |
From: <cl...@sp...> - 2002-04-01 18:58:03
|
cl...@sp... writes:
> > Hello,
> I'm trying to track down a segv when I do the B[:] operation on an
> array, "B", that I've built as a view on external data. During...
> [snip]

To clarify my own somewhat non-sensical post: When I started composing my message, I was trying to figure out a bug in my own code that caused a crash while doing slice_array. I've since fixed that bug. However, in the process of figuring out what I was doing wrong I was browsing the Numeric source code. While examining PyArray_Free(..) in arrayobject.c, I saw that it returns -1 whenever the number of dimensions is greater than 2, yet it has code that tests for when the number of dimensions equals 3. So ultimately, my post is just an alert that there might be some code that needs to be cleaned up.

Thanks, lacking-caffeine-ly yours
-chris |
From: <cl...@sp...> - 2002-04-01 17:01:22
|
Hello, I'm trying to track down a segv when I do the B[:] operation on an array, "B", that I've built as a view on external data. During the process I ran into the following code (Numeric-21.0):

/* {%c++%} */
extern int PyArray_Free(PyObject *op, char *ptr) {
    PyArrayObject *ap = (PyArrayObject *)op;
    int i, n;

    if (ap->nd > 2) return -1;
    if (ap->nd == 3) {
        n = ap->dimensions[0];
        for (i=0; i<n; i++) {
            free(((char **)ptr)[i]);
        }
    }
    if (ap->nd >= 2) {
        free(ptr);
    }
    Py_DECREF(ap);
    return 0;
}
/* {%c++%} */

The multiple, incompatible tests of ap->nd are the problem.

-chris |
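For what it's worth, a hedged sketch of what the function presumably intended -- loosening the first rank test so the nd == 3 branch is reachable. This is a guess at the original author's intent, not a patch that has been applied to Numeric:

/* {%c++%} */
extern int PyArray_Free(PyObject *op, char *ptr) {
    PyArrayObject *ap = (PyArrayObject *)op;
    int i, n;

    if (ap->nd > 3) return -1;      /* was: nd > 2, which made the    */
                                    /* nd == 3 branch below dead code */
    if (ap->nd == 3) {
        /* free the per-plane row pointers before the pointer block */
        n = ap->dimensions[0];
        for (i=0; i<n; i++) {
            free(((char **)ptr)[i]);
        }
    }
    if (ap->nd >= 2) {
        free(ptr);
    }
    Py_DECREF(ap);
    return 0;
}
/* {%c++%} */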
From: Paul F D. <pa...@pf...> - 2002-03-31 03:01:32
|
About random number generation with Numeric:

a. IMHO, RNG is the right choice if you are picky about the quality of the generator. This generator has a long history of heavy use. RandomArray is in the core because someone put it there early, not because it is the best. I do not claim to be an authority on this, but that is my understanding.

b. The suggestion made by one correspondent, that a generator should generate and throw away one value when the seed is set, sounds correct if viewed from the point of view of the initial seeding of a single stream. But many users need multiple streams that are independent and reproducible. This is done by saving the state of the generator and then restoring it later. It is important that this save/restore not change the results compared to not doing it. The presence or absence of another computation, or the frequency of dumps/restarts, that require a save/restore must not affect the result. Thus a decision to throw away a result must come from the application level. |
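A sketch of the save/restore pattern described in (b), using RandomArray's seed bookkeeping (seed() and get_seed() are RandomArray functions; treating the seed pair as the complete generator state is the simplifying assumption here):

    import RandomArray

    RandomArray.seed(1234, 5678)          # stream A
    a1 = RandomArray.random((3,))

    saved = RandomArray.get_seed()        # save A's state
    RandomArray.seed(42, 43)              # run an unrelated stream B
    b = RandomArray.random((100,))

    RandomArray.seed(saved[0], saved[1])  # restore A
    a2 = RandomArray.random((3,))         # continues A's sequence exactly
                                          # as if B had never run

A throw-away-a-value-on-seed policy built into the generator would shift a2 relative to an uninterrupted run, which is exactly the objection in (b).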
From: rob <ro...@py...> - 2002-03-30 16:59:17
|
After I post, I always see the dumb error. I wasn't including the 6x term in my finite difference equation. It now converges, but I get a weird-looking V map.

Rob.

Here is the fixed code:

from math import *
from Numeric import *
#
#*** ENTER DATA
filename= "out"
#
bobfile=open(filename+".bob","w")
print "\n"*30
NX=30                  # number of cells
NY=30
NZ=30
N=30                   # size of box
resmax=1e-3            # conj-grad tolerance

# allocate arrays
##Ex=zeros((NX+2,NY+2,NZ+2),Float)
##Ey=zeros((NX+2,NY+2,NZ+2),Float)
##Ez=zeros((NX+2,NY+2,NZ+2),Float)
p=zeros((NX+1,NY+1,NZ+1),Float)
q=zeros((NX+1,NY+1,NZ+1),Float)
r=zeros((NX+1,NY+1,NZ+1),Float)
u=zeros((NX+1,NY+1,NZ+1),Float)

u[0:N,0:N,0]=0         # box at 1V with one side at 0V
u[0:N,0,0:N]=1
u[0,0:N,0:N]=1
u[0:N,0:N,N]=1
u[0:N,N,0:N]=1
u[N,0:N,0:N]=1

r[1:NX-1,1:NY-1,1:NZ-1]=(u[1:NX-1,0:NY-2,1:NZ-1]+    # initialize r matrix
                         u[1:NX-1,2:NY,1:NZ-1]+
                         u[0:NX-2,1:NY-1,1:NZ-1]+
                         u[2:NX,1:NY-1,1:NZ-1]+
                         u[1:NX-1,1:NY-1,0:NZ-2]+
                         u[1:NX-1,1:NY-1,2:NZ]-
                         6*u[1:NX-1,1:NY-1,1:NZ-1])
p[...]=r[...]          # initialize p matrix
#
#**** START ITERATIONS
#
N=(NX-2)*(NY-2)*(NZ-2) # left over from Jacobi solution, not used
KK=0                   # iteration counter
res=0.0                # set residuals=0
while(1):
    q[1:NX-1,1:NY-1,1:NZ-1]=(6*p[1:NX-1,1:NY-1,1:NZ-1]-   # finite difference eq
                             p[1:NX-1,0:NY-2,1:NZ-1]-
                             p[1:NX-1,2:NY,1:NZ-1]-
                             p[0:NX-2,1:NY-1,1:NZ-1]-
                             p[2:NX,1:NY-1,1:NZ-1]-
                             p[1:NX-1,1:NY-1,0:NZ-2]-
                             p[1:NX-1,1:NY-1,2:NZ])
    # Calculate r dot p and p dot q
    rdotp = add.reduce(ravel(r[1:NX-1,1:NY-1,1:NZ-1]*p[1:NX-1,1:NY-1,1:NZ-1]))
    pdotq = add.reduce(ravel(p[1:NX-1,1:NY-1,1:NZ-1]*q[1:NX-1,1:NY-1,1:NZ-1]))
    # Set alpha value
    alpha = rdotp/pdotq
    # Update solution and residual
    u[1:NX-1,1:NY-1,1:NZ-1] += alpha*p[1:NX-1,1:NY-1,1:NZ-1]
    r[1:NX-1,1:NY-1,1:NZ-1] += - alpha*q[1:NX-1,1:NY-1,1:NZ-1]
    # calculate beta
    rdotq = add.reduce(ravel(r[1:NX-1,1:NY-1,1:NZ-1]*q[1:NX-1,1:NY-1,1:NZ-1]))
    beta = rdotq/pdotq
    # Set the new search direction
    p[1:NX-1,1:NY-1,1:NZ-1] = r[1:NX-1,1:NY-1,1:NZ-1] - beta*p[1:NX-1,1:NY-1,1:NZ-1]
    res = sort(ravel(r[1:NX-1,1:NY-1,1:NZ-1]))[-1]   # find largest residual
    # resmax = max(resmax,abs(res))
    KK+=1
    # print "Iteration Number %d Residual %1.2e" %(KK,abs(res))
    if (abs(res)<=resmax): break   # break out when residual is small enough

print "Number of Iterations ",KK |
From: rob <ro...@py...> - 2002-03-30 16:29:08
|
I'm experimenting with electrostatics now. I have an iterative Jacobi Laplace solver working, but it can be slow. It creates a beautiful 3D Animabob image. So I decided to try out a conjugate-gradient solver, which should be an order of magnitude better. It runs but doesn't converge. One thing I am wondering: where is the conjugate? In my FEM code, the solver really does use a conjugate, while this one, which I pieced together from several other programs, does not. Why is it called conjugate gradient without a conjugate? :)

Here is the code:

from math import *
from Numeric import *
#
#*** ENTER DATA
filename= "out"
#
bobfile=open(filename+".bob","w")
print "\n"*30
NX=30                  # number of cells
NY=30
NZ=30
resmax=1e-3            # conj-grad tolerance

# allocate arrays
##Ex=zeros((NX+2,NY+2,NZ+2),Float)
##Ey=zeros((NX+2,NY+2,NZ+2),Float)
##Ez=zeros((NX+2,NY+2,NZ+2),Float)
p=zeros((NX+1,NY+1,NZ+1),Float)
q=zeros((NX+1,NY+1,NZ+1),Float)
r=zeros((NX+1,NY+1,NZ+1),Float)
u=zeros((NX+1,NY+1,NZ+1),Float)

u[0:30,0:30,0]=0       # box at 1V with one side at 0V
u[0:30,0,0:30]=1
u[0,0:30,0:30]=1
u[0:30,0:30,30]=1
u[0:30,30,0:30]=1
u[30,0:30,0:30]=1

r[1:NX-1,1:NY-1,1:NZ-1]=(u[1:NX-1,0:NY-2,1:NZ-1]+    # initialize r matrix
                         u[1:NX-1,2:NY,1:NZ-1]+
                         u[0:NX-2,1:NY-1,1:NZ-1]+
                         u[2:NX,1:NY-1,1:NZ-1]+
                         u[1:NX-1,1:NY-1,0:NZ-2]+
                         u[1:NX-1,1:NY-1,2:NZ])
p[...]=r[...]          # initialize p matrix
#
#**** START ITERATIONS
#
N=(NX-2)*(NY-2)*(NZ-2) # left over from Jacobi solution, not used
KK=0                   # iteration counter
res=resmax=0.0         # set residuals=0
while(1):
    q[1:NX-1,1:NY-1,1:NZ-1]=(p[1:NX-1,0:NY-2,1:NZ-1]+    # finite difference eq
                             p[1:NX-1,2:NY,1:NZ-1]+
                             p[0:NX-2,1:NY-1,1:NZ-1]+
                             p[2:NX,1:NY-1,1:NZ-1]+
                             p[1:NX-1,1:NY-1,0:NZ-2]+
                             p[1:NX-1,1:NY-1,2:NZ])
    # Calculate r dot p and p dot q
    rdotp = add.reduce(ravel(r[1:NX-1,1:NY-1,1:NZ-1]*p[1:NX-1,1:NY-1,1:NZ-1]))
    pdotq = add.reduce(ravel(p[1:NX-1,1:NY-1,1:NZ-1]*q[1:NX-1,1:NY-1,1:NZ-1]))
    # Set alpha value
    alpha = rdotp/pdotq
    # Update solution and residual
    u[1:NX-1,1:NY-1,1:NZ-1] += alpha*p[1:NX-1,1:NY-1,1:NZ-1]
    r[1:NX-1,1:NY-1,1:NZ-1] += - alpha*q[1:NX-1,1:NY-1,1:NZ-1]
    # calculate beta
    rdotq = add.reduce(ravel(r[1:NX-1,1:NY-1,1:NZ-1]*q[1:NX-1,1:NY-1,1:NZ-1]))
    beta = rdotq/pdotq
    # Set the new search direction
    p[1:NX-1,1:NY-1,1:NZ-1] = r[1:NX-1,1:NY-1,1:NZ-1] - beta*p[1:NX-1,1:NY-1,1:NZ-1]
    res = sort(ravel(r[1:NX-1,1:NY-1,1:NZ-1]))[-1]   # find largest residual
    # resmax = max(resmax,abs(res))
    KK+=1
    # print "Iteration Number %d Residual %1.2e" %(KK,abs(res))
    if (abs(res)<=resmax): break   # break out when residual is small enough

print "Number of Iterations ",KK

--
-----------------------------
The Numeric Python EM Project
www.pythonemproject.com |
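On the terminology question: for real symmetric positive definite systems, "conjugate" refers to A-conjugacy of the successive search directions (p_i dot A p_j = 0 for i != j), not to complex conjugation, so no conjugate() call appears. A tiny standalone check of textbook CG steps on a 2x2 system (independent of the solver above):

    from Numeric import array, dot

    A = array([[4.0, 1.0], [1.0, 3.0]])
    b = array([1.0, 2.0])

    # Two CG steps from x0 = 0: the directions p0, p1 are A-conjugate.
    r0 = b                                    # residual of x0 = 0
    p0 = r0
    alpha0 = dot(r0, r0) / dot(p0, dot(A, p0))
    r1 = r0 - alpha0 * dot(A, p0)
    beta0 = dot(r1, r1) / dot(r0, r0)
    p1 = r1 + beta0 * p0

    print dot(p0, dot(A, p1))                 # 0.0: A-conjugate directions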
From: rob <ro...@py...> - 2002-03-30 16:00:50
|
Their site is dead here. If anyone has the latest copy of Weave tar'd up, could you send it to me? I finally have a bug report to make. For some reason, Weave still can't find libstdc++ on FreeBSD.

Rob.

--
-----------------------------
The Numeric Python EM Project
www.pythonemproject.com |
From: Edward C. J. <edc...@er...> - 2002-03-30 15:54:47
|
IM (pronounced with a long I) is a Python module that makes it easy to use Numeric and PIL together in programs. Typical functions in IM are:

Open: Opens an image file using PIL and converts it to Numeric, PIL, or OpenCV formats.
Save: Converts an array to PIL and saves it to a file.
Array_ToArrayCast: Converts images between formats and between pixel types.

In addition to Numeric and PIL, IM works with the Intel OpenCV computer vision system (http://www.intel.com/research/mrl/research/opencv/). OpenCV is available for Linux at the OpenCV Yahoo Group (http://groups.yahoo.com/group/OpenCV/).

IM currently runs under Linux only. It should not be too difficult to port the basic IM system to Windows or Mac. The OpenCV wrapper is large and complex and uses SWIG. It will be harder to port. The IM system appears to be pretty stable. On the other hand, the OpenCV wrapper is probably very buggy.

To download the software go to http://members.tripod.com/~edcjones/pycode.html and download "PyCV.032502.tgz".

Edward C. Jones
edc...@ho... |
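A hypothetical round trip based on the function descriptions above (the argument names and order are guesses from the announcement, not documented signatures):

    import IM

    arr = IM.Open("input.png", "Numeric")   # assumed: filename + target format
    arr = 255 - arr                         # any Numeric processing
    IM.Save(arr, "output.png")              # assumed argument order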
From: Chad N. <cn...@ma...> - 2002-03-30 02:00:13
|
> <jo...@oh...> writes:
> Take a look at the first number in each run!

That is because the random starting seed is (probably, I haven't looked at the code) set from the clock, and doesn't change all that much from run to run. You'll see similar results when you substitute:

    print "Clock at time:", i, ":", RandomArray.random_integers(10)

or

    print "Clock at time:", i, ":", RandomArray.uniform(1, 10)

into your code. The part before the decimal point is always the same on the first call of each run (assuming you run them at roughly the same time). Note that the 'seed' is really the internal state of the RNG and changes at each call. You could call the random function a few dozen times before using results, or hash the first result and use that as a new seed, etc. But basically, the generator will produce similar initial results (i.e. on the first call) for similar seeds, which is what the time-based value is causing.

I'd propose that the implementation, when setting the seed from the time, generate at least one dummy value before returning results.

--
Chad Netzer
chad.netzer at stanfordalumni.org |
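A sketch of the discard-some-draws workaround Chad suggests, at the application level (the burn-in count of 50 is an arbitrary choice):

    import RandomArray

    RandomArray.seed()                 # seed from the system clock
    for i in range(50):
        RandomArray.random()           # discard early, seed-correlated draws

    print RandomArray.uniform(1, 10)   # now use the stream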