From: Travis O. <oli...@ee...> - 2006-07-11 21:15:22

Ed Schofield wrote:

> Last week's discussion on rand() and randn() seemed to indicate a
> sentiment that they ought to take tuples for consistency with ones,
> zeros, eye, identity, and empty -- that, although they are supposed
> to be convenience functions, they are inconvenient precisely because
> of their inconsistency with these other functions. This issue has
> been raised many times over the past several months.
>
> Travis made a change in r2572 to allow tuples as arguments, then took
> it out again a few hours later, apparently unsure about whether this
> was a good idea.
>
> I'd like to call for a vote on what people would prefer, and then ask
> Travis to make a final pronouncement before the feature freeze.

This is my thinking about the rand and randn situation: I really like the
rand and randn functions. I use them all the time and want quick and easy
access to them. In retrospect, they probably should have been placed in a
separate "user" namespace like pylab or numlab to distinguish them from
"numpy the library." But we don't have anything like that in place at this
point, so what to do now?

I'm opposed to any suggestion to get rid of the rand(3,3) calling syntax.
One reason is that I like the syntax for this function and I've used it a
lot. It comes from MLab in Numeric, so it is needed to support
compatibility with Numeric as well. We can't just get rid of it entirely.
Another big reason is that there are already functions that use ONLY the
tuple syntax to do exactly the same thing (look at the docstrings for rand
and randn to get the names). If you are opposed to the rand and randn
syntax, just ignore those functions and use

    numpy.random.random_sample(tuple)
    numpy.random.standard_normal(tuple)

So, I'm opposed to getting rid of the *args-based syntax.

My feelings are weaker about adding the capability for rand and randn to
also accept a tuple. I did test it out, and it does seem feasible to add
this feature at the cost of one additional comparison. I know Robert is
opposed to it, but I'm not sure I understand completely why. Please
correct me if I'm wrong, but I think it has something to do with making
random_sample and standard_normal irrelevant and unnecessary, combined
with my hypothesis that Robert doesn't like the *args-style syntax: adding
tuple support to rand and randn would make them the default random-number
generators, and code that is "harder to read" because of the different
usages would proliferate. Personally, I'm not that opposed to "different"
calling conventions, but I respect the opinions of the wider NumPy
community.

In sum: rand and randn have to live somewhere and accept their current
calling convention. I would not be opposed at this point to taking them
out of top-level numpy and putting them instead into a "numlab" namespace
(for lack of a better name). However, I'm not so sure that a "numlab"
namespace really solves any of the issues being debated here, so I'm not
going to spend any time doing it...

-Travis
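
A minimal sketch of the dual calling convention being debated (the wrapper
name rand_compat is made up for illustration; NumPy's actual rand and
randn live in numpy.random and are implemented differently):

    import numpy as np

    def rand_compat(*args):
        # Accept either rand_compat(3, 3) or rand_compat((3, 3)).
        # The tuple form costs one extra check, as Travis notes.
        if len(args) == 1 and isinstance(args[0], tuple):
            args = args[0]
        return np.random.random_sample(args)

    a = rand_compat(3, 3)     # MLab-style *args call
    b = rand_compat((3, 3))   # tuple call, consistent with ones/zeros
    assert a.shape == b.shape == (3, 3)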
From: Sasha <nd...@ma...> - 2006-07-11 21:03:51

Here is the solution of a half of the problem:

    >>> a = array([1,2,3,0,40,50,60,0,7,8,9])
    >>> 5+where(logical_and.accumulate(a[5:]!=0))
    array([5, 6])

the rest is left as an exercise to the reader :-)  Hint: a[::-1] will
reverse a.

On 7/11/06, Mathew Yeates <my...@jp...> wrote:
> I can handle the following problem by iterating through some indices but
> I'm looking for a more elegant solution.
>
> If I have a 1d array, I want to find a contiguous nonzero region about a
> given index. For example, if a=[1,2,3,0,40,50,60,0,7,8,9] and we start
> with the index of 5, then I want the indices 4,5,6
>
> Any gurus out there?
>
> Mathew
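
Completing the exercise, a sketch of one way to put the two halves
together (the function name contiguous_region and the use of flatnonzero
and union1d are my choices for illustration, not from the thread):

    import numpy as np

    def contiguous_region(a, i):
        # Indices of the contiguous nonzero run containing index i.
        a = np.asarray(a)
        nz = a != 0
        if not nz[i]:
            return np.array([], dtype=int)
        # Extend right while values stay nonzero.
        right = i + np.flatnonzero(np.logical_and.accumulate(nz[i:]))
        # Extend left the same way, on the reversed prefix.
        left = i - np.flatnonzero(np.logical_and.accumulate(nz[:i + 1][::-1]))
        return np.union1d(left, right)

    a = [1, 2, 3, 0, 40, 50, 60, 0, 7, 8, 9]
    print(contiguous_region(a, 5))   # -> [4 5 6]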
From: Mathew Y. <my...@jp...> - 2006-07-11 20:49:29

I can handle the following problem by iterating through some indices but
I'm looking for a more elegant solution.

If I have a 1d array, I want to find a contiguous nonzero region about a
given index. For example, if a=[1,2,3,0,40,50,60,0,7,8,9] and we start
with the index of 5, then I want the indices 4,5,6.

Any gurus out there?

Mathew
From: <ke...@ca...> - 2006-07-11 20:10:49

Although I'm not really up to speed on the array interface, accessing the
pixel data in a PIL image isn't really that difficult in C/C++... the only
challenge I would see (besides tracking the channels/padding correctly...
trivial) would be getting the pointer into Python to pass it to NumPy.

I've written a few modules in C that directly modify the PIL buffer data,
with simple code such as attached (lines 186-214 show it clearly). (This
is a module that does unsharp-masking and gaussian blur on PIL images...
Fredrik is welcome to include this directly into the PIL library if he
sees fit, for which I'll gladly remove ANY licensing restrictions.)

Kevin.

----- Original Message -----
From: "Travis Oliphant" <oli...@ie...>
To: <ima...@py...>
Cc: "numpy-discussion" <num...@li...>
Sent: Tuesday, July 11, 2006 9:37 PM
Subject: Re: [Image-SIG] Quicker image transfer, tobuffer?

> Here is a simple approach to allowing the PIL to export the array
> interface. This allows NumPy to create a suitable array from a PIL
> image very easily.
>
> At the top of Image.py add the following:
>
>     if sys.byteorder == 'little':
>         _ENDIAN = '<'
>     else:
>         _ENDIAN = '>'
>
>     _MODE_CONV = {
>         # official modes
>         "1": ('|b1', None),
>         "L": ('|u1', None),
>         "I": ('%si4' % _ENDIAN, None),
>         "F": ('%sf4' % _ENDIAN, None),
>         "P": ('|u1', None),
>         "RGB": ('|u1', 3),
>         "RGBX": ('|u1', 4),
>         "RGBA": ('|u1', 4),
>         "CMYK": ('|u1', 4),
>         "YCbCr": ('|u1', 4),
>         # Experimental modes include I;16, I;16B, RGBa, BGR;15,
>         # and BGR;24.  Use these modes only if you know exactly
>         # what you're doing...
>     }
>
>     def _conv_type_shape(im):
>         shape = im.size
>         typ, extra = _MODE_CONV[im.mode]
>         if extra is None:
>             return shape, typ
>         shape += (extra,)
>         return shape, typ
>
> In the Image class structure add:
>
>     def __get_array_interface__(self):
>         new = {}
>         shape, typestr = _conv_type_shape(self)
>         new['shape'] = shape
>         new['typestr'] = typestr
>         new['data'] = self.tostring()
>         return new
>
>     __array_interface__ = property(__get_array_interface__, None,
>                                    doc="array interface")
>
> With this addition you can then do
>
>     import Image, numpy
>
>     im = Image.open('lena.jpg')
>     a = numpy.asarray(im)
>
> and you will get a suitable read-only array pointing to the string
> produced by tostring.
>
> This would be a nice thing to add to the PIL.
>
> -Travis Oliphant
From: Jonathan T. <jon...@ut...> - 2006-07-11 20:07:48

I agree the real problem with matrices is that they seem awkward to work
with compared to arrays because numpy seems so array-centric. The only
advantage I see is getting .T to do transposes and * to do matrix
multiplication. I hope numpy reaches a point where it is as natural to use
matrices as arrays.

I'd also vote for the inclusion of the following two functions, col and
row. Inspired by R equivalents, they let you do some indexing very easily,
such as getting the values of the upper triangle of the matrix. E.g.

    vals = m[row(m) > col(m)]

Cheers,

Jon.

    def col(m):
        """col(m) returns a matrix of the same size as m where each
        element contains an integer denoting which column it is in.
        For example,

        >>> m = eye(3)
        >>> m
        array([[1, 0, 0],
               [0, 1, 0],
               [0, 0, 1]])
        >>> col(m)
        array([[0, 1, 2],
               [0, 1, 2],
               [0, 1, 2]])
        """
        assert len(m.shape) == 2, "should be a matrix"
        return N.indices(m.shape)[1]

    def row(m):
        """row(m) returns a matrix of the same size as m where each
        element contains an integer denoting which row it is in.
        For example,

        >>> m = eye(3)
        >>> m
        array([[1, 0, 0],
               [0, 1, 0],
               [0, 0, 1]])
        >>> row(m)
        array([[0, 0, 0],
               [1, 1, 1],
               [2, 2, 2]])
        """
        assert len(m.shape) == 2, "should be a matrix"
        return N.indices(m.shape)[0]

On 7/7/06, Ed Schofield <sch...@ft...> wrote:
> Bill Baxter wrote:
> > I think the thread to this point can be pretty much summarized by:
> >
> > while True:
> >     Bill: "2D transpose is common so it should have a nice syntax"
> >     Tim, Robert, Sasha, and Ed: "No it's not."
> >
> > Very well. I think it may be a self-fulfilling prophecy, though.
> > I.e. if matrix operations are cumbersome to use, then -- surprise
> > surprise -- the large user base for matrix-like operations never
> > materializes. Potential converts just give numpy the pass, and go to
> > Octave or Scilab, or stick with Matlab, R or S instead.
> >
> > Why all the fuss about the .T? Because any changes to functions (like
> > making ones() return a matrix) can easily be worked around on the user
> > side, as has been pointed out. But as far as I know -- do correct me
> > if I'm wrong -- there's no good way for a user to add an attribute to
> > an existing class. After switching from matrices back to arrays, .T
> > was the only thing I really missed from numpy.matrix.
> >
> > I would be all for a matrix class that was on equal footing with array
> > and as easy to use as matrices in Matlab. But my experience using
> > numpy.matrix was far from that, and, given the lack of enthusiasm for
> > matrices around here, that seems unlikely to change. However, I'm
> > anxious to see what Ed has up his sleeves in the other thread.
>
> Okay ... <Ed rolls up his sleeves> ... let's make this the thread ;)
> I'd like to know why you, Sven, and anyone else on the list have gone
> back to using arrays after trying matrices. What was inconvenient about
> them? I'd like a nice juicy list. The whole purpose of the matrix
> class is to simplify 2d linear algebra. Where is it failing?
>
> I also went back to arrays after trying out matrices for some 2d linear
> algebra tasks, since I found that using matrices increased my code's
> complexity. I can describe the problems I had with them later, but
> first I'd like to hear of others' experiences.
>
> I'd like to help to make matrices more usable. Tell me what you want,
> and I'll work on some patches.
>
> -- Ed
From: Travis O. <oli...@ie...> - 2006-07-11 19:37:21

Here is a simple approach to allowing the PIL to export the array
interface. This allows NumPy to create a suitable array from a PIL image
very easily.

At the top of Image.py add the following:

    if sys.byteorder == 'little':
        _ENDIAN = '<'
    else:
        _ENDIAN = '>'

    _MODE_CONV = {
        # official modes
        "1": ('|b1', None),
        "L": ('|u1', None),
        "I": ('%si4' % _ENDIAN, None),
        "F": ('%sf4' % _ENDIAN, None),
        "P": ('|u1', None),
        "RGB": ('|u1', 3),
        "RGBX": ('|u1', 4),
        "RGBA": ('|u1', 4),
        "CMYK": ('|u1', 4),
        "YCbCr": ('|u1', 4),
        # Experimental modes include I;16, I;16B, RGBa, BGR;15,
        # and BGR;24.  Use these modes only if you know exactly
        # what you're doing...
    }

    def _conv_type_shape(im):
        shape = im.size
        typ, extra = _MODE_CONV[im.mode]
        if extra is None:
            return shape, typ
        shape += (extra,)
        return shape, typ

In the Image class structure add:

    def __get_array_interface__(self):
        new = {}
        shape, typestr = _conv_type_shape(self)
        new['shape'] = shape
        new['typestr'] = typestr
        new['data'] = self.tostring()
        return new

    __array_interface__ = property(__get_array_interface__, None,
                                   doc="array interface")

With this addition you can then do

    import Image, numpy

    im = Image.open('lena.jpg')
    a = numpy.asarray(im)

and you will get a suitable read-only array pointing to the string
produced by tostring.

This would be a nice thing to add to the PIL.

-Travis Oliphant
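
For a sense of what this yields on the consumer side, a sketch (the
512x512 image size is an assumption for illustration):

    import Image, numpy

    im = Image.open('lena.jpg')    # assume a 512x512 RGB image
    a = numpy.asarray(im)

    print im.size    # (512, 512)
    print a.shape    # (512, 512, 3) -- the extra axis comes from the
                     # channel count in _MODE_CONV["RGB"]
    print a.dtype    # uint8, from the '|u1' typestr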
From: Perry G. <pe...@st...> - 2006-07-11 18:30:24

On Jul 11, 2006, at 2:04 PM, Travis Oliphant wrote:

> We will all welcome Fredrik's comments. But I think he is exaggerating
> the situation a bit.
>
> It is true that the Imaging structure of PIL is a more general memory
> model for 2-d arrays. My understanding is that it basically allows for
> an array of pointers to memory where each pointer points to a different
> line (or chunk) in the image (kind of like a C-array).
>
> So, basically, the problem is that you may have an image that cannot be
> described as a single block of memory or a singly-strided array. I
> believe, though, that sometimes you do have a PIL image that is
> basically a single chunk of memory.
>
> So, while a general-purpose "memory-sharing" situation might be
> difficult, I think some sharing (of each chunk, for example) could be
> done.
>
> Even still, the array interface (the Python side) does technically
> handle the PIL case, because the default is to use the sequence
> interface to access elements of the array.
>
> It would be nice if Fredrik were more willing to help establish a
> standard, though. Calling it "not close" but giving no alternative is
> not helpful.

To expand on this a bit, I think Fredrik is misreading the intent of the
array interface somewhat. It's not that we are promising that the array
data structure will suit his every need. Far from it. It's that it allows
him (or us) to provide convenience functions that allow accessing data in
PIL without jumping through lots of annoying hoops. It should be possible
to convert PIL image data to arrays easily. Does that mean that all
associated information is propagated as part of the array object? Of
course not. But this information is obtainable by other means anyway.

Even in the case where different chunks of an image are in different
memory locations and there is no simple way of avoiding copying the data,
so what? I still much prefer the data be copied to an array so that numpy
functionality can be applied to it, rather than having to call a sequence
of operations to convert the data to a string and then to an array. At
least one copy is avoided, but avoiding copying isn't the entire
justification.

Think of it this way: if PIL's strength is that it can convert images
between many different formats (no doubt copying data in the process),
then arrays should be one of the supported formats, no?

Perry
From: David H. <dav...@gm...> - 2006-07-11 18:27:38

Tim Hochberg wrote:
> My first question is: why? What's the attraction in returning a sorted
> answer here? Returning an unsorted array is potentially faster,
> depending on the algorithm chosen, and sorting after the fact is
> trivial. If one was going to spend extra complexity on something, I'd
> think it would be better spent on preserving the input order.

There is a unique function in matlab that returns a sorted vector. I think
a lot of people will expect numpy and matlab functions with identical
names to behave similarly. If we want to preserve the input order, we'd
have to choose a convention about whose value's order is retained: do we
keep the order of the first value found or the last one?

Here is the benchmark. Sorry Norbert for not including your code the first
time; it turns out that with Alan's suggestion it's the fastest one both
for lists and arrays.

    x = rand(100000)*100
    x = x.astype('i')
    l = list(x)

For array x:

    In [166]: timeit unique_alan(x)   # with set instead of dict
    100 loops, best of 3: 8.8 ms per loop

    In [167]: timeit unique_norbert(x)
    100 loops, best of 3: 8.8 ms per loop

    In [168]: timeit unique_sasha(x)
    100 loops, best of 3: 10.8 ms per loop

    In [169]: timeit unique(x)
    10 loops, best of 3: 50.4 ms per loop

    In [170]: timeit unique1d(x)
    10 loops, best of 3: 13.2 ms per loop

For list l:

    In [196]: timeit unique_norbert(l)
    10 loops, best of 3: 29 ms per loop

    In [197]: timeit unique_alan(l)   # with set instead of dict
    10 loops, best of 3: 14.5 ms per loop

    In [193]: timeit unique(l)
    10 loops, best of 3: 29.6 ms per loop

Note: in Norbert's function, setting sort=False for flattenable objects
returns a sorted array anyway. So I'd suggest removing the sort keyword,
sorting if the datatype is sortable, and not sorting if it isn't.

David
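
For reference, a rough sketch of the two strategies being timed here (the
exact code David ran is not shown in this message, so these
reconstructions are assumptions; the names unique_set and unique_sorted
are mine):

    import numpy as np

    def unique_set(arr):
        # Hash-based: build a Python set, then sort the result.
        return np.sort(np.asarray(list(set(arr))))

    def unique_sorted(arr):
        # Sort-based: sort a flattened copy, keep elements that differ
        # from their predecessor.
        tmp = np.sort(np.asarray(arr).ravel())
        keep = np.concatenate(([True], tmp[1:] != tmp[:-1]))
        return tmp[keep]

    x = (np.random.rand(100000) * 100).astype('i')
    assert (unique_set(x) == unique_sorted(x)).all()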
From: Travis O. <oli...@ee...> - 2006-07-11 18:04:40

Christopher Barker wrote:
> Hi all,
>
> Over on the PIL mailing list, someone asked about some possible
> additions to PIL to facilitate copy-free moving of data between PyGame
> and PIL. I sent a note suggesting that the array interface might be
> just the ticket. These were the responses:
>
> Pete Shinners wrote:
> > Yeah, this would be an ideal solution. I hope more of the base array
> > stuff can get into Python 2.6.
> >
> > We did look at the array proposal, but haven't been able to jump on
> > it yet because of adoption. I guess it needs another package or two
> > to get the ball rolling.
>
> So that's an advocacy issue, and an illustration of why getting it into
> the standard lib is critical.
>
> Fredrik Lundh wrote:
> > unfortunately, the "array interface" model isn't even close to be
> > able to describe a PIL image memory (the "Imaging" structure)...

We will all welcome Fredrik's comments. But I think he is exaggerating the
situation a bit.

It is true that the Imaging structure of PIL is a more general memory
model for 2-d arrays. My understanding is that it basically allows for an
array of pointers to memory where each pointer points to a different line
(or chunk) in the image (kind of like a C-array).

So, basically, the problem is that you may have an image that cannot be
described as a single block of memory or a singly-strided array. I
believe, though, that sometimes you do have a PIL image that is basically
a single chunk of memory.

So, while a general-purpose "memory-sharing" situation might be difficult,
I think some sharing (of each chunk, for example) could be done.

Even still, the array interface (the Python side) does technically handle
the PIL case, because the default is to use the sequence interface to
access elements of the array.

It would be nice if Fredrik were more willing to help establish a
standard, though. Calling it "not close" but giving no alternative is not
helpful.

> Darn. It was my understanding that we thought that it was close to
> being able to describe an image... so what have we missed? I'm out of
> my technical depth, but I've encouraged Fredrik to bring the discussion
> here.

-Travis
From: Christopher B. <Chr...@no...> - 2006-07-11 17:46:58

Hi all,

Over on the PIL mailing list, someone asked about some possible additions
to PIL to facilitate copy-free moving of data between PyGame and PIL. I
sent a note suggesting that the array interface might be just the ticket.
These were the responses:

Pete Shinners wrote:
> Yeah, this would be an ideal solution. I hope more of the base array
> stuff can get into Python 2.6.
>
> We did look at the array proposal, but haven't been able to jump on it
> yet because of adoption. I guess it needs another package or two to get
> the ball rolling.

So that's an advocacy issue, and an illustration of why getting it into
the standard lib is critical.

Fredrik Lundh wrote:
> unfortunately, the "array interface" model isn't even close to be able
> to describe a PIL image memory (the "Imaging" structure)...

Darn. It was my understanding that we thought that it was close to being
able to describe an image... so what have we missed. I'm out of my
technical depth, but I've encouraged Fredrik to bring the discussion here.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT
7600 Sand Point Way NE, Seattle, WA 98115
(206) 526-6959 voice / (206) 526-6329 fax / (206) 526-6317 main reception
Chr...@no...
From: Robert C. <cim...@nt...> - 2006-07-11 16:49:42

Norbert Nemec wrote:
> unique1d is based on ediff1d, so it really calculates many differences
> and compares those to 0.0
>
> This is inefficient, even though this is hidden by the general
> inefficiency of Python (It might be the reason for the two
> milliseconds, though)
>
> What is more: subtraction works only for numbers, while the various
> proposed versions use only comparison which works for any data type
> (as long as it can be sorted)

I agree that unique1d works only for numbers, but that is what it was
meant for... well, for integers only, in fact - everyone here surely
knows that comparing floats with != does not work well. Note also that it
was written before logical indexing and other neat stuff were possible in
numpy - every improvement is welcome!

(btw. I cannot recall why I used subtraction and testing for zero instead
of just comparisons - maybe remnants from my old matlab days and its
logical arrays - ediff1d should disappear from the other functions in
arraysetops)

> My own version tried to capture all possible cases that the current
> unique captures.
>
> Sasha's version only works for numpy arrays and has a problem for
> arrays with all identical entries.
>
> David's version only works for numpy arrays of types that can be
> converted to float.

comparing floats...

> I would once more propose to use my own version as given before:
>
>     def unique(arr, sort=True):
>         if hasattr(arr, 'flatten'):
>             tmp = arr.flatten()
>             tmp.sort()
>             idx = concatenate([True], tmp[1:] != tmp[:-1])
>             return tmp[idx]
>         else:  # for compatibility:
>             set = {}
>             for item in inseq:
>                 set[item] = None
>             if sort:
>                 return asarray(sorted(set.keys()))
>             else:
>                 return asarray(set.keys())

Have you considered using set instead of dict? Just curious :-)

r.
From: Pau G. <pau...@gm...> - 2006-07-11 16:24:36

On 7/11/06, Travis Oliphant <oli...@ie...> wrote:
> Pau Gargallo wrote:
> > hi,
> >
> > looking at the upcasting table at
> > http://www.scipy.org/Tentative_NumPy_Tutorial#head-4c1d53fe504adc97baf27b65513b4b97586a4fc5
> > I saw that int's are sometimes casted to uint's.
> >
> > In [3]: a = array([3],int16)
> > In [5]: b = array([4],uint32)
> > In [7]: a+b
> > Out[7]: array([7], dtype=uint32)
> >
> > is that intended?
>
> It's a bug. The result should be int64. I've fixed it in SVN.

Thanks!!
From: Robert K. <rob...@gm...> - 2006-07-11 16:24:07

Tim Hochberg wrote:
> Norbert Nemec wrote:
> > unique1d is based on ediff1d, so it really calculates many differences
> > and compares those to 0.0
> >
> > This is inefficient, even though this is hidden by the general
> > inefficiency of Python (It might be the reason for the two
> > milliseconds, though)
> >
> > What is more: subtraction works only for numbers, while the various
> > proposed versions use only comparison which works for any data type
> > (as long as it can be sorted)
>
> My first question is: why? What's the attraction in returning a sorted
> answer here? Returning an unsorted array is potentially faster,
> depending on the algorithm chosen, and sorting after the fact is
> trivial. If one was going to spend extra complexity on something, I'd
> think it would be better spent on preserving the input order.

One issue is that uniquifying numpy arrays using Python dicts, while
expected O(n) in terms of complexity, is really slow compared to sorting
because of the overhead in getting the elements out of the numpy arrays
and into Python objects. For the cases where sorting works (your caveat
below is quite correct), it's really quite good for numpy arrays. OTOH, if
one were to implement a hash table in C, that might potentially be faster
and more robust, but that is spending a *large* amount of extra
complexity.

> Second, some objects can be compared for equality and hashed, but not
> sorted (Python's complex numbers come to mind). If one is going to
> worry about subtraction so as to keep things general, it makes sense to
> also avoid sorting as well, Sasha's slick algorithm notwithstanding.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
From: Tim H. <tim...@co...> - 2006-07-11 16:06:37

Tim Hochberg wrote:
> Norbert Nemec wrote:
> > unique1d is based on ediff1d, so it really calculates many differences
> > and compares those to 0.0
> >
> > This is inefficient, even though this is hidden by the general
> > inefficiency of Python (It might be the reason for the two
> > milliseconds, though)
> >
> > What is more: subtraction works only for numbers, while the various
> > proposed versions use only comparison which works for any data type
> > (as long as it can be sorted)
>
> My first question is: why? What's the attraction in returning a sorted
> answer here? Returning an unsorted array is potentially faster,
> depending on the algorithm chosen, and sorting after the fact is
> trivial. If one was going to spend extra complexity on something, I'd
> think it would be better spent on preserving the input order.
>
> Second, some objects can be compared for equality and hashed, but not
> sorted (Python's complex numbers come to mind). If one is going to
> worry about subtraction so as to keep things general, it makes sense to
> also avoid sorting as well, Sasha's slick algorithm notwithstanding.
>
> Third, I propose that whatever the outcome of the sorting issue, unique
> should have the same interface as the other structural array
> operations. That is:
>
>     unique(anarray, axis=0):
>         ...
>
> The default axis=0 is for compatibility with the other, somewhat
> similar functions. Axis=None would return the flattened, uniquified
> data, axis=# would uniquify the result along that axis.

Hmmm. Of course that precludes it returning an actual array for
axis != None. That might be considered suboptimal...

-tim

> Regards,
>
> -tim
From: Travis O. <oli...@ie...> - 2006-07-11 16:03:50

Pau Gargallo wrote:
> hi,
>
> looking at the upcasting table at
> http://www.scipy.org/Tentative_NumPy_Tutorial#head-4c1d53fe504adc97baf27b65513b4b97586a4fc5
> I saw that int's are sometimes casted to uint's.
>
> In [3]: a = array([3],int16)
> In [5]: b = array([4],uint32)
> In [7]: a+b
> Out[7]: array([7], dtype=uint32)
>
> is that intended?

It's a bug. The result should be int64. I've fixed it in SVN.

-Travis
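
The promotion Travis describes can be verified directly in modern NumPy
(np.promote_types was added well after this 2006 thread, so this is a
present-day check rather than something available in 0.9.x):

    import numpy as np

    # int16 and uint32 have no common unsigned type that can hold both,
    # so the promoted type is the next larger signed integer.
    print(np.promote_types(np.int16, np.uint32))                        # int64
    print((np.array([3], np.int16) + np.array([4], np.uint32)).dtype)   # int64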
From: Tim H. <tim...@co...> - 2006-07-11 16:02:33

Norbert Nemec wrote:
> unique1d is based on ediff1d, so it really calculates many differences
> and compares those to 0.0
>
> This is inefficient, even though this is hidden by the general
> inefficiency of Python (It might be the reason for the two
> milliseconds, though)
>
> What is more: subtraction works only for numbers, while the various
> proposed versions use only comparison which works for any data type
> (as long as it can be sorted)

My first question is: why? What's the attraction in returning a sorted
answer here? Returning an unsorted array is potentially faster, depending
on the algorithm chosen, and sorting after the fact is trivial. If one was
going to spend extra complexity on something, I'd think it would be better
spent on preserving the input order.

Second, some objects can be compared for equality and hashed, but not
sorted (Python's complex numbers come to mind). If one is going to worry
about subtraction so as to keep things general, it makes sense to also
avoid sorting as well, Sasha's slick algorithm notwithstanding.

Third, I propose that whatever the outcome of the sorting issue, unique
should have the same interface as the other structural array operations.
That is:

    unique(anarray, axis=0):
        ...

The default axis=0 is for compatibility with the other, somewhat similar
functions. Axis=None would return the flattened, uniquified data, axis=#
would uniquify the result along that axis.

Regards,

-tim

> My own version tried to capture all possible cases that the current
> unique captures.
>
> Sasha's version only works for numpy arrays and has a problem for
> arrays with all identical entries.
>
> David's version only works for numpy arrays of types that can be
> converted to float.
>
> I would once more propose to use my own version as given before:
>
>     def unique(arr, sort=True):
>         if hasattr(arr, 'flatten'):
>             tmp = arr.flatten()
>             tmp.sort()
>             idx = concatenate([True], tmp[1:] != tmp[:-1])
>             return tmp[idx]
>         else:  # for compatibility:
>             set = {}
>             for item in inseq:
>                 set[item] = None
>             if sort:
>                 return asarray(sorted(set.keys()))
>             else:
>                 return asarray(set.keys())
>
> Greetings,
> Norbert
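
A rough sketch of the axis-aware interface Tim proposes (the signature is
his; the 2-D body is an illustrative reconstruction, not an agreed design.
As he notes in his follow-up, slices can contain different numbers of
unique values, so for axis != None this sketch returns a list rather than
an array):

    import numpy as np

    def unique_axis(a, axis=0):
        # Illustrative 2-D sketch only.
        a = np.asarray(a)
        if axis is None:
            return np.unique(a.ravel())
        # Uniquify each 1-d slice taken along `axis`.
        if axis == 0:
            return [np.unique(a[:, j]) for j in range(a.shape[1])]
        else:
            return [np.unique(a[i, :]) for i in range(a.shape[0])]

    A = np.array([[1, 1, 2],
                  [3, 1, 2]])
    print(unique_axis(A, axis=None))  # [1 2 3]
    print(unique_axis(A, axis=0))     # [array([1, 3]), array([1]), array([2])]
    print(unique_axis(A, axis=1))     # [array([1, 2]), array([1, 2, 3])]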
From: Alan G I. <ai...@am...> - 2006-07-11 15:39:34

On Tue, 11 Jul 2006, Norbert Nemec apparently wrote:
>         else:  # for compatibility:
>             set = {}
>             for item in inseq:
>                 set[item] = None
>             if sort:
>                 return asarray(sorted(set.keys()))
>             else:
>                 return asarray(set.keys())

I'm currently in major caffeine deficit, but aside from backward
compatibility, how is this better than:

        else:  # for compatibility
            items = list(set(inseq))
            if sort:
                items.sort()
            return asarray(items)

Alan Isaac

PS: Actually, making a list of a set may already sort? No time to check
now...
PPS: For Python 2.3, need set = sets.Set
From: Norbert N. <Nor...@gm...> - 2006-07-11 15:01:07

unique1d is based on ediff1d, so it really calculates many differences
and compares those to 0.0

This is inefficient, even though this is hidden by the general
inefficiency of Python (It might be the reason for the two milliseconds,
though)

What is more: subtraction works only for numbers, while the various
proposed versions use only comparison, which works for any data type (as
long as it can be sorted)

My own version tried to capture all possible cases that the current
unique captures.

Sasha's version only works for numpy arrays and has a problem for arrays
with all identical entries.

David's version only works for numpy arrays of types that can be
converted to float.

I would once more propose to use my own version as given before:

    def unique(arr, sort=True):
        if hasattr(arr, 'flatten'):
            tmp = arr.flatten()
            tmp.sort()
            idx = concatenate([True], tmp[1:] != tmp[:-1])
            return tmp[idx]
        else:  # for compatibility:
            set = {}
            for item in inseq:
                set[item] = None
            if sort:
                return asarray(sorted(set.keys()))
            else:
                return asarray(set.keys())

Greetings,
Norbert
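
As posted, the snippet above has two small problems: concatenate needs a
single sequence argument (concatenate(([True], ...))), and the fallback
branch refers to inseq instead of arr. A corrected, runnable sketch (the
fixes and the rename of the shadowed builtin name set to seen are mine):

    from numpy import asarray, concatenate

    def unique(arr, sort=True):
        if hasattr(arr, 'flatten'):
            tmp = arr.flatten()
            tmp.sort()
            idx = concatenate(([True], tmp[1:] != tmp[:-1]))
            return tmp[idx]
        else:  # for compatibility with general sequences
            seen = {}
            for item in arr:
                seen[item] = None
            if sort:
                return asarray(sorted(seen.keys()))
            else:
                return asarray(list(seen.keys()))

    print(unique([3, 1, 2, 3, 1]))           # [1 2 3]
    print(unique(asarray([3, 1, 2, 3, 1])))  # [1 2 3]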
From: Stefan v. d. W. <st...@su...> - 2006-07-11 11:11:12

On Tue, Jul 11, 2006 at 12:37:23PM +0200, Pau Gargallo wrote:
> > Something's not quite right here. The argsort docstring states that:
> >
> >     argsort(a, axis=-1) return the indices into a of the sorted array
> >     along the given axis, so that take(a, result, axis) is the sorted
> >     array.
> >
> > But N.take(A, A.argsort()) breaks. Either this is a bug, or the
> > docstring needs to be updated.
> >
> > Cheers
> > Stéfan
>
> I think the docstring is wrong, because take doesn't do that.
> If you do N.take(A, A.argsort(1), 1), it doesn't break, but it doesn't
> sort A either.
>
> Take seems to pick entire columns, but take's docstring is missing.
>
> For the argsort docstring, it may be useful to indicate that if one does
>     >>> ind = indices(A.shape)
>     >>> ind[ax] = A.argsort(axis=ax)
> then A[ind] is the sorted array.

We can always adapt the documentation at
http://numeric.scipy.org/numpydoc/numpy-9.html#pgfId-36425
into a docstring. I'll file a ticket.

Cheers
Stéfan
From: Pau G. <pau...@gm...> - 2006-07-11 10:54:24

hi,

looking at the upcasting table at
http://www.scipy.org/Tentative_NumPy_Tutorial#head-4c1d53fe504adc97baf27b65513b4b97586a4fc5
I saw that int's are sometimes casted to uint's.

    In [3]: a = array([3],int16)
    In [5]: b = array([4],uint32)
    In [7]: a+b
    Out[7]: array([7], dtype=uint32)

is that intended?

pau
From: Ed S. <sch...@ft...> - 2006-07-11 10:38:33

John Hassler wrote:
> Ed Schofield <schofield <at> ftw.at> writes:
> > Hmmm ... it could be an ATLAS problem. What's your processor? I built
> > the SciPy 0.4.9 binaries against Pearu's ATLAS binaries for Pentium 2,
> > thinking that this would give maximum compatibility ...
> >
> > Or perhaps it's something else. Could someone with this problem please
> > post a backtrace?
>
> This computer is an AMD Athlon 1600+ running Windows XP.
>
> <snip>
>
> All of the versions of scipy using numpy crash with XP whenever I
> access any of the functions in "optimize" or "integrate" which (I
> assume) call the Fortran libraries.
>
> In the current version, running scipy.test() gives an "unhandled
> exception." Debug shows a pointer to:
>
>     020CA9C3  xorps xmm6,xmm6
>
> <snip>
>
> Some other information:
>
>     >>> scipy.__version__
>     '0.4.9'
>     >>> scipy.__numpy_version__
>     '0.9.8'
>     >>> scipy.show_numpy_config()
>     atlas_threads_info:  NOT AVAILABLE
>     blas_opt_info:
>         libraries = ['f77blas', 'cblas', 'atlas']
>         library_dirs = ['C:\\Libraries\\ATLAS_3.6.0_WIN_P4']
>         define_macros = [('ATLAS_INFO', '"\\"3.6.0\\""')]
>         language = c
>         include_dirs = ['C:\\Libraries\\ATLAS_3.6.0_WIN_P4']
>
> plus similar stuff. Probably the important thing is the "ATLAS ... P4"
> line.

Thanks, John -- this is helpful. (Hans too, thanks for your testing.)

This looks like a problem with the NumPy build. Travis, could you have
compiled the Win32 binaries accidentally against the P4/SSE2 ATLAS
library?

-- Ed
From: Pau G. <pau...@gm...> - 2006-07-11 10:37:25

On 7/11/06, Stefan van der Walt <st...@su...> wrote:
> On Tue, Jul 11, 2006 at 11:32:48AM +0200, Emanuele Olivetti wrote:
> > Hi,
> > I don't understand how to use argsort results. I have a 2D matrix and
> > I want to sort values in each row and obtain the index array of that
> > sorting. Argsort(1) is what I need, but the problem is how to use its
> > result in order to obtain a sorted matrix. Here is the simple example:
> >
> >     A = array([[2,3,1],[5,4,6]])
> >     indexes = a.argsort(1)
> >
> > now indexes is:
> >     array([[2, 0, 1],
> >            [1, 0, 2]])
> >
> > I'd like to apply indexes to A and obtain:
> >     array([[1, 2, 3],
> >            [4, 5, 6]])
> >
> > or better, I'm interested both in a subset of indexes, i.e.
> > indexes[:,1:], and the related values of A matrix.
> >
> > How can I do this? If I simply say A[indexes] I get an IndexError.
>
> Something's not quite right here. The argsort docstring states that:
>
>     argsort(a, axis=-1) return the indices into a of the sorted array
>     along the given axis, so that take(a, result, axis) is the sorted
>     array.
>
> But N.take(A, A.argsort()) breaks. Either this is a bug, or the
> docstring needs to be updated.
>
> Cheers
> Stéfan

I think the docstring is wrong, because take doesn't do that. If you do
N.take(A, A.argsort(1), 1), it doesn't break, but it doesn't sort A
either.

Take seems to pick entire columns, but take's docstring is missing.

For the argsort docstring, it may be useful to indicate that if one does

    >>> ind = indices(A.shape)
    >>> ind[ax] = A.argsort(axis=ax)

then A[ind] is the sorted array.

pau
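
A concrete check of the recipe Pau sketches (note: with NumPy's fancy
indexing the index arrays need to be passed as a tuple, A[tuple(ind)];
plain A[ind] only indexes the first axis, so the tuple() here is my
addition):

    import numpy as np

    A = np.array([[2, 3, 1],
                  [5, 4, 6]])
    ax = 1
    ind = np.indices(A.shape)
    ind[ax] = A.argsort(axis=ax)
    print(A[tuple(ind)])
    # [[1 2 3]
    #  [4 5 6]]
    assert (A[tuple(ind)] == np.sort(A, axis=ax)).all()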
From: Stefan v. d. W. <st...@su...> - 2006-07-11 10:20:23

On Tue, Jul 11, 2006 at 11:32:48AM +0200, Emanuele Olivetti wrote:
> Hi,
> I don't understand how to use argsort results. I have a 2D matrix and
> I want to sort values in each row and obtain the index array of that
> sorting. Argsort(1) is what I need, but the problem is how to use its
> result in order to obtain a sorted matrix. Here is the simple example:
>
>     A = array([[2,3,1],[5,4,6]])
>     indexes = a.argsort(1)
>
> now indexes is:
>     array([[2, 0, 1],
>            [1, 0, 2]])
>
> I'd like to apply indexes to A and obtain:
>     array([[1, 2, 3],
>            [4, 5, 6]])
>
> or better, I'm interested both in a subset of indexes, i.e.
> indexes[:,1:], and the related values of A matrix.
>
> How can I do this? If I simply say A[indexes] I get an IndexError.

Something's not quite right here. The argsort docstring states that:

    argsort(a, axis=-1) return the indices into a of the sorted array
    along the given axis, so that take(a, result, axis) is the sorted
    array.

But N.take(A, A.argsort()) breaks. Either this is a bug, or the docstring
needs to be updated.

Cheers
Stéfan
From: Emanuele O. <oli...@it...> - 2006-07-11 09:57:33

Wow. I have to study much more indexing. It works pretty well. Just to
help indexing newbies like me in using your advice:

    A[arange(A.shape[0])[:,newaxis], indexes]

Thanks a lot!

Emanuele

Pau Gargallo wrote:
> here goes a first try:
>
> A[arange(2)[:,newaxis],indexes]
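
Spelled out, with the broadcasting that makes it work (illustrative; the
array and shapes are taken from the example earlier in this thread):

    import numpy as np

    A = np.array([[2, 3, 1],
                  [5, 4, 6]])
    indexes = A.argsort(1)                        # column index per row
    rows = np.arange(A.shape[0])[:, np.newaxis]   # shape (2, 1), broadcasts to (2, 3)
    print(A[rows, indexes])
    # [[1 2 3]
    #  [4 5 6]]
    # A column subset of the argsort result works the same way:
    print(A[rows, indexes[:, 1:]])
    # [[2 3]
    #  [5 6]]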
From: Pau G. <pau...@gm...> - 2006-07-11 09:40:46

here goes a first try:

    A[arange(2)[:,newaxis], indexes]

pau

On 7/11/06, Emanuele Olivetti <oli...@it...> wrote:
> Hi,
> I don't understand how to use argsort results. I have a 2D matrix and
> I want to sort values in each row and obtain the index array of that
> sorting. Argsort(1) is what I need, but the problem is how to use its
> result in order to obtain a sorted matrix. Here is the simple example:
>
>     A = array([[2,3,1],[5,4,6]])
>     indexes = a.argsort(1)
>
> now indexes is:
>     array([[2, 0, 1],
>            [1, 0, 2]])
>
> I'd like to apply indexes to A and obtain:
>     array([[1, 2, 3],
>            [4, 5, 6]])
>
> or better, I'm interested both in a subset of indexes, i.e.
> indexes[:,1:], and the related values of A matrix.
>
> How can I do this? If I simply say A[indexes] I get an IndexError.
>
> Thanks in advance,
>
> Emanuele
>
> P.S. numpy.__version__ is '0.9.8'.