From: Sasha <nd...@ma...> - 2006-07-03 18:07:11
On 7/3/06, Alan G Isaac <ai...@am...> wrote:
> ....
> Consistency! That is exactly the issue,
> especially for those who wish to teach with numpy.
>
> I do not want to tell my students to write
>     ones([5,5])
> but
>     rand(5,5)
> and although relatively new to Python
> I actually like the practice of providing
> dimensions in a list or tuple.

Consistency is already lost because the 1d case allows both ones(5) and
ones([5]) (and even ones((5,)), if anyone can tolerate that abomination).
I don't think those who argue for sequence-only arguments are willing to
require ones([5]).

Remember, "A Foolish Consistency is the Hobgoblin of Little Minds" (Ralph
Waldo Emerson (1803-1882), adopted without attribution as a section
heading in PEP 8 <http://www.python.org/dev/peps/pep-0008>).

I think the current situation strikes the right balance between
convenience and consistency.
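For readers following the thread, a minimal sketch of the calling
conventions being compared, written against the modern numpy namespace
(the assertions are additions for illustration, not part of the posts):

    import numpy as np

    # The 1-d case already accepts three spellings of the same shape,
    # which is the inconsistency Sasha points out.
    a = np.ones(5)
    b = np.ones([5])
    c = np.ones((5,))
    assert a.shape == b.shape == c.shape == (5,)

    # In 2-d, ones() wants a sequence while rand() takes separate ints,
    # which is the inconsistency Alan points out.
    d = np.ones((5, 5))
    e = np.random.rand(5, 5)
    assert d.shape == e.shape == (5, 5)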
From: David H. <dav...@gm...> - 2006-07-03 17:00:38
Here is a quick benchmark between numpy's unique, unique1d and sasha's
unique:

    x = rand(100000)*100
    x = x.astype('i')

    %timeit unique(x)
    10 loops, best of 3: 525 ms per loop

    %timeit unique_sasha(x)
    100 loops, best of 3: 10.7 ms per loop

    %timeit unique1d(x)
    100 loops, best of 3: 12.6 ms per loop

So I wonder what is the added value of unique? Could unique1d simply
become unique?

Cheers,
David

P.S. I modified sasha's version to account for the case where all
elements are identical, which returned an empty array.

    def unique_sasha(x):
        s = sort(x)
        r = empty(s.shape, float)
        r[:-1] = s[1:]
        r[-1] = NaN
        return s[r != s]

2006/7/3, Robert Cimrman <cim...@nt...>:
>
> Sasha wrote:
> > On 7/2/06, Norbert Nemec <Nor...@gm...> wrote:
> >> ...
> >> Does anybody know about the internals of the python "set"? How is
> >> .keys() implemented? I somehow have really doubts about the efficiency
> >> of this method.
> >>
> > Set implementation (Objects/setobject.c) is a copy and paste job from
> > dictobject with values removed. As a result it is heavily optimized
> > for the case of string valued keys - a case that is almost irrelevant
> > for numpy.
> >
> > I think something like the following (untested, 1d only) will probably
> > be much faster and sorted:
> >
> > def unique(x):
> >     s = sort(x)
> >     r = empty_like(s)
> >     r[:-1] = s[1:]
> >     r[-1] = s[0]
> >     return s[r != s]
>
> There are 1d array set operations like this already in numpy
> (numpy/lib/arraysetops.py - unique1d, ...)
>
> r.
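For reference, a self-contained sketch of the sort-based routine being
benchmarked; the imports and the final comparison against numpy's
built-in are additions, and timings will of course vary by machine:

    import numpy as np

    def unique_sasha(x):
        # Sort, then keep each element that differs from its successor.
        # NaN in the last slot compares unequal to everything, so the
        # final element is always kept -- even when all elements are
        # identical (the case David patched).
        s = np.sort(x)
        r = np.empty(s.shape, float)
        r[:-1] = s[1:]
        r[-1] = np.nan
        return s[r != s]

    x = (np.random.rand(100000) * 100).astype('i')
    assert (unique_sasha(x) == np.unique(x)).all()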
From: Pierre GM <pgm...@ma...> - 2006-07-03 16:40:41
> I was also a bit surprised at the following behavior:
> >>> a = numpy.asarray([1,1])
> >>> a
> array([1, 1])
> >>> a[0]=numpy.nan
> >>> a
> array([0, 1])

Seems to affect only the int_ arrays:

>>> a = numpy.asarray([1,1], dtype=float_)
>>> a
array([ 1.,  1.])
>>> a[0]=numpy.nan
>>> a
array([ nan,   1.])
From: John C. <jn...@ec...> - 2006-07-03 15:17:51
Hi,

I'm getting errors when I try and build numpy from the svn. I've followed
the instructions on Installing SciPy/Windows at
http://www.scipy.org/Installing_SciPy/Windows?highlight=%28%28----%28-%2A%29%28%5Cr%29%3F%5Cn%29%28.%2A%29CategoryInstallation%5Cb%29

I've downloaded, built and tested Atlas, Lapack, etc.

My computer is set up as follows:
    MinGW 3.4.2
    Cygwin 3.4.4 (used for Atlas and Lapack)
    Win XP SP2

I've tried building using Cygwin instead of MinGW with similar results
(not using cygwin python). I also have VC++ 6 and VC++ Express installed,
but neither is on the path when I'm attempting to build numpy. I normally
have no problems building my own python extensions using numarray, numpy
or PIL. I use pyrex or home rolled code. These all work with 2.3 and 2.4.

I'd be grateful of any pointers as to what might be wrong.

Thanks in advance,
John

===================================================================
Using Python 2.3

D:.\numpy> setup.py config --compiler=mingwg2 build --compiler=mingw32 bdist_wininst
......
compile options: '-DNO_ATLAS_INFO=2
-Id:\work\Programming\numerical\numpy\numpy\core\include
-Ibuild\src.win32-2.3\numpy\core
-Id:\work\Programming\numerical\numpy\numpy\core\src
-Id:\work\Programming\numerical\numpy\numpy\core\include
-IC:\PYTHON23\include -IC:\PYTHON23\PC -c'
C:\MINGW\BIN\g77.exe -shared
build\temp.win32-2.3\Release\work\programming\numerical\numpy\numpy\linalg\lapack_litemodule.o
-Ld:\work\Programming\numerical\libs
-LC:/MINGW/BIN/../lib/gcc/mingw32/3.4.2 -LC:\PYTHON23\libs
-LC:\PYTHON23\PCBuild -llapack -llapack -lf77blas -lcblas -latlas
-lpython23 -lgcc -lg2c -o build\lib.win32-2.3\numpy\linalg\lapack_lite.pyd
C:/MINGW/BIN/../lib/gcc/mingw32/3.4.2/libgcc.a(__main.o)(.text+0x4f):
undefined reference to `__EH_FRAME_BEGIN__'
C:/MINGW/BIN/../lib/gcc/mingw32/3.4.2/libgcc.a(__main.o)(.text+0x73):
undefined reference to `__EH_FRAME_BEGIN__'
collect2: ld returned 1 exit status

===================================================================
Using Python 2.4

D:.\numpy> setup.py config --compiler=mingwg2 build --compiler=mingw32 bdist_wininst
Running from numpy source directory.
No module named __svn_version__
F2PY Version 2_2727
blas_opt_info:
blas_mkl_info:
  libraries mkl,vml,guide not found in C:\PYTHON24\lib
  libraries mkl,vml,guide not found in C:\
  libraries mkl,vml,guide not found in C:\PYTHON24\libs
  NOT AVAILABLE
atlas_blas_threads_info:
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
  FOUND:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['d:\\work\\Programming\\numerical\\libs']
    language = c
No module named msvccompiler in numpy.distutils, trying from distutils..
Traceback (most recent call last):
  File "D:\work\Programming\numerical\numpy\setup.py", line 84, in ?
    setup_package()
  File "D:\work\Programming\numerical\numpy\setup.py", line 77, in setup_package
    configuration=configuration )
  File "D:\work\Programming\numerical\numpy\numpy\distutils\core.py", line 144, in setup
    config = configuration()
  File "D:\work\Programming\numerical\numpy\setup.py", line 43, in configuration
    config.add_subpackage('numpy')
  File "D:\work\Programming\numerical\numpy\numpy\distutils\misc_util.py", line 740, in add_subpackage
    caller_level = 2)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\misc_util.py", line 723, in get_subpackage
    caller_level = caller_level + 1)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\misc_util.py", line 670, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File ".\numpy\setup.py", line 9, in configuration
    config.add_subpackage('core')
  File "D:\work\Programming\numerical\numpy\numpy\distutils\misc_util.py", line 740, in add_subpackage
    caller_level = 2)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\misc_util.py", line 723, in get_subpackage
    caller_level = caller_level + 1)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\misc_util.py", line 670, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "d:\work\Programming\numerical\numpy\numpy\core\setup.py", line 207, in configuration
    blas_info = get_info('blas_opt',0)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\system_info.py", line 256, in get_info
    return cl().get_info(notfound_action)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\system_info.py", line 397, in get_info
    self.calc_info()
  File "D:\work\Programming\numerical\numpy\numpy\distutils\system_info.py", line 1244, in calc_info
    atlas_version = get_atlas_version(**version_info)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\system_info.py", line 1085, in get_atlas_version
    library_dirs=config.get('library_dirs', []),
  File "D:\work\Programming\numerical\numpy\numpy\distutils\command\config.py", line 101, in get_output
    self._check_compiler()
  File "D:\work\Programming\numerical\numpy\numpy\distutils\command\config.py", line 34, in _check_compiler
    old_config._check_compiler(self)
  File "C:\PYTHON24\lib\distutils\command\config.py", line 107, in _check_compiler
    dry_run=self.dry_run, force=1)
  File "D:\work\Programming\numerical\numpy\numpy\distutils\ccompiler.py", line 333, in new_compiler
    compiler = klass(None, dry_run, force)
  File "C:\PYTHON24\lib\distutils\msvccompiler.py", line 211, in __init__
    self.__macros = MacroExpander(self.__version)
  File "C:\PYTHON24\lib\distutils\msvccompiler.py", line 112, in __init__
    self.load_macros(version)
  File "C:\PYTHON24\lib\distutils\msvccompiler.py", line 133, in load_macros
    raise DistutilsPlatformError, \
distutils.errors.DistutilsPlatformError: The .NET Framework SDK needs to
be installed before building extensions for Python.

D:.\numpy>

Dr. John N. Carter                  jn...@ec...
ISIS                                http://www.ecs.soton.ac.uk/~jnc/
From: Alan G I. <ai...@am...> - 2006-07-03 14:50:17
On Mon, 03 Jul 2006, Sven Schreiber apparently wrote:
> Anything that allows me to develop a consistent habit is fine with me!

Consistency! That is exactly the issue, especially for those who wish to
teach with numpy.

I do not want to tell my students to write
    ones([5,5])
but
    rand(5,5)
and although relatively new to Python I actually like the practice of
providing dimensions in a list or tuple.

But I care less about the choice of convention than about adherence to
the convention.

Cheers,
Alan Isaac
From: Tim H. <tim...@co...> - 2006-07-03 14:19:46
Travis Oliphant wrote:
> Hmmm..... One thing that bothers me is that it seems that those
> arguing *against* this behavior are relatively long-time users of Python
> while those arguing *for* it are from what I can tell somewhat new to
> Python/NumPy. I'm not sure what this means.
>
> Is the current behavior a *wart* you get used to or a clear *feature*
> providing some gain in programming efficiency.
>
> If new users complain about something, my tendency is to listen openly
> to the discussion and then discourage the implementation only if there
> is a clear reason to.
>
> With this one, I'm not so sure of the clear reason. I can see that
> "program parsers" would have a harder time with a "flexible" calling
> convention. But, in my calculus, user-interface benefit outweighs
> programmer benefit (all things being equal).
>
> It's easy as a programmer to get caught up in a very rigid system when
> many users want flexibility.
>
> I must confess that I don't like looking at ones((5,5)) either. I much
> prefer ones(5,5) or even ones([5,5]).

I'm not sure why more people don't write the second version. It's
significantly easier to read since the mixed brackets stand out better.
If one buys into Guido's description of what tuples and lists are for,
then it's also more appropriate, since a shape is a homogeneous,
variable-length sequence (as opposed to a fixed-length heterogeneous
collection, which would be more appropriate to represent using a tuple).
Not everyone buys into that, using tuples as essentially immutable
lists, but it's easier to read in any event. I also find:

    ones([5, 5], dt)

clearer than:

    ones(5, 5, dt)

or, more dramatic, consider:

    ones([dx, dy], dt)

versus

    ones(dx, dy, dt)

Brrrr!

A side note: one reason I'm big on specifying the dtype is that
according to Kahan (many papers available at
http://www.cs.berkeley.edu/~wkahan/) the single best thing you can do to
check that an implementation is numerically sane is to examine the
results at different precisions (say float32 and float64, since they're
commonly available) and verify that the results don't go off the rails.
Kahan is oft quoted by Tim Peters, who seems to have one of the better
grasps of the fiddly aspects of floating point in the Python community,
so I give his views a lot of weight. Since I try to program with this in
mind, at least for touchy numerical code, I end up parameterizing things
based on dtype anyway. Then it's relatively easy to check that things
are behaving simply by changing the dtype that is passed in.

> But perhaps what this shows is something I've heard Robert discuss
> before that has not received enough attention. NumPy really has at
> least two "users" 1) application developers and 2) prototype developers
> (the MATLAB crowd for lack of a better term).
>
> These two groups usually want different functionality (and in reality
> most of us probably fall in both groups on different occasions). The
> first clamors for more rigidity and conformity even at the expense of
> user interfaces. These people usually want
>
> 1) long_but_explanatory_function names
> 2) rigid calling syntax

One might also call it consistent calling syntax, but yes, I'd put
myself in this group, and having consistent calling syntax catches many
errors that would otherwise pass silently. It's also easier to figure
out what's going on when coming back to functions written in the distant
past if there is only one calling syntax.

> 3) strict name-space control

This, combined with the more explanatory names mentioned in 1, helps
make it easier to decipher code written in yesteryear.

> The second group which wants to get something prototyped and working
> quickly wants
>
> 1) short easy-to-remember names
> 2) flexible calling syntax
> 3) all-in-one name-space control
>
> My hypothesis is that when you first start with NumPy (especially with a
> MATLAB or IDL history) you seem to start out in group 2 and stay there
> for quick projects. Then, as code-size increases and applications get
> larger, you become more like group 1.
>
> I think both groups have valid points to make and both styles can be
> useful at one point or another. Perhaps, the solution is the one I have
> barely begun to recognize, that others of you have probably already seen.
>
> A) Make numpy a good "library" for group 1.
> B) Make a shallow "name-space" (like pylab) with the properties of group 2.
>
> Perhaps a good solution is to formalize the discussion and actually
> place in NumPy a name-space package much like Bill has done privately.

+1

-tim
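Tim's cross-precision check is easy to demonstrate. Here is a sketch in
which the variance formula and the data are illustrative assumptions,
not anything from the thread; running the same dtype-parameterized code
at two precisions exposes a numerically unsound formula:

    import numpy as np

    def naive_var(x, dtype):
        # Textbook one-pass variance, parameterized on dtype so the
        # same code runs at different precisions.
        x = np.asarray(x, dtype=dtype)
        return (x * x).mean() - x.mean() ** 2

    data = 1e4 + np.random.rand(10000)   # large offset stresses precision
    print(naive_var(data, np.float32))   # wildly off -- cancellation
    print(naive_var(data, np.float64))   # close to the true ~1/12

If the two precisions disagree badly, the implementation is suspect;
that is exactly Kahan's sanity check.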
From: David H. <dav...@gm...> - 2006-07-03 13:20:49
Hi Stephen,

I don't know much about image analysis, but in the scipy tutorial (7.2
Filtering), there is an example of an image filter that highlights the
edges of an image. If I guess correctly, it finds the smoothing spline
of the image and then computes the derivative of the spline. A high
derivative means that the image intensity shifts rapidly, so maybe that
could help you.

David

2006/6/30, stephen emslie <ste...@gm...>:
>
> I am in the process of implementing an image processing algorithm that
> requires following rays extending outwards from a starting point and
> calculating the intensity derivative at each point. The idea is to find
> the point where the difference in intensity goes beyond a particular
> threshold.
>
> Specifically I'm examining an image of an eye to find the pupil, and
> the edge of the pupil is a sharp change in intensity.
>
> How does one iterate along a line in a 2d matrix, and is there a better
> way to do this? Is this a problem that linear algebra can help with?
>
> Thanks
> Stephen Emslie
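As a concrete starting point for Stephen's question, here is a sketch of
stepping along a ray with nearest-neighbour sampling and stopping at the
first large intensity jump. The start point, angle, threshold, and the
synthetic "pupil" image are all illustrative assumptions:

    import numpy as np

    def first_edge_along_ray(img, y0, x0, angle, threshold, step=1.0):
        """Walk a ray from (y0, x0); return the first point where the
        intensity jump between successive samples exceeds threshold,
        or None if the ray leaves the image first."""
        dy, dx = np.sin(angle), np.cos(angle)
        prev = img[int(y0), int(x0)]
        t = step
        while True:
            y, x = y0 + t * dy, x0 + t * dx
            if not (0 <= int(y) < img.shape[0] and
                    0 <= int(x) < img.shape[1]):
                return None
            cur = img[int(y), int(x)]      # nearest-neighbour sampling
            if abs(float(cur) - float(prev)) > threshold:
                return y, x
            prev, t = cur, t + step

    # Synthetic test image: dark disc (pupil) on a bright field.
    yy, xx = np.mgrid[0:200, 0:200]
    img = np.where((yy - 100)**2 + (xx - 100)**2 < 40**2, 20, 200)
    print(first_edge_along_ray(img, 100, 100, angle=0.0, threshold=50))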
From: Albert S. <fu...@gm...> - 2006-07-03 12:59:11
Hello all

Travis Oliphant wrote:
> The ctypes-conversion object has attributes which return c_types aware
> objects so that the information can be passed directly to c-code (as an
> integer, the number of dimensions can already be passed using c-types).
>
> The information available and it's corresponding c_type is
>
> data - c_void_p
> shape, strides - c_int * nd or c_long * nd or c_longlong * nd
> depending on platform

Stefan and I did some more experiments and it seems like .ctypes.strides
isn't doing the right thing for subarrays. For example:

In [52]: x = N.rand(3,4)

In [57]: [x.ctypes.strides[i] for i in range(x.ndim)]
Out[57]: [32, 8]

This looks fine. But for this subarray:

In [56]: [x[1:3,1:4].ctypes.strides[i] for i in range(x.ndim)]
Out[56]: [32, 8]

In this case, I think one wants strides[0] (the row stride) to return 40.

.ctypes.data already seems to do the right thing:

In [60]: x.ctypes.data
Out[60]: c_void_p(31685288)

In [61]: x[1:3,1:4].ctypes.data
Out[61]: c_void_p(31685328)

In [62]: 31685328-31685288
Out[62]: 40

What would be a good way of dealing with discontiguous arrays? It seems
like one might want to disable their .ctypes attribute.

Regards,

Albert
From: Sasha <nd...@ma...> - 2006-07-03 12:54:55
On 7/3/06, Andrew Corrigan <aco...@gm...> wrote:
> ... Essentially I want to say something like:
> A[:,:,repeat(newaxis, B.ndim)]*B[newaxis,newaxis,...]
>
> How can I express what I mean, such that it actually works?

>>> A[(slice(None),)*2 + (newaxis,)*B.ndim]
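A quick check of the recipe; the shapes below are arbitrary examples:

    import numpy as np
    newaxis = np.newaxis

    A = np.random.rand(2, 3)
    B = np.random.rand(4, 5, 6)     # B.ndim could be anything

    # Build the index tuple programmatically: keep A's two axes, then
    # append one new axis per dimension of B so broadcasting lines up.
    A_expanded = A[(slice(None),) * 2 + (newaxis,) * B.ndim]
    result = A_expanded * B[newaxis, newaxis, ...]

    print(A_expanded.shape)   # (2, 3, 1, 1, 1)
    print(result.shape)       # (2, 3, 4, 5, 6)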
From: Andrew C. <aco...@gm...> - 2006-07-03 12:37:01
In a function I'm writing, I multiply two arrays together: A and B,
where A.ndim == 2 and I don't know B.ndim in advance.

If I knew B.ndim == 3, then I would write
    A[:,:,newaxis,newaxis,newaxis]*B[newaxis,newaxis,...]
or if I knew that B.ndim == 1, then I would write
    A[:,:,newaxis]*B[newaxis,newaxis,...]
but I don't know B.ndim. Essentially I want to say something like:
    A[:,:,repeat(newaxis, B.ndim)]*B[newaxis,newaxis,...]

How can I express what I mean, such that it actually works?
From: Stefan v. d. W. <st...@su...> - 2006-07-03 12:34:59
On Mon, Jul 03, 2006 at 11:16:26AM +0900, Bill Baxter wrote:
> What's the best way to combine say several 2-d arrays together into a grid?
> Here's the best I can see:
>
> >>> a = eye(2,2)
> >>> b = 2*a
> >>> c = 3*a
> >>> d = 4*a
> >>> r_[c_[a,b],c_[c,d]]
> array([[1, 0, 2, 0],
>        [0, 1, 0, 2],
>        [3, 0, 4, 0],
>        [0, 3, 0, 4]])
>
> In matlab you'd get the same effect by saying: [ a, b; c, d ]
>
> Compared to that, r_[c_[a,b],c_[c,d]] looks quite a mess.

You could always explicitly write out what you are doing, i.e.

In [47]: N.vstack((N.hstack((a,b)), N.hstack((c,d))))
Out[47]:
array([[ 1.,  0.,  2.,  0.],
       [ 0.,  1.,  0.,  2.],
       [ 3.,  0.,  4.,  0.],
       [ 0.,  3.,  0.,  4.]])

Stéfan
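The hstack/vstack pattern generalizes naturally. Here is a sketch of a
tiny helper; the name grid is made up for illustration (current numpy
also ships np.block for exactly this use case):

    import numpy as np

    def grid(rows):
        # rows is a nested list of 2-d blocks: [[a, b], [c, d]]
        return np.vstack([np.hstack(row) for row in rows])

    a = np.eye(2)
    print(grid([[a, 2 * a], [3 * a, 4 * a]]))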
From: Sven S. <sve...@gm...> - 2006-07-03 10:44:28
I agree with everything in your post, so I'm really happy you're a
central figure of numpy!

As for rand(), ones() etc.: I don't mind (too much) the double pair of
parentheses in ones((5,5)), but I find Robert's proposal in an earlier
thread ("If you want a function that takes tuples, use
numpy.random.random().") a little impractical, because I'm mostly one of
the prototypers as you defined them.

I've come to realize that for me (and from what I read, for Bill as
well) rand() is not adding convenience because we cannot settle on any
habit (neither single nor double pairs of parentheses) and therefore we
continue to get one or the other wrong when we're not paying full
attention.

Anything that allows me to develop a consistent habit is fine with me!

Thanks much,
Sven

Travis Oliphant schrieb:
> Hmmm..... One thing that bothers me is that it seems that those
> arguing *against* this behavior are relatively long-time users of Python
> while those arguing *for* it are from what I can tell somewhat new to
> Python/NumPy. I'm not sure what this means.
>
> <snip - Travis's full message appears later in this digest>
From: Sven S. <sve...@gm...> - 2006-07-03 09:53:59
Travis Oliphant schrieb:
>
> You can use a masked array specifically, or use nan's for missing
> values and just tell Python you want a floating-point array (because
> when it finds the None object it's guessing incorrectly that you want
> an "object" array):
>
> asarray(x, dtype=float)
>
> array([[  1.,  nan],
>        [  2.,   3.]])

Is there anything else besides None which is recognized/converted to
numpy.nan? Put differently, where can I find documentation about basic
nan definition and handling in numpy? (I have the numpy book which
covers isnan etc., when you have the nans already set up.)

I was also a bit surprised at the following behavior:

>>> a = numpy.asarray([1,1])
>>> a
array([1, 1])
>>> a[0]=numpy.nan
>>> a
array([0, 1])

Is this a bug or intended?

Thanks,
Sven
From: Simon B. <si...@ar...> - 2006-07-03 09:51:42
On Sun, 2 Jul 2006 16:36:14 -0700
"Webb Sprague" <web...@gm...> wrote:
>
> Given the long history of python and its ancestry in C (for which zero
> based indexing made lots of sense since it dovetailed with thinking in
> memory offsets in systems programming), there is probably nothing to
> be done now. I guess I just want to vent, but also to ask if anyone
> has found any way to deal with this issue in their own scientific
> programming.

The mathematician John Conway, in his book "The Book of Numbers", has a
brilliant discussion of just this issue. It does indeed make sense
mathematically.

Simon.
From: Robert C. <cim...@nt...> - 2006-07-03 09:35:17
Sasha wrote:
> On 7/2/06, Norbert Nemec <Nor...@gm...> wrote:
>> ...
>> Does anybody know about the internals of the python "set"? How is
>> .keys() implemented? I somehow have really doubts about the efficiency
>> of this method.
>>
> Set implementation (Objects/setobject.c) is a copy and paste job from
> dictobject with values removed. As a result it is heavily optimized
> for the case of string valued keys - a case that is almost irrelevant
> for numpy.
>
> I think something like the following (untested, 1d only) will probably
> be much faster and sorted:
>
> def unique(x):
>     s = sort(x)
>     r = empty_like(s)
>     r[:-1] = s[1:]
>     r[-1] = s[0]
>     return s[r != s]

There are 1d array set operations like this already in numpy
(numpy/lib/arraysetops.py - unique1d, ...)

r.
From: Sven S. <sve...@gm...> - 2006-07-03 08:55:47
Bill Baxter schrieb:
> Neat, didn't know about that. But, grr, always returns matrix
> regardless of argument types.
> --bb

Well Bill, maybe you should have stayed with matrices ;-)

But I also see no reason why it shouldn't be expected to work for
2d-arrays in general. (Or maybe even for more dimensions as well?)
Perhaps you should file this as a new ticket.

cheers,
sven
From: Albert S. <fu...@gm...> - 2006-07-03 08:11:19
Hello all

Travis Oliphant wrote:
> <snip>
> Unfortunately, from the source code this is not true. It would be an
> improvement, but the source code shows that the from_param of each type
> does something special and only works with particular kinds of
> data-types --- basic Python types or ctypes types. I did not see
> evidence that the _as_parameter_ method was called within any of the
> from_param methods of _ctypes.c

To summarise, I think we've come to the conclusion that one should avoid
argtypes when mixing NumPy with ctypes? (at least for now)

The extensions to .ctypes you propose below should make it possible to
use NumPy arrays with argtypes set. "Raw" C functions will probably be
wrapped by a Python function 99.9% of the time for error checking, etc.
This hides the need to call the .ctypes stuff from the user.

> <snip>
> > Maybe there should be a way to get a pointer to the NumPy array data
> > as a POINTER(c_double) if it is known that the array's dtype is
> > float64. Ditto for c_int/int32 and the others.
>
> I could see value in
>
> arr.ctypes.data_as()
> arr.ctypes.strides_as()
> arr.ctypes.shape_as()
>
> methods which allow returning the data as different kinds of c-types
> things instead of the defaults --- Perhaps we just make data, strides,
> and shapes methods with an optional argument.

Agreed. If you really like argtypes, arr.ctypes.data_as() would be
perfect for doing the necessary work to make sure ctypes accepts the
array.

arr.ctypes.data_as(c_type) could be implemented as

    ctypes.cast(x.ctypes.data, ctypes.POINTER(c_type))

c_void_p, c_char_p and c_wchar_p are special cases that aren't going to
work here, so maybe it should be

    ctypes.cast(x.ctypes.data, c_type)

in which case one would mostly call it as
arr.ctypes.data_as(POINTER(c_type)).

Regards,

Albert
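A sketch of that cast in action, using only ctypes calls that exist; the
array is an arbitrary example, and the data_as method proposed here did
end up in numpy:

    import ctypes
    import numpy as np

    x = np.arange(4, dtype=np.float64)

    # What data_as(POINTER(c_double)) does under the hood:
    ptr = ctypes.cast(x.ctypes.data, ctypes.POINTER(ctypes.c_double))
    print(ptr[0], ptr[3])     # 0.0 3.0 -- reads straight from x's buffer

    # The method itself, as shipped in current numpy:
    ptr2 = x.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
    print(ptr2[2])            # 2.0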
From: Travis O. <oli...@ie...> - 2006-07-03 07:51:10
Hmmm..... One thing that bothers me is that it seems that those arguing
*against* this behavior are relatively long-time users of Python while
those arguing *for* it are from what I can tell somewhat new to
Python/NumPy. I'm not sure what this means.

Is the current behavior a *wart* you get used to or a clear *feature*
providing some gain in programming efficiency.

If new users complain about something, my tendency is to listen openly
to the discussion and then discourage the implementation only if there
is a clear reason to.

With this one, I'm not so sure of the clear reason. I can see that
"program parsers" would have a harder time with a "flexible" calling
convention. But, in my calculus, user-interface benefit outweighs
programmer benefit (all things being equal).

It's easy as a programmer to get caught up in a very rigid system when
many users want flexibility.

I must confess that I don't like looking at ones((5,5)) either. I much
prefer ones(5,5) or even ones([5,5]).

But perhaps what this shows is something I've heard Robert discuss
before that has not received enough attention. NumPy really has at
least two "users": 1) application developers and 2) prototype developers
(the MATLAB crowd for lack of a better term).

These two groups usually want different functionality (and in reality
most of us probably fall in both groups on different occasions). The
first clamors for more rigidity and conformity even at the expense of
user interfaces. These people usually want

1) long_but_explanatory_function names
2) rigid calling syntax
3) strict name-space control

The second group which wants to get something prototyped and working
quickly wants

1) short easy-to-remember names
2) flexible calling syntax
3) all-in-one name-space control

My hypothesis is that when you first start with NumPy (especially with a
MATLAB or IDL history) you seem to start out in group 2 and stay there
for quick projects. Then, as code-size increases and applications get
larger, you become more like group 1.

I think both groups have valid points to make and both styles can be
useful at one point or another. Perhaps the solution is the one I have
barely begun to recognize, that others of you have probably already
seen.

A) Make numpy a good "library" for group 1.
B) Make a shallow "name-space" (like pylab) with the properties of
   group 2.

Perhaps a good solution is to formalize the discussion and actually
place in NumPy a name-space package much like Bill has done privately.

-Travis
From: Albert S. <fu...@gm...> - 2006-07-03 07:48:52
Hello all,

Travis Oliphant wrote:
> Hey Albert, I read the post you linked to on the ctypes mailing list.
> I hope I didn't step on any toes with what I did in NumPy. I was just

Certainly not. This is great stuff!

> working on a ctypes interface and realized that a lot of the cruft to
> convert to what ctypes was expecting could and should be handled in a
> default way. The conversion of the shapes and strides information to
> the "right-kind" of ctypes integer plus the inclusion of ctypes in
> Python 2.5 was enough to convince me to put some kind of hook into the
> array object. I decided to make the ctypes attribute return an object
> so that the object could grow in the future additional attributes
> and/or methods to make it easier to interface with ctypes.

Ah! So arr.ctypes.* is a collection of things that one typically needs
to pass to C functions to get them to do their work, i.e. a pointer to
data and some description of the data buffer (shape, strides, etc.).
Very nice.

> <snip>
> Basically, what you need is a type-map just like swig uses. But, now
> that ctypes is in Python, it will be slower to change. That's a bit
> unfortunate.

If we find the ctypes in Python 2.5 to be missing some features, maybe
Thomas Heller could release "ctypes2" to tide us over until Python 2.6.
But I think ctypes as it will appear in Python 2.5 is already excellent.

> But, ultimately, it works fine now. I don't know what is really gained
> by applying an argtypes to a function call anyway --- some kind of
> "type-checking". Is that supposed to be safer.

Yes, type-checking mostly. Some interesting things might happen when
you're passing structs by value. But hopefully it just works.

> For NumPy extension modules, type checking is only a small part of the
> memory-violation danger. In-correct array bounds and/or striding is
> far more common - not-to mention unaligned memory areas and/or
> unwriteable ones (like a read-only memory-mapped file).

Agreed.

> Thus, you're going to have to write a small "error-checking" code in
> Python anyway that calls out to the C-library with the right
> arguments. So, basically you write an extension module that calls
> c-code just as you did before, but now the entire "extension" module
> can all be in Python because the call to an arbitrary C-library is
> made using ctypes.

Exactly. And once you have your DLL/shared library, there is no need to
compile anything again. Another useful benefit on Windows is that you
can build your extension in debug mode without having to have a debug
build of Python. This is very useful.

> <snip>
> Frankly, I'm quite impressed with the ease of accessing C-code
> available using c-types. It quite rivals f2py in enjoyment using it.

Indeed. Viva ctypes!

> <snip>

Regards,

Albert
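A sketch of the all-in-Python "extension module" pattern described
above: a thin wrapper does the error checking before handing a raw
pointer to C. The shared library and its function are hypothetical
stand-ins, not a real API:

    import ctypes
    import numpy as np

    # Hypothetical shared library exporting:
    #     double sum_array(double *data, int n);
    _lib = ctypes.CDLL('./libexample.so')
    _lib.sum_array.restype = ctypes.c_double

    def sum_array(arr):
        # The "extension module" is pure Python: validate and fix up
        # the array here, so the C side only ever sees a clean,
        # contiguous float64 buffer.
        arr = np.ascontiguousarray(arr, dtype=np.float64)
        return _lib.sum_array(ctypes.c_void_p(arr.ctypes.data),
                              ctypes.c_int(arr.size))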
From: Bill B. <wb...@gm...> - 2006-07-03 07:26:28
Neat, didn't know about that. But, grr, always returns matrix regardless
of argument types.

--bb

On 7/3/06, Alan G Isaac <ai...@am...> wrote:
> On Mon, 3 Jul 2006, Bill Baxter apparently wrote:
> > What's the best way to combine say several 2-d arrays
> > together into a grid?
>
> >>> help(N.bmat)
> Help on function bmat in module numpy.core.defmatrix:
>
> bmat(obj, ldict=None, gdict=None)
>     Build a matrix object from string, nested sequence, or array.
>
>     Ex: F = bmat('A, B; C, D')
>         F = bmat([[A,B],[C,D]])
>         F = bmat(r_[c_[A,B],c_[C,D]])
>
>     all produce the same Matrix Object [ A B ]
>                                        [ C D ]
>
>     if A, B, C, and D are appropriately shaped 2-d arrays.
>
> hth,
> Alan Isaac
From: Bill B. <wb...@gm...> - 2006-07-03 06:59:56
On 7/3/06, Alan G Isaac <ai...@am...> wrote:
>
> Your primary argument against changing the API, as far as
> I can see, is that allowing *both* the extant behavior and
> the numpy consistent behavior will result in confusing code.
> http://aspn.activestate.com/ASPN/Mail/Message/numpy-discussion/3150643
> Is this a knock-down argument? I think not.

In particular the argument was that it would make for code that's
confusing for users to read. I.e. in some places the users see
'rand(2,2,3)' and in other places 'rand((2,2,3))'. I really don't see
anything confusing about that. There's only one interpretation of either
of those that makes sense. If there's any confusion issue, I think it's
more likely to come from looking at the source code of rand() itself.
But even that is pretty minor given a few minutes to write some comments
about what's going on.

Personally I think allowing the separate argument variations on these
few methods would be a good thing. It makes ones() and zeros() more like
Matlab's, for one. But also it just looks cleaner to say ones(2,3,5)
than it does to say ones((2,3,5)).

I understand the general objections to it:
-- It's kind of hacky with the *args, **kwargs;
-- it leads to "more than one way to do it";
-- it makes the implementation code a little harder to write and read
   (but I say it makes user code EASIER to write/read);
-- it can make IDEs confused (PyCrust at least suggests *args, **kwargs
   as arguments); and
-- it isn't always possible to have both tuple and separate shape values
   if the arg after the shape arguments is also a number, like
   func(shape, num=0).

But in this case, since these are functions that are both really simple
and get a lot of use, I think it's worth it to make them easier to use,
even if it uglifies the implementation.

At this point, since I've already got an implementation of this which
works great for everything I want it to do, I'm not really going to be
affected by whatever numpy decides to go with. I'll just wrap numpy with
my functions. Making my own wrapper layer for my favorite numpy
customizations was something I'd been meaning to do anyway. But I do
think this is a change that would make numpy more user friendly. And as
Alan points out, it seems to be a repeat discussion. I suspect that's
because they are all functions newbies will encounter early on when
they're trying to understand the logic behind numpy.
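A sketch of the kind of wrapper layer Bill describes; this is a common
recipe, not his actual code. It accepts either separate integers or a
single shape sequence and forwards to numpy:

    import numpy as np

    def ones(*args, **kwargs):
        # ones(2, 3) and ones((2, 3)) both work; a single argument is
        # passed through to numpy unchanged.
        if len(args) > 1 and all(isinstance(a, int) for a in args):
            return np.ones(args, **kwargs)
        return np.ones(*args, **kwargs)

    assert ones(2, 3).shape == ones((2, 3)).shape == (2, 3)
    assert ones(5).shape == (5,)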
From: Travis O. <oli...@ie...> - 2006-07-03 06:14:09
Albert Strasheim wrote:
> I did a few tests and this seems to work nicely:
>
> In [133]: printf = ctypes.cdll.msvcrt.printf
>
> In [134]: printf.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
>
> In [135]: x = N.array([1,2,3])
>
> In [136]: printf('%p\n', x.ctypes.data)
> 01CC8AC0
> Out[136]: 9
>
> In [137]: hex(x.__array_interface__['data'][0])
> Out[137]: '0x1cc8ac0'
>
> It would be nice if we could get the _as_parameter_ magic to work as
> well. See this thread:
>
> http://aspn.activestate.com/ASPN/Mail/Message/ctypes-users/3122558
>
> If I understood Thomas correctly, in the presence of argtypes and an
> instance, say x, with _as_parameter_, the following is done to convert
> the instance to something that the function accepts as its nth
> argument:
>
> func.argtypes[n].from_param(x._as_parameter_)

Unfortunately, from the source code this is not true. It would be an
improvement, but the source code shows that the from_param of each type
does something special and only works with particular kinds of
data-types --- basic Python types or ctypes types. I did not see
evidence that the _as_parameter_ method was called within any of the
from_param methods of _ctypes.c

> However, if I try passing x directly to printf, I get this:
>
> In [147]: printf('%p\n', x)
> ...
> ArgumentError: argument 2: exceptions.TypeError: wrong type
>
> However, this much works:
>
> In [148]: ctypes.c_void_p.from_param(x._as_parameter_)
> Out[148]: <cparam 'P' (01cc8ac0)>
>
> So I don't understand why the conversion isn't happening automatically.

Despite any advertisement, the code is just not there in ctypes to do it
when argtypes are present. Dealing with non-ctypes data is apparently
not handled when argtypes are present. Get rid of the argtypes setting
and it will work (because then the _as_parameter_ method is called....)

> Another quirk I noticed is that non-void pointers' from_param can't
> seem to be used with ints. For example:

Yeah, from the code it looks like each from_param method has its own
implementation that expects its own set of "acceptable" things. There
does not seem to be any way for an object to inform it appropriately.

> I don't think this is too much of an issue though -- you could wrap all
> your functions to take c_void_ps. If you happen to pass an int32 NumPy
> array to a function expecting a double*, you might run into problems
> though.

Yeah, but you were going to run into trouble anyway. I don't really see
a lot of "value-added" in the current type-checking c-types provides and
would just ignore it at this point. Build a Python function that calls
out to the c-function.

> Maybe there should be a way to get a pointer to the NumPy array data as
> a POINTER(c_double) if it is known that the array's dtype is float64.
> Ditto for c_int/int32 and the others.

I could see value in

    arr.ctypes.data_as()
    arr.ctypes.strides_as()
    arr.ctypes.shape_as()

methods which allow returning the data as different kinds of c-types
things instead of the defaults --- Perhaps we just make data, strides,
and shapes methods with an optional argument.

-Travis
From: Pierre GM <pgm...@ma...> - 2006-07-03 06:02:13
Keith,

> > Is there something better than None to represent missing values so
> > that when I convert to numpy arrays (actually matrices) I'll be all
> > set? (I could use -99, but that would be even more embarrassing than
> > my python skills.)

As Tim suggested, have a look at the masked array module. However, the
result will NOT be exportable to matrices, unless you fill the missing
value first (for example, with -99 ;)). I use MaskedArrays a lot,
they're quite flexible.

An alternative would be to use nan instead of None:

>>> import numpy as N
>>> x = [[1, N.nan], [2, 3]]
>>> print N.matrix(x)
[[  1.  nan]
 [  2.   3.]]

Of course, the solution will depend on what you need...
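A sketch of the masked-array route for this exact case, assuming None
marks the missing entries (written against the current numpy.ma module):

    import numpy as np
    import numpy.ma as ma

    raw = [[1, None], [2, 3]]

    # Replace None with nan, then mask the nans.
    arr = np.array([[np.nan if v is None else v for v in row]
                    for row in raw], dtype=float)
    m = ma.masked_invalid(arr)

    print(m)              # [[1.0 --] [2.0 3.0]]
    print(m.filled(-99))  # fill before exporting to a matrix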
From: Travis O. <oli...@ie...> - 2006-07-03 05:55:16
Keith Goodman wrote:
> I have a list x
>
> >>> x
> [[1, None], [2, 3]]
>
> that I generate outside of numpy (with plain python). What is the best
> way to convert x into an array? This doesn't work
>
> >>> asarray(x)
> array([[1, None],
>        [2, 3]], dtype=object)  <-- I'm hoping for something like
>                                    dtype=float64
>
> Is there something better than None to represent missing values so
> that when I convert to numpy arrays (actually matrices) I'll be all
> set? (I could use -99, but that would be even more embarrassing than
> my python skills.)

You can use a masked array specifically, or use nan's for missing values
and just tell Python you want a floating-point array (because when it
finds the None object it's guessing incorrectly that you want an
"object" array):

asarray(x, dtype=float)

array([[  1.,  nan],
       [  2.,   3.]])

-Travis
From: Travis O. <oli...@ie...> - 2006-07-03 05:50:59
Albert Strasheim wrote:
> I did a few tests and this seems to work nicely:

Hey Albert, I read the post you linked to on the ctypes mailing list. I
hope I didn't step on any toes with what I did in NumPy. I was just
working on a ctypes interface and realized that a lot of the cruft to
convert to what ctypes was expecting could and should be handled in a
default way. The conversion of the shapes and strides information to the
"right-kind" of ctypes integer plus the inclusion of ctypes in Python
2.5 was enough to convince me to put some kind of hook into the array
object. I decided to make the ctypes attribute return an object so that
the object could grow in the future additional attributes and/or methods
to make it easier to interface with ctypes.

I looked a bit at the source code and was disappointed to see that the
_as_parameter_ approach is pretty limited. While there is talk of
supporting a tuple return of _as_parameter_ in the source code comments,
there is no evidence in the source itself of supporting it. There is
also the changed way of handling additional arguments when argtypes is
set on the function, which uses the from_param method. Unfortunately, as
Thomas responds to your post, the from_param method must be on one of
the ctypes to work. You have to add support specifically for one of the
c-data types. I think the _as_parameter_ approach returning a tuple that
could be interpreted as the right ctype was better because it let other
objects play the ctypes game.

Basically, what you need is a type-map just like swig uses. But, now
that ctypes is in Python, it will be slower to change. That's a bit
unfortunate.

But, ultimately, it works fine now. I don't know what is really gained
by applying an argtypes to a function call anyway --- some kind of
"type-checking". Is that supposed to be safer?

For NumPy extension modules, type checking is only a small part of the
memory-violation danger. Incorrect array bounds and/or striding is far
more common --- not to mention unaligned memory areas and/or unwriteable
ones (like a read-only memory-mapped file).

Thus, you're going to have to write a small "error-checking" code in
Python anyway that calls out to the C-library with the right arguments.
So, basically you write an extension module that calls c-code just as
you did before, but now the entire "extension" module can all be in
Python because the call to an arbitrary C-library is made using ctypes.

For arrays, you will typically need to pass one or more of the data, the
dimension information, the stride information, and the number of
dimensions. The data-type will be known about because function calls
usually handle only a specific data-type. Thus, I started with a ctypes
object that produces this needed data in the format that ctypes needs,
so it can be very easy to use an array with the ctypes module.

Frankly, I'm quite impressed with the ease of accessing C-code available
using c-types. It quite rivals f2py in enjoyment using it. One thing I
like about c-types over Pyrex, for example, is that it lets you separate
the C-code from the Python code instead of "mixing it all together". I
wouldn't be surprised if c-types becomes the dominant way to interface
C/C++ and possibly even Fortran code (but it has tougher competition in
f2py) once it grows up a little with additional ease-of-use.

-Travis