Archived messages per month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 2000 | 8 | 49 | 48 | 28 | 37 | 28 | 16 | 16 | 44 | 61 | 31 | 24 |
| 2001 | 56 | 54 | 41 | 71 | 48 | 32 | 53 | 91 | 56 | 33 | 81 | 54 |
| 2002 | 72 | 37 | 126 | 62 | 34 | 124 | 36 | 34 | 60 | 37 | 23 | 104 |
| 2003 | 110 | 73 | 42 | 8 | 76 | 14 | 52 | 26 | 108 | 82 | 89 | 94 |
| 2004 | 117 | 86 | 75 | 55 | 75 | 160 | 152 | 86 | 75 | 134 | 62 | 60 |
| 2005 | 187 | 318 | 296 | 205 | 84 | 63 | 122 | 59 | 66 | 148 | 120 | 70 |
| 2006 | 460 | 683 | 589 | 559 | 445 | 712 | 815 | 663 | 559 | 930 | 373 | |
From: <sa...@hy...> - 2001-12-04 18:22:10
My digging shows that most stuff is passed on to Numeric and then on to multiarray, which is a .pyd. At that point, I gave up trying to figure things out.

On 4 Dec 01, at 11:05, Travis Oliphant wrote:

> However, I'm not sure what is caused by MA and what is caused by basic
> Numeric.
>
> -Travis
From: Travis O. <oli...@ee...> - 2001-12-04 18:04:17
> I am trying to use objects in an array, and still be able to use the
> various extra functions offered by multiarray. I am finding that some
> of the functions work and some don't. Is it hopeless to try to use
> objects in an array and expect <op>.reduce and others to work
> properly?
>
> As a simple example, I have a DataPoint object that consists of a
> value and flag(s). This object has all the __cmp__, __add__, etc.
> functions implemented.
>
> I can do MA.average(m), MA.sum(m), and MA.add.reduce(m) (they
> seem to use __add__), but I can't do MA.minimum.reduce(m) or
> MA.maximum.reduce(m).

This is actually very helpful information. I'm not sure how well-tested the object type is with Numeric. I know it has been used, but I'm not sure all of the ufunc methods have been well tested. However, I'm not sure what is caused by MA and what is caused by basic Numeric.

-Travis
From: <sa...@hy...> - 2001-12-04 17:27:41
I am trying to use objects in an array, and still be able to use the various extra functions offered by multiarray. I am finding that some of the functions work and some don't. Is it hopeless to try to use objects in an array and expect <op>.reduce and others to work properly?

As a simple example, I have a DataPoint object that consists of a value and flag(s). This object has all the __cmp__, __add__, etc. functions implemented.

I can do MA.average(m), MA.sum(m), and MA.add.reduce(m) (they seem to use __add__), but I can't do MA.minimum.reduce(m) or MA.maximum.reduce(m).

I can do MA.maximum(m) and MA.minimum(m), but not MA.maximum(m, 0) or MA.minimum(m, 0).

The values returned by MA.argmax(m) make no sense (wrong index?) but are consistent with results from argsort(). MA.argmin(m) gives an error (I have a __neg__ fn in DataPoint):

  File "C:\Python21\MA\MA.py", line 1977, in argmin
    return Numeric.argmin(d, axis)
  File "C:\Python21\Numeric\Numeric.py", line 281, in argmin
    a = -array(a, copy=0)
TypeError: bad operand type for unary -

For example:

print m                        # 3 valid, 1 masked
print MA.maximum(m)
print MA.argmax(m)             # gives index to masked value
print MA.minimum(m)
#print MA.argmin(m)            - gives error above
print MA.argsort(m)
print MA.average(m)
print MA.maximum.reduce(m, 0)

[1, ] ,[10, a] ,[None, D] ,-- ,]
[10, a]
3
[None, D]
[2,0,1,3,]
[3.66666666667, aD]
Traceback (most recent call last):
  File "C:\Python21\Pythonwin\pywin\framework\scriptutils.py", line 301, in RunScript
    exec codeObject in __main__.__dict__
  File "C:\Python21\HDP\Data\DataPoint.py", line 136, in ?
    print MA.maximum.reduce(m, 0)
  File "C:\Python21\MA\MA.py", line 1913, in reduce
    t = Numeric.maximum.reduce(filled(target, maximum_fill_value(target)), axis)
TypeError: function not supported for these types, and can't coerce to supported types
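As a rough sketch of the kind of object being described — the DataPoint class below is hypothetical, not the poster's code, and the snippet uses modern NumPy rather than the 2001-era Numeric/MA modules — reductions over an object array fall back on the element type's Python special methods, which is why defining __add__ makes additive reductions work while missing or unsupported hooks break other ufunc methods:

```python
import numpy as np  # standing in for the old Numeric module

class DataPoint:
    """Hypothetical value-plus-flag object, loosely following the post."""
    def __init__(self, value, flag=""):
        self.value = value
        self.flag = flag

    def __add__(self, other):
        o = other.value if isinstance(other, DataPoint) else other
        return DataPoint(self.value + o, self.flag)
    __radd__ = __add__

    def __neg__(self):
        return DataPoint(-self.value, self.flag)

    def __lt__(self, other):
        o = other.value if isinstance(other, DataPoint) else other
        return self.value < o

    def __repr__(self):
        return "DataPoint(%r, %r)" % (self.value, self.flag)

# An object array routes ufunc work through these Python hooks.
m = np.array([DataPoint(1), DataPoint(10, "a"), DataPoint(3, "D")],
             dtype=object)
print(np.add.reduce(m))   # works: uses __add__
print(-m)                 # works: calls __neg__ element-wise
```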
From: Pearu P. <pe...@ce...> - 2001-12-04 14:16:47
F2PY - Fortran to Python Interface Generator

I am pleased to announce the third public release of f2py (2nd Edition) (version 2.3.328):

    http://cens.ioc.ee/projects/f2py2e/

f2py is a command line tool for binding Python and Fortran codes. It scans Fortran 77/90/95 codes and generates a Python C/API module that makes it possible to call Fortran subroutines from Python. No Fortran or C expertise is required for using this tool.

Features include:

*** All basic Fortran types are supported:
      integer[ | *1 | *2 | *4 | *8 ], logical[ | *1 | *2 | *4 | *8 ],
      character[ | *(*) | *1 | *2 | *3 | ... ],
      real[ | *4 | *8 | *16 ], double precision,
      complex[ | *8 | *16 | *32 ]

*** Multi-dimensional arrays of (almost) all basic types. Dimension specifications:
      <dim> | <start>:<end> | * | :

*** Supported attributes and statements:
      intent([ in | inout | out | hide | in,out | inout,out ])
      dimension(<dimspec>)
      depend([<names>])
      check([<C-booleanexpr>])
      note(<LaTeX text>)
      optional, required, external
    NEW: intent(c), threadsafe, fortranname

*** Calling Fortran 77/90/95 subroutines and functions. Also Fortran 90/95 module subroutines are supported. Internal initialization of optional arguments.

*** Accessing COMMON blocks from Python.
    NEW: Accessing Fortran 90/95 module data.

*** Call-back functions: calling Python functions from Fortran with very flexible hooks.

*** In Python, arguments of the interfaced functions may be of different type -- necessary type conversions are done internally at the C level.

*** Automatically generates documentation (__doc__, LaTeX) for interfaced functions.

*** Automatically generates signature files --- user has full control over the interface constructions. Automatically detects the signatures of call-back functions, solves argument dependencies, etc.
    NEW: Automatically generates setup_<modulename>.py for building extension modules using tools from distutils and the fortran_support module (from SciPy).

*** Automatically generates a Makefile for compiling Fortran and C codes and linking them to a shared module. Many compilers are supported: gcc, Compaq Fortran, VAST/f90 Fortran, Absoft F77/F90, MIPSpro 7 Compilers, etc. Platforms: Intel/Alpha Linux, HP-UX, IRIX64.

*** Complete User's Guide in various formats (html, ps, pdf, dvi).

*** f2py users list is available for support, feedback, etc.
    NEW: Installation with distutils.

*** And finally, many bugs were fixed.

For more information about f2py, see http://cens.ioc.ee/projects/f2py2e/

LICENSE: f2py is released under the LGPL.

Sincerely,
Pearu Peterson <pe...@ce...>
December 4, 2001

<P><A HREF="http://cens.ioc.ee/projects/f2py2e/">f2py 2.3.328</A> - The Fortran to Python Interface Generator (04-Dec-01)
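To make the workflow concrete, here is a minimal end-to-end sketch (not from the announcement): the Fortran routine, file name, and module name are invented for illustration, the exact f2py command-line options can vary between releases, and it assumes f2py plus a Fortran compiler are installed.

```python
import subprocess

# A toy Fortran 77 routine; the cf2py directives mirror the intent()/depend()
# attributes listed in the announcement (illustrative code, not from the post).
fortran_src = """\
      subroutine fib(a, n)
      integer n
      real*8 a(n)
cf2py intent(in) n
cf2py intent(out) a
cf2py depend(n) a
      integer i
      do i = 1, n
         if (i .le. 2) then
            a(i) = 1.0d0
         else
            a(i) = a(i-1) + a(i-2)
         end if
      end do
      end
"""

with open("fib.f", "w") as f:
    f.write(fortran_src)

# Generate and compile the wrapper module (flags may vary by f2py version).
subprocess.check_call(["f2py", "-c", "fib.f", "-m", "fib"])

import fib                  # the generated extension module
print(fib.fib.__doc__)      # f2py writes a docstring for each wrapper
print(fib.fib(8))           # intent(out) arguments are returned to Python
```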
From: Gerard V. <gve...@la...> - 2001-12-04 08:10:21
Hi,

Could somebody tell me if the Numeric executable installer for Windows is made with distutils?

Thanks
-- Gerard
From: Travis O. <oli...@ee...> - 2001-12-03 20:35:39
> Is there some equivalent of limits.h for Numeric?
>
> And is there a searchable archive for the numpy-discussion list?? IIRC,
> the standard sourceforge lists aren't searchable (?!), which is bizarre
> if true.

Try limits.py in SciPy (www.scipy.org)

-Travis
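A modern aside, not part of the 2001 thread: today the same machine-limit information is exposed directly by NumPy, so no separate limits module is needed.

```python
import numpy as np

# Machine limits for floating-point and integer types:
print(np.finfo(np.float64).eps)                    # relative spacing at 1.0
print(np.finfo(np.float64).max)                    # largest finite float64
print(np.iinfo(np.int32).min, np.iinfo(np.int32).max)
```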
From: John J. L. <jj...@po...> - 2001-12-03 20:01:20
Is there some equivalent of limits.h for Numeric?

And is there a searchable archive for the numpy-discussion list?? IIRC, the standard sourceforge lists aren't searchable (?!), which is bizarre if true.

John
From: Rob <eu...@ho...> - 2001-12-01 01:34:53
It's now at www.pythonemproject.com. I can be reached at ro...@py.... All this has come about since @home is possibly suspending operation at midnight tonight :(

Rob. Looks like I need to change my sig too :)

--
The Numeric Python EM Project
www.members.home.net/europax
From: Konrad H. <hi...@cn...> - 2001-11-30 20:16:14
Mike Romberg <ro...@fs...> writes:

> I'm wondering if there is some good reason why equal(), not_equal(),
> nonzero() and the like do not work with numeric arrays of type
> complex. I can see why operators like less() and less_equal() do not
> work. But the pure equality ones seem like they should work. Or am I
> missing something :).

Before Python 2.1, comparison couldn't be implemented for equality only.

Konrad.
--
Konrad Hinsen                             | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS)  | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                        | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                     | Deutsch/Esperanto/English/
France                                    | Nederlands/Francais
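For readers who want to see the behaviour being asked about, the semantics are easy to demonstrate; the snippet below uses modern NumPy rather than the original Numeric module, so it illustrates the intended element-wise equality semantics rather than the 2001 behaviour.

```python
import numpy as np

z = np.array([1 + 2j, 3 + 0j, 1 + 2j])

# Element-wise equality and inequality are well defined for complex values.
print(np.equal(z, 1 + 2j))    # [ True False  True]
print(np.not_equal(z, 3))     # [ True False  True]
print(np.nonzero(z)[0])       # indices of nonzero elements: [0 1 2]
```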
From: Mike R. <ro...@fs...> - 2001-11-30 19:29:55
I'm wondering if there is some good reason why equal(), not_equal(), nonzero() and the like do not work with numeric arrays of type complex. I can see why operators like less() and less_equal() do not work. But the pure equality ones seem like they should work. Or am I missing something :).

Thanks,

Mike Romberg (ro...@fs...)
From: <sa...@hy...> - 2001-11-29 23:20:38
Paul,

Well, you're right. I did misunderstand your reply, as well as what the various functions were supposed to do. I was misusing sum, minimum, and maximum as though they were MA.<op>.reduce, and my test case didn't point out the difference. I should always have been doing the .reduce version. I apologize for this!

I found a section on page 45 of the Numerical Python text (PDF form, July 13, 2001) that defines sum as 'The sum function is a synonym for the reduce method of the add ufunc. It returns the sum of all the elements in the sequence given along the specified axis (first axis by default).' This is where I would expect to see a caveat about it not retaining any mask-edness.

I was misusing MA.minimum and MA.maximum as though they were the .reduce versions. My bad.

MA.average does produce a masked array, but it has changed the 'missing value' to fill_value=[ 1.00000002e+020,]. I do find this a bit odd, since the other reductions didn't change the fill value.

Anyway, I can now get the stats I want in a format I want, and I understand better the various functions for array/masked array. Thanks for the comments/input.

sue
From: Paul F. D. <pa...@pf...> - 2001-11-29 20:53:51
You have misread my reply. It is not true that MA.op works one way and MA.op.reduce is different. sum and add.reduce are different, and the documentation for sum DOES say the right thing for sum. The function sum is a special case in that its native meaning was the same as add.reduce, and so the function is redundant.

I believe you are in error wrt average; average works the way you want.

Function count can tell you the number of non-masked values, either in the whole array or axis-wise if you give an axis argument. Function size gives you the total number, so #invalid is size(x) - count(x).

maximum and minimum (don't use max and min, they are built-ins that don't know about Numeric) have two forms. When called with one argument they return the overall max or min of the whole array, returning masked only if all entries are masked. For two arguments, you get element-wise extrema, and the mask is on where any one of the arguments was masked.

>>> print x
[[1 ,-- ,3 ,]
 [11 ,-- ,-- ,]]
>>> print average(x)
[6.0 ,-- ,3.0 ,]
>>> y
array(
 [[ 6, 7, 8,]
  [ 9,10,11,]])
>>> print maximum(x,y)
[[6 ,-- ,8 ,]
 [11 ,-- ,-- ,]]
>>> y[0,0]=masked
>>> print maximum(x,y)
[[-- ,-- ,8 ,]
 [11 ,-- ,-- ,]]

-----Original Message-----
From: num...@li... [mailto:num...@li...] On Behalf Of Sue Giller
Sent: Thursday, November 29, 2001 9:50 AM
To: num...@li...
Subject: [Numpy-discussion] Re: Using Reduce with Multi-dimensional Masked array

Thanks for the pointer. The example I gave using the sum operation is merely an example - I could also be doing other manipulations such as min, max, average, etc.

I see that the MA.<op>.reduce functions will do what I want, but to do an average, I will need to do two steps since the MA.average function will have the original 'unexpected' behavior that I don't want.

That raises the question of how to determine a count of valid values in a masked array. Can I assume that I can do 'math' on the mask array itself, for example to sum along a given axis and have the masked cells add up?

In my original example, I would expect a sum along the second axis to return [0,0,0,2,0]. Can I rely on this? I would suggest that a .count operator would be very useful in working with masked arrays (count valid and count masked).

>>> m = MA.masked_values(a, -99)
>>> m
array(data =
 [[  1,  2,  3,-99,  5,]
  [ 10, 20, 30,-99, 50,]],
mask =
 [[0,0,0,1,0,]
  [0,0,0,1,0,]],
fill_value=-99)

To add an opinion on the question from Paul about 'expected' behavior, I was working off the documentation for Numerical Python, and there were no caveats in there about MA.<op> working one way, and MA.<op>.reduce working another. The answer is always in the documentation, especially for users like me who don't have time or knowledge to go reading through all the code modules to try and figure out what is happening.

From a purely user standpoint, I would expect a masked array to retain its mask-edness at all times, unless I explicitly tell it not to. In that case, I would still expect it to replace the 'masked' cells with the original masked value, and not just arbitrarily assign some other value, such as 0.

Thanks again for the prompt reply.
From: Reggie D. <re...@me...> - 2001-11-29 18:35:12
> That raises the question of how to determine a count of valid values
> in a masked array. Can I assume that I can do 'math' on the mask
> array itself, for example to sum along a given axis and have the
> masked cells add up?
>
> In my original example, I would expect a sum along the second axis
> to return [0,0,0,2,0]. Can I rely on this? I would suggest that a
> .count operator would be very useful in working with masked arrays
> (count valid and count masked).

Actually masked arrays already have a count method that does what you want:

Python 2.2b2 (#26, Nov 16 2001, 11:44:11) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pydoc import help
>>> import MA
>>> x = MA.arange(10)
>>> help(x.count)
Help on method count in module MA.MA:

count(self, axis=None) method of MA.MA.MaskedArray instance
    Count of the non-masked elements in a, or along a certain axis.

>>> x.count()
10
>>>
From: <sa...@hy...> - 2001-11-29 17:48:34
Thanks for the pointer. The example I gave using the sum operation is merely an example - I could also be doing other manipulations such as min, max, average, etc.

I see that the MA.<op>.reduce functions will do what I want, but to do an average, I will need to do two steps since the MA.average function will have the original 'unexpected' behavior that I don't want.

That raises the question of how to determine a count of valid values in a masked array. Can I assume that I can do 'math' on the mask array itself, for example to sum along a given axis and have the masked cells add up?

In my original example, I would expect a sum along the second axis to return [0,0,0,2,0]. Can I rely on this? I would suggest that a .count operator would be very useful in working with masked arrays (count valid and count masked).

>>> m = MA.masked_values(a, -99)
>>> m
array(data =
 [[  1,  2,  3,-99,  5,]
  [ 10, 20, 30,-99, 50,]],
mask =
 [[0,0,0,1,0,]
  [0,0,0,1,0,]],
fill_value=-99)

To add an opinion on the question from Paul about 'expected' behavior, I was working off the documentation for Numerical Python, and there were no caveats in there about MA.<op> working one way, and MA.<op>.reduce working another. The answer is always in the documentation, especially for users like me who don't have time or knowledge to go reading through all the code modules to try and figure out what is happening.

From a purely user standpoint, I would expect a masked array to retain its mask-edness at all times, unless I explicitly tell it not to. In that case, I would still expect it to replace the 'masked' cells with the original masked value, and not just arbitrarily assign some other value, such as 0.

Thanks again for the prompt reply.
From: Giulio B. <giu...@li...> - 2001-11-29 10:09:10
My answer is yes: the difference between the two behaviors could be confusing for the user. If I can dare to express a "general rule", I would say that the masks in MA arrays should not disappear if not EXPLICITLY required to do so! Of course you can interpret a provided value for the fill_value parameter in the sum function as such a request... but if the value is not provided, then I would say that the correct approach would be to keep the mask on (after all, what's special about the value 0? For instance, if you have to take a logarithm in the next step of the calculation, it is a rather bad choice!)

Giulio.

"Paul F. Dubois" wrote:
>
> [dubois@ldorritt ~]$ pydoc MA.sum
> Python Library Documentation: function sum in MA
>
> sum(a, axis=0, fill_value=0)
>     Sum of elements along a certain axis using fill_value for missing.
>
> If you use add.reduce, you'll get what you want.
> >>> print m
> [[1 ,2 ,3 ,-- ,5 ,]
>  [10 ,20 ,30 ,-- ,50 ,]]
> >>> MA.sum(m)
> array([11,22,33, 0,55,])
> >>> MA.add.reduce(m)
> array(data =
>  [ 11, 22, 33,-99, 55,],
> mask =
>  [0,0,0,1,0,],
> fill_value=-99)
>
> In other words,
>     sum(m, axis, fill_value) = add.reduce(filled(m, fill_value), axis)
>
> Surprising in your case. Still, both uses are quite common, so I
> probably was thinking to myself that since add.reduce already does one
> of the jobs, I might as well make sum do the other one. One could have
> just as well argued that one was a synonym for the other and so it is
> revolting to have them be different.
>
> Well, MA users, is this something I should change, or not?
From: Paul F. D. <pa...@pf...> - 2001-11-29 04:31:00
[dubois@ldorritt ~]$ pydoc MA.sum
Python Library Documentation: function sum in MA

sum(a, axis=0, fill_value=0)
    Sum of elements along a certain axis using fill_value for missing.

If you use add.reduce, you'll get what you want.

>>> print m
[[1 ,2 ,3 ,-- ,5 ,]
 [10 ,20 ,30 ,-- ,50 ,]]
>>> MA.sum(m)
array([11,22,33, 0,55,])
>>> MA.add.reduce(m)
array(data =
 [ 11, 22, 33,-99, 55,],
mask =
 [0,0,0,1,0,],
fill_value=-99)

In other words,

    sum(m, axis, fill_value) = add.reduce(filled(m, fill_value), axis)

Surprising in your case. Still, both uses are quite common, so I probably was thinking to myself that since add.reduce already does one of the jobs, I might as well make sum do the other one. One could have just as well argued that one was a synonym for the other and so it is revolting to have them be different.

Well, MA users, is this something I should change, or not?

-----Original Message-----
From: num...@li... [mailto:num...@li...] On Behalf Of Sue Giller
Sent: Wednesday, November 28, 2001 9:03 AM
To: num...@li...
Subject: [Numpy-discussion] Using Reduce with Multi-dimensional Masked array

I posted the following inquiry to pyt...@py... earlier this week, but got no responses, so I thought I'd try a more focused group. I assume MA module falls under NumPy area.

I am using 2 (and more) dimensional masked arrays with some numeric data, and using the reduce functionality on the arrays. I use the masking because some of the values in the arrays are 'missing' and should not be included in the results of the reduction.

For example, assume a 5 x 2 array, with masked values for the 4th entry for both of the 2nd dimension cells. If I want to sum along the 2nd dimension, I would expect to get a 'missing' value for the 4th entry because both of the entries for the sum are 'missing'. Instead, I get 0, which might be a valid number in my data space, and the returned 1 dimensional array has no mask associated with it.

Is this expected behavior for masked arrays or a bug or am I misusing the mask concept? Does anyone know how to get the reduction to produce a masked value?

Example Code:
>>> import MA
>>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]])
>>> a
[[  1,  2,  3,-99,  5,]
 [ 10, 20, 30,-99, 50,]]
>>> m = MA.masked_values(a, -99)
>>> m
array(data =
 [[  1,  2,  3,-99,  5,]
  [ 10, 20, 30,-99, 50,]],
mask =
 [[0,0,0,1,0,]
  [0,0,0,1,0,]],
fill_value=-99)
>>> r = MA.sum(m)
>>> r
array([11,22,33, 0,55,])
>>> t = MA.getmask(r)
>>> print t
None
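A side note for present-day readers: numpy.ma, the successor to MA that now ships with NumPy, resolves this question the way the original poster expected — a reduction over a fully masked slice stays masked. The snippet below is a modern illustration and not part of the 2001 thread.

```python
import numpy as np

a = np.ma.masked_values([[1, 2, 3, -99, 5],
                         [10, 20, 30, -99, 50]], -99)

s = a.sum(axis=0)
print(s)                  # [11 22 33 -- 55]: the all-masked column stays masked
print(np.ma.getmask(s))   # [False False False  True False]
```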
From: <sa...@hy...> - 2001-11-28 17:02:00
I posted the following inquiry to pyt...@py... earlier this week, but got no responses, so I thought I'd try a more focused group. I assume MA module falls under NumPy area.

I am using 2 (and more) dimensional masked arrays with some numeric data, and using the reduce functionality on the arrays. I use the masking because some of the values in the arrays are 'missing' and should not be included in the results of the reduction.

For example, assume a 5 x 2 array, with masked values for the 4th entry for both of the 2nd dimension cells. If I want to sum along the 2nd dimension, I would expect to get a 'missing' value for the 4th entry because both of the entries for the sum are 'missing'. Instead, I get 0, which might be a valid number in my data space, and the returned 1 dimensional array has no mask associated with it.

Is this expected behavior for masked arrays or a bug or am I misusing the mask concept? Does anyone know how to get the reduction to produce a masked value?

Example Code:
>>> import MA
>>> a = MA.array([[1,2,3,-99,5],[10,20,30,-99,50]])
>>> a
[[  1,  2,  3,-99,  5,]
 [ 10, 20, 30,-99, 50,]]
>>> m = MA.masked_values(a, -99)
>>> m
array(data =
 [[  1,  2,  3,-99,  5,]
  [ 10, 20, 30,-99, 50,]],
mask =
 [[0,0,0,1,0,]
  [0,0,0,1,0,]],
fill_value=-99)
>>> r = MA.sum(m)
>>> r
array([11,22,33, 0,55,])
>>> t = MA.getmask(r)
>>> print t
None
From: Konrad H. <hi...@cn...> - 2001-11-28 08:08:35
"eric" <er...@en...> writes:

> The standard version that Robin Dunn distributes is compiled with MSVC.
> If you build a small extension with gcc that makes a wxPython call, it'll
> link just fine, but seg-faults during execution.

> Does anyone know if the same sorta thing is true on the Unices? If it is,
> and Numeric was written in C++ then you'd have to compile extension modules
> that use Numeric arrays with the same compiler that was used to compile
> Numeric. This can lead to all sorts of hassles, and it has made me lean

If you rely on dynamic linking for cross-module calls, you'd have the same problem on Unix, as different compilers use different name-mangling schemes. One way around this would be to limit cross-module calls to C functions compiled with "C" linkage.

Better yet, don't rely on dynamic linking at all and export a module's C API via a Python CObject, as described in the extension manual, and declare all symbols as static (except for the module initialization function, of course). In my experience that is the only method that works on all platforms, with all compilers. Of course this also assumes that interfaces are at the C level.

Konrad.
--
Konrad Hinsen                             | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS)  | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                        | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                     | Deutsch/Esperanto/English/
France                                    | Nederlands/Francais
From: eric <er...@en...> - 2001-11-28 00:15:14
Hey group,

Blitz++ is very cool, but I'm not sure it would make a very good underpinning for reimplementing Numeric. There are 2 (well maybe 3) main points.

1. The first issue deals with how you declare arrays in Blitz++:

    Array<float,3> A(N,N,N);

The big deal here is that the dimensionality of Array is a template parameter, not a constructor parameter. In other words, 2D arrays are effectively a different type than 3D arrays. Numeric, on the other hand, represents arrays of all dimensions with a single class/type. For Python, this makes the most sense. I think you could finagle some way of getting blitz to work, but I'm not sure it would be the desired elegant solution. I've also tinkered with building a simple C++ templated (non-blitz) implementation of Numeric for kicks, but kept coming back to using the dreaded void* to store the data arrays. I still haven't completely given up on a templated solution, but it wasn't as obvious as I thought it would be.

2. Compiling Blitz++ is slooooow. scipy.compiler spits out 200-300 line extension modules at the most. Depending on how complicated expressions are, it can take 0.5-1.5 minutes to compile a single extension function on an 850 MHz PIII. I can't imagine how long it would take to compile Numeric arrays for 1 through 11 dimensions (the most blitz supports, as I remember) for all the different data types with 100s of extension functions. The cost wouldn't be linear because you do pay a one-time hit for some of the template instantiation. Also, I've heard gcc 3.0 might be better. Still, it'd be a painful development process.

3. Portability. This comes at two levels. The first is that blitz++ has heavy-duty requirements of the compiler. gcc works fine, which is a huge plus, but a lot of other compilers don't. MSVC is the most notable of these because it is so heavily used on windows. The second level is the portability of C++ extension modules in general. I've run into this on windows, but I think it is an issue pretty much everywhere. For example, MSVC and GCC compiled C extension libraries can call each other on Windows because they are binary compatible. C++ classes are _not_ binary compatible. This has come up for me with wxPython. The standard version that Robin Dunn distributes is compiled with MSVC. If you build a small extension with gcc that makes a wxPython call, it'll link just fine, but seg-faults during execution. Does anyone know if the same sorta thing is true on the Unices? If it is, and Numeric was written in C++, then you'd have to compile extension modules that use Numeric arrays with the same compiler that was used to compile Numeric. This can lead to all sorts of hassles, and it has made me lean back towards C as the preferred language for something as fundamental as Numeric. (Note that I do like C++ for modules that don't really define an API called by other modules.)

Ok, so maybe there's a 4th point. Paul D. pointed out that blitz isn't much of a win unless you have lazy evaluation (which scipy.compiler already provides). I also think improved speed _isn't_ the biggest goal of a reimplementation (although it can't be sacrificed either). I'm more excited about a code base that more people can comprehend. Perry G. et al's mixed Python/C implementation with the code generators is a very good idea and a step in this direction. I hope the speed issues for small arrays can be solved. I also hope the memory mapped aspect doesn't complicate the code base much.

see ya,
eric
From: William R. <ws...@fa...> - 2001-11-27 23:51:03
At 10:46 AM 11/27/2001 -0800, Chris Barker wrote:
>Hung Jung Lu wrote:
> > Is there any recommendation for fast machines at the
> > price range of a few thousand dollars? (I cannot
> > afford supercomputers or connection machines.) My
> > purpose is to run Monte Carlo simulation. This means
> > that a lot of scenarios can be run in parallel
> > fashion. Of course I can just use regular cheap
> > Pentium boxes... but they are kind of bulky, and I
> > don't need any of the video, audio, USB features (I
>
>I've been looking into setting up a system to do similar work, and it
>looks to me like the best bang for the buck right now are dual Athlon
>systems. If space is an important consideration, you can get dual Athlon
>1U rack mount systems for less than $2000. I'm pretty sure the only dual
>Athlon board currently available (Tyan K7 Thunder) has on board video,
>ethernet and SCSI, which means it cost a little more than it could, but
>these systems are still a pretty good deal if you get one without a hard
>drive (or a very cheap one). I just did a quick web search, and Epox is
>supposed to be coming out with a dual board as well, so there may be
>cheaper options soon.
>
>-Chris

There is a cheaper dual CPU Tyan board which uses the same motherboard chipset. It's the Tyan Tiger MP S2460, which doesn't have SCSI, onboard video, or Ethernet, but is half the price (around $200).

-willryu
From: Chris B. <chr...@ho...> - 2001-11-27 18:27:18
Hung Jung Lu wrote:
> Is there any recommendation for fast machines at the
> price range of a few thousand dollars? (I cannot
> afford supercomputers or connection machines.) My
> purpose is to run Monte Carlo simulation. This means
> that a lot of scenarios can be run in parallel
> fashion. Of course I can just use regular cheap
> Pentium boxes... but they are kind of bulky, and I
> don't need any of the video, audio, USB features (I

I've been looking into setting up a system to do similar work, and it looks to me like the best bang for the buck right now are dual Athlon systems. If space is an important consideration, you can get dual Athlon 1U rack mount systems for less than $2000. I'm pretty sure the only dual Athlon board currently available (Tyan K7 Thunder) has on-board video, ethernet and SCSI, which means it costs a little more than it could, but these systems are still a pretty good deal if you get one without a hard drive (or a very cheap one). I just did a quick web search, and Epox is supposed to be coming out with a dual board as well, so there may be cheaper options soon.

-Chris

--
Christopher Barker, Ph.D.
Chr...@ho...
http://members.home.net/barkerlohmann
Oil Spill Modeling
Water Resources Engineering
Coastal and Fluvial Hydrodynamics
From: <ro...@bl...> - 2001-11-27 17:43:06
>>>>> "HJL" == Hung Jung Lu <hun...@ya...> writes:

  HJL> Again, I have a tangential question. I am hitting the
  HJL> physical limit of the CPU (meaning things have been optimized
  HJL> down to assembly level), in order to achieve even higher
  HJL> performance, the only way to go is hardware.

  HJL> Is there any recommendation for fast machines at the price
  HJL> range of a few thousand dollars? (I cannot afford
  HJL> supercomputers or connection machines.) My purpose is to run
  HJL> Monte Carlo simulation. This means that a lot of scenarios
  HJL> can be run in parallel fashion. Of course I can just use
  HJL> regular cheap Pentium boxes... but they are kind of bulky,
  HJL> and I don't need any of the video, audio, USB features (I
  HJL> think 10 machines at 1GHz each would be the size of
  HJL> calculation power I need, or equivalently, a single machine
  HJL> at an equivalent 10GHz. Heck, if there are some specialized
  HJL> racks/boxes, I can wire the motherboards myself.) I am
  HJL> wondering what you people do for heavy number crunching? Are
  HJL> there any cheap yet specialized machines? What about machines
  HJL> with dual processor? I would imagine a lot of people in the
  HJL> number crunching world run into my situation, and since the
  HJL> number crunching machines don't require much beyond a
  HJL> motherboard and a small hard-drive, maybe there are already
  HJL> some cheap solutions out there.

The usual way is to build some "blackboxes", i.e. mobo/cpu/memory/NIC, diskless or nearly diskless (you don't want to maintain machines :-). Connect them using 100bT or faster networks (though 100bT should be fine).

Do such things exist? Sort of -- they tend to be more expensive than building them yourself, but if you've got a reliable local supplier, they can build them fairly cheaply for you. I'd go with single or dual athlons, myself :-). If power and maintenance is an issue, duals, and if not, maybe singles.

We use MOSIX (www.mosix.org) for transparent load balancing between linux machines, and it could be used on the machines I described (using a floppy or CD to boot).

The next question is whether some form of parallel RNG will help. The answer is "maybe". I worked with a student who evaluated coupled chains, and we couldn't do too much better.

And then after that, is whether you want to figure out how to post-process the results. If you want to automate the whole thing (and it isn't clear that it would be worth it, but...), you could use PyPVM to front-end the sub-processes distributed on the network, load-balanced at the system level by MOSIX.

Now for the problems -- MOSIX seems to have difficulties with Python. Severe difficulties. I don't know if it still holds true for recent MOSIX releases.

(note that I use R (www.r-project.org) for most of my simulation work these days, but am looking at Python for stat analyses, of which MCMC tools are of interest).

best,
-tony

--
A.J. Rossini                            Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics          rossini@u.washington.edu
FHCRC/SCHARP/HIV Vaccine Trials Net     ro...@sc...
-------------- http://software.biostat.washington.edu/ --------------
FHCRC: M-W: 206-667-7025 (fax=4812) | Voicemail is pretty sketchy/use Email
UW: T-Th: 206-543-1044 (fax=3286)   | Change last 4 digits of phone to FAX
Rosen: (Mullins' Lab) Fridays, and I'm unreachable except by email.
From: Hung J. Lu <hun...@ya...> - 2001-11-27 16:27:08
Hi,

Thanks to Jon Saenz and Chris Barker for helping out with fast linear algebra and statistical distribution routines.

Again, I have a tangential question. I am hitting the physical limit of the CPU (meaning things have been optimized down to assembly level); in order to achieve even higher performance, the only way to go is hardware.

Is there any recommendation for fast machines at the price range of a few thousand dollars? (I cannot afford supercomputers or connection machines.) My purpose is to run Monte Carlo simulation. This means that a lot of scenarios can be run in parallel fashion. Of course I can just use regular cheap Pentium boxes... but they are kind of bulky, and I don't need any of the video, audio, USB features (I think 10 machines at 1GHz each would be the size of calculation power I need, or equivalently, a single machine at an equivalent 10GHz. Heck, if there are some specialized racks/boxes, I can wire the motherboards myself.)

I am wondering what you people do for heavy number crunching? Are there any cheap yet specialized machines? What about machines with dual processor? I would imagine a lot of people in the number crunching world run into my situation, and since the number crunching machines don't require much beyond a motherboard and a small hard-drive, maybe there are already some cheap solutions out there.

thanks!

Hung Jung
From: Krishnaswami, N. <ne...@cs...> - 2001-11-27 13:51:41
Perry Greenfield [mailto:pe...@st...] wrote:
> >
> > I know large datasets were one of your driving factors, but I really
> > don't want to make performance on smaller datasets secondary.
>
> That's why we are asking, and it seems so far that there are enough
> of those that do care about small arrays to spend the effort to
> significantly improve the performance.

Well, here's my application. I do data mining work, and one of the techniques I want to use Numpy for is to implement robust regression algorithms like least-trimmed-squares. Now for a k-variable regression, the best-of-breed algorithm for this involves taking hundreds of thousands of k-element samples and calculating the fitting hyperplane through them.

Small matrix performance is thus something this program lives or dies by, and right now it seems like 'dies' is the right measure -- it is about 10x slower than the Gauss program that does the same thing. :(

When I profiled, it seems like Numpy is spending almost all of its time in _castCopyAndTranspose. Switching to the Intel MKL LAPACK had no performance effect, but changing _castCopyAndTranspose into a C function was a 20% speed increase.

If Numpy2 is even slower on small matrices I'd have to give up using it, and that's a shame: it's a *much* nicer environment than Gauss is.

--
Neel Krishnaswami
ne...@cs...
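For readers unfamiliar with the workload being described, here is a rough sketch of that kind of inner loop, written against modern NumPy; the sample count, trimming fraction, and use of lstsq are illustrative assumptions rather than the poster's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def lts_fit(X, y, n_samples=1000, trim=0.5):
    """Crude least-trimmed-squares search: fit many elemental subsets and
    keep the hyperplane with the smallest trimmed sum of squared residuals."""
    n, k = X.shape
    h = int(trim * n)
    best_resid, best_beta = np.inf, None
    for _ in range(n_samples):
        idx = rng.choice(n, size=k, replace=False)          # a k-element sample
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        resid = np.sort((y - X @ beta) ** 2)[:h].sum()      # trimmed residuals
        if resid < best_resid:
            best_resid, best_beta = resid, beta
    return best_beta

X = np.c_[np.ones(200), rng.normal(size=(200, 2))]
y = X @ np.array([1.0, 2.0, -3.0]) + rng.normal(scale=0.1, size=200)
print(lts_fit(X, y))
```

Each lstsq call here works on a tiny k-by-k system, so per-call overhead (argument copying, casting, transposing) rather than floating-point work dominates the run time — exactly the small-array cost being discussed.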
From: Achim G. <Ach...@un...> - 2001-11-27 08:19:05
Ok, there is a clear need for the facility of easy contribution. Please be patient until Friday, December 7th. Then I have time to let it happen.

It is right that the official site for this project is at pygsl.sourceforge.net (Brian Gough, can you change the link on the gsl homepage, thanks :-) )

But I will show some discussion points that must be clear before a cvs release:
- Is the file and directory structure fully expandable, can several persons work in parallel?
- Should classes be created with excellent working objects or should it be a 1:1 wrapper?
- Should there be one interface dynamic library or more than one?
- Is there another way except that of the GPL (personally preferred, but other opinions should be discussed before the contribution of source)?

Some questions of minor weight:
- Is the tuple return value for (value, error) ok in the sf module?
- Test cases are needed.

These questions are the reason why I do not simply "copy" my code into cvs.

Jochen Küpper wrote:
>
> It only provides wrappers for the special functions, but more is to
> come. (Hopefully Achim will put the cvs on sf soon.)
>
> Yes, I agree, PyGSL should be fully integrated with Numpy2, but it
> should probably also remain a separate project -- as Numpy should stay
> a base layer for all kind of numerical stuff and hopefully make it
> into core python at some point (my personal wish, no more, AFAICT!).
>
> I think when PyGSL will fully go to SF (or anything similar) more
> people would start contributing and we should have a fine general
> numerical algorithms library for python soon!

I agree with Jochen and I'd like to move to the core of Python too. But this is far away and I hate monolithic distributions.

If there is the need to discuss separately about PyGSL we can do that here or at the gsl-discuss list mailto:gsl...@so... . But there is also the possibility of a mailing list at pygsl.sourceforge.net . Please let me know.
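For readers unfamiliar with the (value, error) convention mentioned above, a call into the sf module looks roughly like the snippet below; the specific function name and import path are assumptions based on later pygsl releases (GSL special functions return a value together with an estimated absolute error), not something taken from this post.

```python
# Sketch of the (value, error) tuple convention under discussion.
# Assumes pygsl is installed and that sf.bessel_J0 wraps gsl_sf_bessel_J0_e.
from pygsl import sf

value, error = sf.bessel_J0(5.0)
print("J0(5.0) = %g +/- %g" % (value, error))
```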