From: Paul B. <Ba...@st...> - 2000-02-08 20:13:12
|
Travis Oliphant writes:

> > 1) The re-use of temporary arrays -- to conserve memory.
>
> Please elaborate about this request.

When Python evaluates the expression:

>>> Y = B*X + A

where A, B, X, and Y are all arrays, B*X creates a temporary array, T. A new
array, Y, is then created to hold the result of T + A, and T is deleted. If T
and Y have the same shape and typecode, then T can be re-used instead of
creating Y, which conserves memory. (A short sketch of this follows this
message.)

> > 2) A copy-on-write option -- to enhance performance.
>
> I need more explanation of this as well.

This would be an advanced feature of arrays that use memory-mapping or access
their data from disk. It is similar to the secondary cache of a CPU: the data
is held in memory until a write request is made.

> > 3) The initialization of arrays by default -- to help novices.
>
> What kind of initialization are you talking about (we have zeros and ones
> and random already)?

For mixed-type (or object) arrays containing strings, zeros() and ones() would
be confusing. Therefore, by default, integer and floating types would be
initialized to 0 and string types to ' ', with an option to leave the array
uninitialized for performance.

> > 4) The creation of a standard API -- which I guess is assumed, if it
> > is to be part of the Python standard distribution.
>
> Any suggestions as to what needs to be changed in the already somewhat
> standard API?

No, not exactly. But the last time I looked, I thought some improvements could
be made to it.

> > 5) The inclusion of IEEE support.
>
> This was supposed to be there from the beginning, but it didn't get
> finished. Jim's original idea was to have two math modules, one which
> checked and gave errors for 1/0 and another that returned IEEE inf for
> 1/0.
>
> The current umath does both with different types, which is annoying.

When I last spoke to Jim about this at IPC6, I was under the impression that
IEEE support was not fully implemented and that much work still needed to be
done. Has this situation changed since then?

> > And
> >
> > 6) Enhanced support for mixed-types or objects.
> >
> > This last issue is very important to me and the astronomical community,
> > since we routinely store data as (multi-dimensional) arrays of
> > fixed-length records or C-structures. A current deficiency of NumPy is
> > that the object typecode does not work with the fromstring() method, so
> > importing arrays of records from a binary file is just not possible.
> > I've been developing my own C-extension type to handle this situation
> > and have come to realize that my record type is really just a
> > generalization of NumPy's types.
>
> I would like to see the code for your generalized type, which would help me
> see if there were some relatively painless way the two could be merged.

recordmodule.c is part of my PyFITS module for dealing with FITS files. You
can find it here:

   ftp://ra.stsci.edu/pub/barrett/PyFITS_0.3.tgz

I use NumPy to access fixed-type arrays and the record type for accessing
mixed-type arrays. A common example is accessing the second element of a
mixed type (i.e. an object) from the entire array. This returns a record type
with a single element, which is equivalent to a NumPy array of fixed type.
Users therefore expect this object to be a NumPy array, but it isn't; they
have to convert it to one.

> > two C-extension types merged. I think this enhancement can be done
> > with minimal change to the current NumPy behavior and minor changes to
> > the typecode system.
> If you already see how to do it, then great.

Note that NumPy already has some support for an Object type. It has been
proposed that it be removed, because it is not well supported and hence few
people use it. I hold the contrary opinion and feel we should enhance the
Object type and make it much more usable. If you don't need it, then you
don't have to use it; this enhancement really shouldn't get in the way of
those who only use fixed-type arrays.

So what changes to NumPy are needed?

1) Instead of a typecode (or in addition to the typecode, for backward
compatibility), I suggest an optional format keyword, which can be used to
specify the mixed-type or object format. Namely, format = 'i, f, s10', where
'i' is an integer type, 'f' a floating-point type, and s10 is a string of 10
characters. (A hypothetical sketch of parsing such a format string follows
this message.)

2) Array access will be the same as it is now. For example:

# Create a 10x10 mixed-type array.
A = array((10, 10), format = 'i, f, 10s')

# Create a 10x10 fixed-type array.
B = array((10, 10), typecode = 'i')

# Print a 5x5 subarray of mixed-type.
print A[:5,:5]

# Print a 5x5 subarray of fixed-type.
print B[:5,:5]
# Or
# (Note that the 3rd index is optional for fixed-type arrays; it
# always defaults to 0.)
print B[:5,:5,0]

# Print the second element of the mixed-type for the entire array.
# Note that this is now an array of fixed-type.
print A[:,:,1]

The major thorn that I see at this point is how to reconcile the behavior of
numbers and strings during operations, but I don't see this as an intractable
problem. I actually believe this enhancement will encourage us to create a
better and more generic multi-dimensional array module by concentrating on
the behavioral aspects of this extension type. Note that J, which NumPy is
based upon, allows such mixed-types.

--
Dr. Paul Barrett                Space Telescope Science Institute
Phone: 410-516-6714             DESD/DPT
FAX:   410-516-8615             Baltimore, MD 21218
|
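[A minimal sketch of the temporary-array point in the message above, written
with today's numpy purely for illustration; the thread itself concerns the
Numeric module. The explicit output-array form of the ufuncs is what lets the
scratch buffer be re-used.]

import numpy as np   # illustration only; not the Numeric module under discussion

n = 1000
a = np.ones((n, n))
b = np.full((n, n), 2.0)
x = np.random.random((n, n))

# The naive expression allocates one temporary for b*x and a second
# array to hold the final sum:
y = b * x + a

# Writing into an explicitly supplied output array re-uses a single
# scratch buffer instead:
t = np.empty_like(x)
np.multiply(b, x, out=t)   # t = b*x
np.add(t, a, out=t)        # t = t + a, re-using the same memory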
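[A hypothetical sketch of the proposed format keyword: the parser below, the
field sizes it assumes (4-byte 'i', 8-byte 'f'), and the returned descriptors
are illustrative only and not an existing NumPy API. It accepts both the
's10' and '10s' spellings used in the message above.]

import re

FIELD = re.compile(r'^(?:s(\d+)|(\d+)s|([if]))$')

def parse_format(format):
    """Turn a format string such as 'i, f, s10' into
    (typecode, itemsize) field descriptors."""
    fields = []
    for token in format.replace(' ', '').split(','):
        m = FIELD.match(token)
        if m is None:
            raise ValueError('bad field: ' + token)
        if m.group(3) == 'i':
            fields.append(('i', 4))    # assume a 4-byte integer
        elif m.group(3) == 'f':
            fields.append(('f', 8))    # assume an 8-byte float
        else:
            fields.append(('s', int(m.group(1) or m.group(2))))
    return fields

print(parse_format('i, f, s10'))   # [('i', 4), ('f', 8), ('s', 10)]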
From: Beausoleil, R. <be...@ex...> - 2000-02-08 21:32:56
|
I've been reading the posts on this topic with considerable interest. For a
moment, I want to emphasize the "code-cleanup" angle more literally than the
functionality mods suggested so far.

Several months ago, I hacked my personal copy of the NumPy distribution so
that I could use the Intel Math Kernel Library for Windows. The IMKL is
(1) freely available from Intel at
    http://developer.intel.com/vtune/perflibst/mkl/index.htm;
(2) basically BLAS and LAPACK, with an FFT or two thrown in for good measure;
(3) optimized for the different x86 processors (e.g., generic x86, Pentium II
    & III);
(4) configured to use 1, 2, or 4 processors; and
(5) configured to use multithreading.
It is an impressive, fast implementation. I'm sure there are similar native
libraries available on other platforms.

Probably due to my inexperience with both Python and NumPy, it took me a
couple of days to successfully tear out the f2c'd stuff and get the IMKL
linking correctly. The parts I've used work fine, but there are probably
other features that I haven't tested yet that still aren't up to snuff. In
any case, the resulting code wasn't very pretty.

As long as the NumPy code is going to be commented and cleaned up, I'd be
glad to help make sure that the process of using a native BLAS/LAPACK
distribution (which was probably compiled using Fortran storage and naming
conventions) is more straightforward. Among the more tedious issues to
consider are:
(1) The extent of the support for LAPACK. Do we want to stick with LAPACK
    Lite?
(2) The storage format. If we've still got row-ordered matrices under the
    hood, and we want to use native LAPACK libraries that were compiled using
    column-major format, then we'll have to be careful to set all of the
    flags correctly (a small sketch of this follows the message). This isn't
    going to be a big deal, _unless_ NumPy will support more of LAPACK when a
    native library is available. Then, of course, there are the special
    cases: the IMKL has both a C and a Fortran interface to the BLAS.
(3) Through the judicious use of header files with compiler-dependent flags,
    we could accommodate the various naming conventions used when the FORTRAN
    libraries were compiled (e.g., sgetrf_ or SGETRF).

The primary output of this effort would be an expansion of the "Compilation
Notes" subsection of Section 15 of the NumPy documentation, and some header
files that make recompilation easier than it is now.

Regards,

Ray

============================
Ray Beausoleil
Hewlett-Packard Laboratories
mailto:be...@hp...
Vox: 425-883-6648
Fax: 425-883-2535
HP Telnet: 957-4951
============================
|
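[A small pure-Python sketch of the storage-format issue in point (2): the
same flat buffer read row-major (as NumPy stores it) and column-major (as a
Fortran-built LAPACK expects it) describes transposed matrices, which is why
the wrapper code has to transpose its arguments or set transposition flags.
The numbers below are illustrative.]

# A 2x3 matrix with its rows laid out contiguously, as NumPy stores it:
flat = [11.0, 12.0, 13.0,
        21.0, 22.0, 23.0]
nrows, ncols = 2, 3

# What row-major (C) code sees:
row_major = [[flat[i * ncols + j] for j in range(ncols)]
             for i in range(nrows)]

# What a column-major (Fortran) routine sees if handed the same buffer and
# told it holds a 3x2 matrix -- exactly the transpose:
col_major = [[flat[i + j * ncols] for j in range(nrows)]
             for i in range(ncols)]

print(row_major)   # [[11.0, 12.0, 13.0], [21.0, 22.0, 23.0]]
print(col_major)   # [[11.0, 21.0], [12.0, 22.0], [13.0, 23.0]]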
From: James R. W. <jr...@go...> - 2000-02-09 03:07:05
|
There is now a linux native BLAS available through links at
http://www.cs.utk.edu/~ghenry/distrib/ courtesy of the ASCI Option Red
Project.

There is also ATLAS (http://www.netlib.org/atlas/).

Either library seems to link into NumPy without a hitch.

----- Original Message -----
From: "Beausoleil, Raymond" <be...@ex...>
To: <num...@li...>
Cc: <mat...@py...>
Sent: Tuesday, February 08, 2000 2:31 PM
Subject: RE: [Matrix-SIG] An Experiment in code-cleanup.
|
From: Paul F. D. <pau...@ho...> - 2000-03-06 18:54:30
|
We are seeking a volunteer developer for Numeric who will remove the current
BLAS/LINPACK lite stuff in favor of linking to whatever the native version is
on a particular machine. The current default is that you have to work harder
to get the good ones than our bad ones; we want to reverse that. We have
gotten a lot of complaints about the current situation, and while we are
aware of the counter-arguments, the Council of Nummies has reached a
consensus to do this.

A truly excited volunteer would widen the amount of stuff that the interface
can get to. They would work via the CVS tree; see http://numpy.sourceforge.net.

Please reply to du...@us....

> -----Original Message-----
> From: num...@li...
> [mailto:num...@li...]On Behalf Of James R. Webb
> Sent: Monday, February 07, 2000 10:04 PM
> To: num...@li...
> Cc: mat...@py...
> Subject: [Numpy-discussion] Re: [Matrix-SIG] An Experiment in code-cleanup.
|
From: <hi...@di...> - 2000-03-07 20:20:04
|
> We are seeking a volunteer developer for Numeric who will remove the current
> BLAS/LINPACK lite stuff in favor of linking to whatever the native version
> is on a particular machine. The current default is that you have to work

This should take several volunteers; nobody has access to all machine types!

> A truly excited volunteer would widen the amount of stuff that the interface
> can get to. They would work via the CVS tree; see

That is not necessary; a full BLAS/LAPACK interface has existed for years,
written by Doug Heisterkamp. In fact, the lapack_lite module is simply a
subset of it.

By some strange coincidence, I worked a bit on this just a few days ago. I
have added a compilation/installation script and thread support (such that
LAPACK calls don't block other threads). You can pick it up at

   ftp://dirac.cnrs-orleans.fr/pub/

as a tar archive and as RPMs for (RedHat) Linux.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
|
From: Michael H. <mh...@bl...> - 2000-03-08 04:50:52
|
Hello,

"Paul F. Dubois" <pau...@ho...> writes:
> We are seeking a volunteer developer for Numeric who will remove the
> current BLAS/LINPACK lite stuff in favor of linking to whatever the
> native version is on a particular machine.

I don't have much time to help out with the python interface, but I have some
(mostly) machine-translated C/C++ header files for LAPACK that might be
useful. These files could be SWIGged as a starting point for a python
binding. Even better, the perl (*yuck*) script that does the translation
could be modified to prototype input vs. output arrays differently (the
script determines which arrays are input vs. output from the comment lines in
the Fortran source; a sketch of that idea follows this message). Then SWIG
typemaps could be written that handle input/output correctly, and much of the
wrapping job would be automated. Of course, all this won't help with row vs.
column storage format.

The header files and a translation script can be obtained from

   http://monsoon.harvard.edu/~mhagger/download/

Unfortunately, I don't have the same thing for BLAS, mostly because the
comments in the BLAS Fortran files are less careful and consistent, making
machine interpretation more difficult.

Please let me know if you find these headers useful.

Yours,
Michael

--
Michael Haggerty
mh...@bl...
|
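[A hypothetical sketch, in Python rather than perl, of classifying LAPACK
arguments as input or output from a routine's comment header. The comment
layout assumed below ("*  A  (input/output) DOUBLE PRECISION array, ...")
follows the usual LAPACK documentation convention, but the regular expression
and the helper itself are illustrative only.]

import re

ARG_LINE = re.compile(r'^\*\s+(\w+)\s+\((input/output|input|output|workspace)\)')

def classify_arguments(fortran_source):
    """Map argument names to 'input', 'output', 'input/output', ..."""
    intents = {}
    for line in fortran_source.split('\n'):
        m = ARG_LINE.match(line)
        if m:
            intents[m.group(1)] = m.group(2).lower()
    return intents

# A shortened, illustrative fragment of a routine's comment header:
header = """\
*  N       (input) INTEGER
*  A       (input/output) DOUBLE PRECISION array, dimension (LDA,N)
*  IPIV    (output) INTEGER array, dimension (N)
*  INFO    (output) INTEGER
"""
print(classify_arguments(header))
# -> {'N': 'input', 'A': 'input/output', 'IPIV': 'output', 'INFO': 'output'}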
From: Konrad H. <hi...@cn...> - 2000-02-09 18:45:50
|
> (1) The extent of the support for LAPACK. Do we want to stick with LAPACK
> Lite?

There has been a full LAPACK interface for a long while, of which LAPACK Lite
is just the subset that is needed for supporting the high-level routines in
the module LinearAlgebra. I seem to have lost the URL to the full version,
but it's on my disk, so I can put it onto my FTP server if there is a need.

> (2) The storage format. If we've still got row-ordered matrices under the
> hood, and we want to use native LAPACK libraries that were compiled using
> column-major format, then we'll have to be careful to set all of the flags
> correctly. This isn't going to be a big deal, _unless_ NumPy will support
> more of LAPACK when a native library is available. Then, of course, there

The low-level interface routines don't take care of this. It's the high-level
Python code (module LinearAlgebra) that sets the transposition argument
correctly. That looks like a good compromise to me.

> (3) Through the judicious use of header files with compiler-dependent flags,
> we could accommodate the various naming conventions used when the FORTRAN
> libraries were compiled (e.g., sgetrf_ or SGETRF).

That's already done!

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
|
From: Beausoleil, R. <be...@ex...> - 2000-02-09 19:18:25
|
From: Konrad Hinsen [mailto:hi...@cn...]
> > (1) The extent of the support for LAPACK. Do we want to stick
> > with LAPACK Lite?
>
> There has been a full LAPACK interface for a long while, of which
> LAPACK Lite is just the subset that is needed for supporting the
> high-level routines in the module LinearAlgebra. I seem to have lost
> the URL to the full version, but it's on my disk, so I can put it
> onto my FTP server if there is a need.

Yes, I'd like to get a copy! You can simply e-mail it to me, if you'd prefer.

> > (2) The storage format. If we've still got row-ordered matrices
> > under the hood, and we want to use native LAPACK libraries that
> > were compiled using column-major format, then we'll have to be
> > careful to set all of the flags correctly. This isn't going to
> > be a big deal, _unless_ NumPy will support more of LAPACK when a
> > native library is available. Then, of course, there ...
>
> The low-level interface routines don't take care of this. It's the
> high-level Python code (module LinearAlgebra) that sets the
> transposition argument correctly. That looks like a good compromise
> to me.

I'll have to look at this more carefully. Due to my relative lack of Python
experience, I hacked the C code so that Fortran routines could be called
instead, producing the expected results.

> > (3) Through the judicious use of header files with compiler-
> > dependent flags, we could accommodate the various naming
> > conventions used when the FORTRAN libraries were compiled (e.g.,
> > sgetrf_ or SGETRF).
>
> That's already done!

Where? Even in the latest f2c'd source code that I downloaded from
SourceForge, I see all names written using the lower-case-trailing-underscore
convention (e.g., dgeqrf_). The Intel MKL was compiled from Fortran source
using the upper-case-no-underscore convention (e.g., DGEQRF). If I replace
dgeqrf_ with DGEQRF in dlapack_lite.c (and a few other tweaks), then the
subsequent link with the IMKL succeeds.

============================
Ray Beausoleil
Hewlett-Packard Laboratories
mailto:be...@hp...
Vox: 425-883-6648
Fax: 425-883-2535
HP Telnet: 957-4951
============================
|
From: Konrad H. <hi...@cn...> - 2000-02-09 21:02:37
|
> > onto my FTP server if there is a need.
>
> Yes, I'd like to get a copy! You can simply e-mail it to me, if you'd
> prefer.

OK, coming soon...

> I'll have to look at this more carefully. Due to my relative lack of Python
> experience, I hacked the C code so that Fortran routines could be called
> instead, producing the expected results.

That's fine, you can simply replace the f2c-generated code by
Fortran-compiled code, as long as the calling conventions are the same. I
have used optimized BLAS as well on some machines.

> Where? Even in the latest f2c'd source code that I downloaded from
> SourceForge, I see all names written using the
> lower-case-trailing-underscore convention (e.g., dgeqrf_). The Intel MKL was

Sure, f2c generates the underscores. But the LAPACK interface code (the one
I'll send you, and also LAPACK Lite) supports both conventions, controlled by
the preprocessor symbol NO_APPEND_FORTRAN (maybe not the most obvious name).
On the other hand, there is no support for uppercase names; that convention
is not used in the Unix world. But I suppose it could be added by machine
transformation of the code.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
|
From: David A. <Da...@Ac...> - 2000-02-08 20:33:49
|
> So what changes to NumPy are needed?
>
> 1) Instead of a typecode (or in addition to the typecode for backward
>    compatibility), I suggest an optional format keyword, which can be
>    used to specify the mixed-type or object format. Namely, format =
>    'i, f, s10', where 'i' is an integer type, 'f' a floating point
>    type, and s10 is a string of 10 characters.

I'd suggest going all the way and making it a real object, not just a string.
That object can then have useful attributes, like size in bytes, maxval,
minval, some indication of precision, etc. Logically, itemsize should be an
attribute of the numeric type of an array, not of the array itself.

--david ascher
|
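[A minimal, hypothetical sketch of the "real object" idea above: the class
name, the attribute set, and the concrete sizes and limits are illustrative
assumptions, not an existing Numeric API.]

class NumericType:
    """A numeric type as a first-class object instead of a one-character
    typecode, carrying its own metadata."""

    def __init__(self, name, typecode, itemsize, minval, maxval):
        self.name = name          # e.g. 'Int32'
        self.typecode = typecode  # kept for backward compatibility
        self.itemsize = itemsize  # size in bytes of one element
        self.minval = minval      # smallest representable value
        self.maxval = maxval      # largest representable value

    def __repr__(self):
        return self.name

Int32   = NumericType('Int32', 'i', 4, -2**31, 2**31 - 1)
Float64 = NumericType('Float64', 'd', 8, -1.7976931348623157e308,
                      1.7976931348623157e308)

# A mixed-type format could then be a sequence of such objects rather than a
# string, and itemsize would live on the type, not on the array:
record_format = (Int32, Float64)
print(sum(t.itemsize for t in record_format))   # bytes per record: 12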