[atlas-devel] 3.9.42
From: Clint W. <wh...@cs...> - 2011-06-22 22:04:25
Guys,

I have released 3.9.42. It has a number of bugfixes, as detailed below. I am mainly working on stabilizing things for 3.10, so the main new development I'm working on is improving ATLAS's ability to auto-test and time.

On the testing front, I have added a directory EXtest/, where I'm going to put in a series of one-off testers that are written to look for particular bugs. As time goes on, this directory should fill up with testers written by myself and ATLAS users that test corner cases that the standard testers aren't getting at. Right now, it just has some memory alignment testers for the L2BLAS. If you have tests to contribute, please send them in, including simple things like scripts that call the normal ATLAS testers in ways that have found bugs in the past.

The main new thing is the results/ directory. I have added the capability for ATLAS to automatically time ATLAS versus the system BLAS (or a previous release of ATLAS, to catch performance regressions/improvements), and to generate bar charts with the comparison using ploticus:
   http://ploticus.sourceforge.net/doc/welcome.html

You can also easily extend this framework for arbitrary benchmarking/charting using the simple tvec tools I have provided in this directory. A rough overview of these new tools and charting capabilities is provided in a chapter of ATLAS/doc/atlas_devel.pdf. As usual, you can use the ATLAS framework to measure other BLAS/LAPACK. I hope that if you develop cool measures and charting capabilities with these tools, you'll share them!

The bad news is that this makes the comparison very easy, and right now both ACML and MKL *crush* me on parallel factorization performance. Part of this is that they have better GEMMs, part of it is that they use persistent worker threads, but there is clearly more than this to it. Their numbers jump around a lot, but at their best they can pretty much double my parallel performance (on a platform where they are only 25% faster in serial).
Unless I find a problem in ATLAS artificially depressing my current performance, it will undoubtedly be a while before I can close such an impressive gap.

Let me know if you find the new charting capability helpful.

Cheers,
Clint

ATLAS 3.9.42 released 06/22/11, changes from 3.9.41:
   * Added ability to autobuild performance charts in results/
   * Added EXtest/ and all-alignment testing for GER and GEMV
   * Fixed bug in BETA=0 case of ATL_cgemvN_8x4_sse3.c
   * Added results/ directory that can autobuild performance charts
   * numerous fixes to qrtest and some fixes for the QR fact routines
   * Added missing $(F77SYSLIB) in Make.lib's dylib and ptdylib targets
   * Added chapter in atlas_install explaining how to use mmflagsearch
   * Fixed uninitialized memory read caused by copying data I don't reference
     in parallel GEMM.
   * Fixed uninitialized memory read in gemvT
   * Changed extendedmodel=2, model=5 from Corei2 to Corei1 in archinfo_x86

**************************************************************************
** R. Clint Whaley, PhD ** Assist Prof, UTSA ** www.cs.utsa.edu/~whaley **
**************************************************************************