From: JJ <jos...@ya...> - 2006-06-10 22:15:14
Hello. I am a new user of scipy, thinking about crossing over from Matlab. I have a new AMD 64 machine and just installed Fedora 5 and scipy. It is a dual-boot machine with Windows XP. I did a small test to compare the speed of Matlab (32-bit Windows, Matlab student v14) against the speed of scipy (64-bit Fedora). I generated two random matrices of 10,000 by 2,000 elements and then took their dot product. The scipy code was:

python
import numpy
import scipy
a = scipy.random.normal(0,1,[10000,2000])
b = scipy.random.normal(0,1,[10000,2000])
c = scipy.dot(a,scipy.transpose(b))

I timed the last line of the code and compared it to the equivalent code in Matlab. Matlab took 3.3 minutes and scipy took 11.5 minutes. That's roughly a factor of three. I am surprised by the difference and am wondering whether there is anything I can do to speed up scipy.

I installed scipy, BLAS, ATLAS, numpy and LAPACK from source, just as the instructions on the scipy web site suggested (or as close to them as I could manage). The only odd thing was that when installing numpy, I received messages saying the ATLAS libraries could not be found, although it did locate the LAPACK libraries. I don't know why it could not find the ATLAS libraries, as I told it exactly where to look. It did not print the message about falling back to the slower default libraries. I also tried compiling after an export ATLAS= statement, but that made no difference. Wherever I could, I compiled specifically for the 64-bit machine, using the current gcc compiler. The ATLAS notes suggested that the speed problems with the 2.9+ compilers had been fixed.

Any ideas on where to look for a speedup? If the problem is that numpy could not locate the ATLAS libraries, how might I ensure that it finds them? I can recompile and send along the results if that would help. Thanks.

John

PS. I first sent this to the scipy mailing list, but it didn't seem to make it there.
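PPS. For concreteness, the timing was a plain wall-clock measurement around the last line, along these lines (a sketch of the session, not a verbatim transcript):

import time
import scipy

a = scipy.random.normal(0,1,[10000,2000])
b = scipy.random.normal(0,1,[10000,2000])

t0 = time.time()                       # wall-clock start
c = scipy.dot(a,scipy.transpose(b))    # the line being timed
print(time.time() - t0)                # elapsed seconds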
From: Robert K. <rob...@gm...> - 2006-06-10 22:32:06
JJ wrote:
> Any ideas on where to look for a speedup? If the problem is that numpy
> could not locate the ATLAS libraries, how might I ensure that it finds
> them? I can recompile and send along the results if that would help.

Run ldd(1) on the file lapack_lite.so. It should show you which dynamic libraries it is linked against.

> PS. I first sent this to the scipy mailing list, but it didn't seem to
> make it there.

That's okay. This is actually the right place. All of the functions you used are numpy functions, not scipy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
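If it is not obvious where lapack_lite.so landed, Python itself can locate it. A small sketch (it assumes the extension is importable as numpy.linalg.lapack_lite, which is where numpy's build normally puts it):

import numpy.linalg.lapack_lite as lapack_lite

# Print the full path of the compiled extension, then run ldd on it from
# a shell, e.g.:  ldd /usr/lib64/python2.4/site-packages/numpy/linalg/lapack_lite.so
print(lapack_lite.__file__)

An ATLAS-accelerated build should list liblapack/libblas (or the ATLAS libraries themselves) in the ldd output; a fallback build will show little beyond libc and libpthread.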
From: Charles R H. <cha...@gm...> - 2006-06-11 04:47:29
Hmm, I just tried this and it took so long on my machine (Athlon64, fc5_x86_64) that I ctrl-c'd out of it. Running ldd on lapack_lite.so shows

    libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaaace2000)
    libc.so.6 => /lib64/libc.so.6 (0x00002aaaaadfa000)
    /lib64/ld-linux-x86-64.so.2 (0x0000555555554000)

So apparently the ATLAS library present in /usr/lib64/atlas was not linked in. I built numpy from the svn repository two days ago. I expect JJ's version is linked with ATLAS, because mine sure didn't run in 11.5 minutes.

Chuck

On 6/10/06, Robert Kern <rob...@gm...> wrote:
> JJ wrote:
> > Any ideas on where to look for a speedup? If the problem is that numpy
> > could not locate the ATLAS libraries, how might I ensure that it finds
> > them? I can recompile and send along the results if that would help.
>
> Run ldd(1) on the file lapack_lite.so. It should show you which dynamic
> libraries it is linked against.
>
> > PS. I first sent this to the scipy mailing list, but it didn't seem to
> > make it there.
>
> That's okay. This is actually the right place. All of the functions you
> used are numpy functions, not scipy.
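If the build will not pick up ATLAS on its own, one thing worth trying is a site.cfg next to numpy's setup.py before rebuilding. The sketch below is only a guess at the right incantation for the Fedora x86_64 layout; check the section and key names against the site.cfg.example shipped in the numpy source, and the library list against what actually sits in /usr/lib64/atlas:

[atlas]
library_dirs = /usr/lib64/atlas
include_dirs = /usr/include/atlas
atlas_libs = lapack, f77blas, cblas, atlas

After editing it, remove the stale build/ directory before running setup.py again; otherwise distutils may quietly reuse objects that were compiled without ATLAS.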
From: Rob H. <ro...@ho...> - 2006-06-11 08:31:34
JJ wrote:
> python
> import numpy
> import scipy
> a = scipy.random.normal(0,1,[10000,2000])
> b = scipy.random.normal(0,1,[10000,2000])
> c = scipy.dot(a,scipy.transpose(b))

Hi,

My experience with the old Numeric tells me that the first thing I would try to speed this up is to copy the transposed b into a fresh array. It might be that the memory access in dot is very inefficient due to the transposed (and hence large-stride) array. Of course I may be completely wrong.

Rob

--
Rob W.W. Hooft || ro...@ho... || http://www.hooft.net/people/rob/
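Concretely, the experiment Rob describes would look something like this (a sketch; numpy.ascontiguousarray is one way to force the copy, and b.transpose().copy() would do the same job):

import numpy

a = numpy.random.normal(0,1,[10000,2000])
b = numpy.random.normal(0,1,[10000,2000])

# transpose() is free (it only swaps strides), but it leaves b's data laid
# out so that dot has to walk memory with a large stride. Copying the
# transposed view into a fresh C-contiguous array fixes the access pattern.
bt = numpy.ascontiguousarray(numpy.transpose(b))

c = numpy.dot(a, bt)   # both operands now C-contiguous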
From: Paulo J. da S. e S. <pjs...@im...> - 2006-06-11 23:06:12
On Sat, 2006-06-10 at 15:15 -0700, JJ wrote:
> python
> import numpy
> import scipy
> a = scipy.random.normal(0,1,[10000,2000])
> b = scipy.random.normal(0,1,[10000,2000])
> c = scipy.dot(a,scipy.transpose(b))

Interestingly enough, I may have found "the reason". I am using only numpy (I don't have scipy compiled, and it is not necessary for the code above). The problem is probably memory consumption. Let me explain.

After creating a, ipython reports 160 MB of memory usage. After creating b, 330 MB. But when I run the last line, the memory footprint jumps to 1.2 GB! That is four times the original memory consumption. On my computer the result is swapping, and the calculation would take forever. Why is the memory usage getting so high?

Paulo

Obs: As a side note, if you decrease the matrix sizes (to, say, 2000x2000), numpy and Matlab take basically the same time. If the transpose imposes some penalty on numpy, it imposes the same penalty on Matlab (version 6.5, R13).
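Back-of-the-envelope, that footprint is about what plain double-precision storage predicts, so most of the jump is simply the size of the result matrix. A quick check (a sketch, assuming numpy's default 8-byte float64 elements and ignoring any temporaries dot might allocate internally):

bytes_per_elem = 8                         # float64, the default for random.normal
a_bytes = 10000 * 2000 * bytes_per_elem    # 160,000,000 bytes, ~160 MB
b_bytes = 10000 * 2000 * bytes_per_elem    # another ~160 MB
c_bytes = 10000 * 10000 * bytes_per_elem   # the 10000x10000 result: ~800 MB
print(a_bytes + b_bytes + c_bytes)         # ~1.12e9 bytes, close to the observed 1.2 GB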