|
From: Andrew S. <str...@as...> - 2006-01-18 19:35:32
|
Paulo J. S. Silva wrote:

>On Wed, 2006-01-18 at 11:15 -0700, Travis Oliphant wrote:
>
>>Will you run these again with the latest SVN version of numpy? I
>>couldn't figure out why a copy was being made on transpose (because it
>>shouldn't have been). Then, I dug deep into the PyArray_FromAny code
>>and found bad logic in when a copy was needed that was causing an
>>inappropriate copy.
>>
>>I fixed that and now wonder how things will change. Because presumably,
>>the dotblas function should handle the situation now...
>
>Good work Travis :-)
>
>Tests       x.T*y     x*y.T       A*x       A*B     A.T*x      half      2in2
>
>Dimension: 5
>Array      0.9000    0.2400    0.2000    0.2600    0.7100    0.9400    1.1600
>Matrix     4.7800    1.5700    0.6200    0.7600    1.0600    3.0400    4.6500
>NumArr     3.2900    0.7400    0.6800    0.7800    8.4800    7.4200   11.6600
>Numeri     1.3300    0.3900    0.3100    0.4200    0.7900    0.6800    0.7600
>Matlab       1.88      0.44      0.41      0.35      0.37      1.20      0.98
>
>Dimension: 50
>Array      9.0000    2.1400    0.5500   18.9500    1.4100    4.2700    4.4500
>Matrix    48.7400    3.9200    1.0100   20.2000    1.8000    6.5000    8.1900
>NumArr    32.3900    2.6800    1.0000   18.9700   13.0300    8.6300   13.0700
>Numeri    13.1000    2.2600    0.6500   18.2700   10.1500    1.0400    3.2600
>Matlab      16.98      1.94      1.07     17.86      0.73      1.57      1.77
>
>Dimension: 500
>Array      1.1400    9.2300    2.0100  168.2700    2.1800    4.0200    4.2900
>Matrix     5.0300    9.3500    2.1500  167.5300    2.1700    4.1100    4.4200
>NumArr     3.4400    9.1000    2.1000  168.7100   21.8400    4.3900    5.8900
>Numeri     1.5800    9.2700    2.0700  167.5600   20.0500    3.4000    4.6800
>Matlab       2.09      6.07      2.17    169.45      2.10      2.56      3.06
>
>Note the 10-fold speed-up for higher dimensions :-)
>
>It looks like numpy now only loses to Matlab in small dimensions.
>Probably, the problem is the creation of the object to represent the
>transposed object. Probably Matlab's creation of objects is very
>lightweight (they only have matrix objects to deal with). Probably
>this phenomenon explains the behavior for the indexing operations too.
>
>Paulo

Here's an idea Fernando and I have briefly talked about off-list, but
which perhaps bears talking about here: Is there speed to be gained by
an alternative, very simple, very optimized ndarray constructor? The
idea would be a special-case constructor with very limited functionality
designed purely for speed. It wouldn't support (m)any of the fantastic
things Travis has done, but would be useful only in specialized use
cases, such as creating indices.

I'm not familiar enough with what the normal constructor does to know if
we could implement something (in C, perhaps) that would do nothing but
create a simple, contiguous array significantly faster than what is
currently done. Or does the current constructor create a new instance
about as fast as possible? I know Travis has optimized it, but it's a
general purpose constructor, and I'm thinking these extra features may
take some extra CPU cycles.

Cheers!
Andrew |
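
One rough way to probe the question above is to time how long it takes just to create a small array through several existing entry points. The sketch below is only an illustration, not part of the benchmark quoted in this thread: it assumes a present-day NumPy installation, and the helper name `time_constructors` and the chosen sizes are hypothetical. It measures per-call creation cost only, so it says nothing about what a stripped-down C constructor could achieve.

```python
import timeit

import numpy as np


def time_constructors(n=5, repeats=20000):
    """Return average seconds per call for several ways of creating a
    small, contiguous float array of length n.

    Hypothetical helper for illustration; not part of NumPy.
    """
    candidates = {
        # Allocate without initializing the contents.
        "np.empty": lambda: np.empty(n),
        # Allocate and zero-fill.
        "np.zeros": lambda: np.zeros(n),
        # Call the ndarray type directly with a shape tuple.
        "np.ndarray": lambda: np.ndarray((n,)),
        # General-purpose path: convert a Python list.
        "np.array(list)": lambda: np.array([0.0] * n),
    }
    return {
        name: timeit.timeit(fn, number=repeats) / repeats
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in time_constructors().items():
        print(f"{name:16s} {seconds * 1e9:10.1f} ns per call")
```

If the uninitialized paths turn out to be much cheaper than the general-purpose one, that would suggest most of the per-object cost sits in argument handling rather than in allocation, which is the kind of overhead a special-case constructor would target.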