From: Raymond T. <to...@rt...> - 2003-11-24 15:43:19
>>>>> "Nicolas" == Nicolas Neuss <Nic...@iw...> writes:
Nicolas> From the numbers it is obvious that the call is even much more expensive
Nicolas> than a daxpy for 256 double-floats. How come?
>> as the daxpy for the case +N-short+=256, while calling Lisp functions is
>> much faster. Is it possible to cut down these costs?
>>
>> Thanks, Nicolas.
>>
I'll try to look into this. There's probably some improvement to be
had, but I doubt we can improve it enough for you. I think the
overhead comes from computing the necessary addresses, and also having
to turn off GC during the computation. IIRC, this involves an
unwind-protect which does add quite a bit of code.
Note that I also noticed long ago that a simple vector add in Lisp was
at least as fast as calling BLAS. However, having everything go
through FFI to BLAS at least allows us to take advantage of any
special libraries that might be available.
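For illustration, here is a minimal sketch of the kind of pure-Lisp loop meant above, written as a daxpy (y <- alpha*x + y) rather than a bare vector add. The name LISP-DAXPY is hypothetical and is not part of Matlisp; the sketch assumes both arguments are simple one-dimensional double-float arrays of equal length, and it makes no timing claims.

```lisp
;; Hypothetical helper, not part of Matlisp: y <- alpha*x + y in plain Lisp.
;; Assumes X and Y are (simple-array double-float (*)) of equal length.
(defun lisp-daxpy (alpha x y)
  (declare (type double-float alpha)
           (type (simple-array double-float (*)) x y)
           (optimize (speed 3) (safety 1)))
  (assert (= (length x) (length y)))
  ;; Update Y in place and return it, as the BLAS routine overwrites y.
  (dotimes (i (length x) y)
    (setf (aref y i) (+ (* alpha (aref x i)) (aref y i)))))

;; Usage with 256 elements, the size discussed in this thread.
(let ((x (make-array 256 :element-type 'double-float :initial-element 1d0))
      (y (make-array 256 :element-type 'double-float :initial-element 2d0)))
  (lisp-daxpy 0.5d0 x y)
  (aref y 0))                           ; => 2.5d0
```

With the type declarations in place a compiler can open-code the float arithmetic, and no foreign call, address computation, or GC inhibition is involved.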
I, however, am not opposed to implementing the BLAS in Lisp. Other
LAPACK routines will still use the original BLAS, and Lisp code can
get the faster versions. This will need some thought, design, and
experimentation.
Ray