From: Raymond T. <to...@rt...> - 2003-11-24 16:35:49
>>>>> "Nicolas" == Nicolas Neuss <Nic...@iw...> writes:
Nicolas> Raymond Toy <to...@rt...> writes:
>> I'll try to look into this. There's probably some improvement to be
>> had, but I doubt we can improve it enough for you. I think the
>> overhead comes from computing the necessary addresses, and also having
>> to turn off GC during the computation. IIRC, this involves an
>> unwind-protect which does add quite a bit of code.
Nicolas> Yes, you are right. I see this now. If switching off multithreading is
Nicolas> expensive, there is a problem here. I don't know enough of these things to
Nicolas> help you here.
It's not multithreading, per se. It's that GC must not suddenly move
the vectors while the foreign call is in progress; otherwise the
foreign function would be reading from and writing to some random
place in memory.
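For reference, the shape of the wrapper is roughly the following. This is only a sketch in CMUCL terms, not Matlisp's actual (generated) wrapper code; `%ddot`/`dot` and the argument handling are illustrative:

```lisp
;; Sketch of what a Matlisp-style BLAS call has to do in CMUCL.
(alien:def-alien-routine ("ddot_" %ddot) double-float
  (n c-call:int :copy)                  ; Fortran takes everything by reference
  (dx sys:system-area-pointer)
  (incx c-call:int :copy)
  (dy sys:system-area-pointer)
  (incy c-call:int :copy))

(defun dot (x y)
  (declare (type (simple-array double-float (*)) x y))
  ;; GC must not move X or Y while foreign code holds their addresses,
  ;; hence WITHOUT-GCING; the unwind-protect it expands into is the
  ;; extra code mentioned above.
  (sys:without-gcing
    (%ddot (length x)
           (sys:vector-sap x) 1
           (sys:vector-sap y) 1)))
```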
>> Note that I also noticed long ago that a simple vector add in Lisp was
>> at least as fast as calling BLAS.
Nicolas> Probably this was before I started using Matlisp.
Yeah, probably before matlisp became matlisp.
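For what it's worth, the kind of Lisp loop I mean is just a typed, unboxed loop; with the right declarations CMUCL compiles it down to code comparable to the Fortran, with no foreign-call setup at all (a sketch, not Matlisp code):

```lisp
;; Simple in-place vector add, Y <- X + Y.  With these declarations
;; CMUCL open-codes the double-float arithmetic and array accesses,
;; so there is no boxing and no wrapper overhead.
(defun v+ (x y)
  (declare (type (simple-array double-float (*)) x y)
           (optimize (speed 3) (safety 0)))
  (dotimes (i (length x) y)
    (setf (aref y i) (+ (aref x i) (aref y i)))))
```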
Nicolas> I will have to do this at least for a small part of the routines, if the
Nicolas> foreign call cannot be achieved with really little overhead (say two times
Nicolas> a Lisp function call). I want to implement flexible sparse block matrices,
A factor of 2 will be very difficult to achieve: a Lisp function call
basically just loads up a few pointers and calls the function, whereas
for a foreign call we need to compute the data addresses, do the
without-gc/unwind-protect dance, load up the registers for the foreign
calling convention, and then make the call.
Nicolas> and choosing Matlisp data for the blocks would be a possibility. But the
Nicolas> blocks can be small, therefore I cannot make compromises when operating on
Nicolas> those blocks.
I assume you've profiled it so that the small blocks really are the
bottleneck?
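(In CMUCL the quickest check is the PROFILE package; something along these lines, where the profiled names are hypothetical stand-ins for your block operations:)

```lisp
;; Instrument the suspected hot spots, run the computation, then see
;; where the time actually goes.
(profile:profile block-gemm block-axpy)   ; hypothetical function names
(run-sparse-solver)                       ; hypothetical driver
(profile:report-time)
(profile:unprofile)
```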
Nicolas> P.S.: BTW, how does ACL perform in this respect? Just today I read Duane
Don't know since I don't have a version of ACL that can run matlisp.
Ray