The sequence functions carry an immense amount of
overhead. As a result, POSITION, COUNT etc. are much
slower than one would expect.
I advocate that CLISP should recognize optimizable
cases and handle them internally. The time to test
internally at run-time for optimizable cases is
negligible compared to the full-blown sequence and
stepping overhead.
Specifically, I advocate that CLISP handles optimizable
cases transparently and does not expose something like
EXT:VECTOR-POSITION. CLISP should encourage portable
code and thus use of POSITION etc., with guarantees
that it will be fast where it can be.
Optimizable cases are, at a glance
o vector instead sequence (possibly only string or
unsigned-byte 8/16/32)
o and test EQ or EQL (or maybe CHAR= and = with
corresponding array element-type)
o with KEY #'IDENTITY
o :from-end is optimizable as well (means almost
duplicate code).
So far, I only implemented string-position, ignoring
test and from-end. Maybe somebody will want to
continue work from this.
fast STRING-POSITION
Logged In: YES
user_id=638212
If you have a special case, you want to put it into its own
internal lisp function, then add a compiler macro that tries
to figure out (at compile time) if any of the special cases
apply. Otherwise, the general case function will attempt to
figure it out at run time and call the appropriate special
case function.
You should look in other lisps and use DISASSEMBLE to
determine which special cases are compiled in this way.
Allegro CL, for example, compiles (ASSOC ... :TEST #'EQ)
specially.
Logged In: YES
user_id=377168
It's precisely the "otherwise" (run-time) cases that I'm
after. Without type inference la Python/CMUCL, there's no
way to know before run-time whether (position elt x) will be
called on a vector. Same for (CONCATENATE 'STRING
,@only-strings). (position char "") does not make for
an interesting compile-time case. Nor does :test
#<some-built-in-function>.
fastseq.d working with clisp-2.30
Logged In: YES
user_id=377168
Here's my fastseq.d working as a module with clisp-2.30 (due
to renaming orgy between 2.28 and 2.30).
Here are timings of my application:
;-- using position
Real time: 135.09164 sec.
Run time: 132.84 sec.
Space: 33_810_380 Bytes
GC: 64, GC time: 0.81 sec.
;-- using ext::vector-position
Real time: 46.52058 sec.
Run time: 44.54 sec.
Space: 57_954_024 Bytes
GC: 110, GC time: 1.53 sec.
[There's more GC because the app even did an extra analysis
in this testcase, which the slow version did not].
It's only using this fast implementation of position that I
can achieve my goal of beating perl 10 times speedwise.