#9 preliminary fast POSITION

open
nobody
None
5
2003-02-18
2003-02-18
Jörg Höhle
No

The sequence functions carry an immense amount of
overhead. As a result, POSITION, COUNT etc. are much
slower than one would expect.

I advocate that CLISP should recognize optimizable
cases and handle them internally. The time to test
internally at run-time for optimizable cases is
negligible compared to the full-blown sequence and
stepping overhead.

Specifically, I advocate that CLISP handles optimizable
cases transparently and does not expose something like
EXT:VECTOR-POSITION. CLISP should encourage portable
code and thus use of POSITION etc., with guarantees
that it will be fast where it can be.

Optimizable cases are, at a glance
o vector instead sequence (possibly only string or
unsigned-byte 8/16/32)
o and test EQ or EQL (or maybe CHAR= and = with
corresponding array element-type)
o with KEY #'IDENTITY
o :from-end is optimizable as well (means almost
duplicate code).

So far, I only implemented string-position, ignoring
test and from-end. Maybe somebody will want to
continue work from this.

Discussion

  • Jörg Höhle
    Jörg Höhle
    2003-02-18

    fast STRING-POSITION

     
    Attachments
  • Paul F. Dietz
    Paul F. Dietz
    2003-02-18

    Logged In: YES
    user_id=638212

    If you have a special case, you want to put it into its own
    internal lisp function, then add a compiler macro that tries
    to figure out (at compile time) if any of the special cases
    apply. Otherwise, the general case function will attempt to
    figure it out at run time and call the appropriate special
    case function.

    You should look in other lisps and use DISASSEMBLE to
    determine which special cases are compiled in this way.
    Allegro CL, for example, compiles (ASSOC ... :TEST #'EQ)
    specially.

     
  • Jörg Höhle
    Jörg Höhle
    2003-02-18

    Logged In: YES
    user_id=377168

    It's precisely the "otherwise" (run-time) cases that I'm
    after. Without type inference la Python/CMUCL, there's no
    way to know before run-time whether (position elt x) will be
    called on a vector. Same for (CONCATENATE 'STRING
    ,@only-strings). (position char "") does not make for
    an interesting compile-time case. Nor does :test
    #<some-built-in-function>.

     
  • Jörg Höhle
    Jörg Höhle
    2003-02-26

    fastseq.d working with clisp-2.30

     
    Attachments
  • Jörg Höhle
    Jörg Höhle
    2003-02-26

    Logged In: YES
    user_id=377168

    Here's my fastseq.d working as a module with clisp-2.30 (due
    to renaming orgy between 2.28 and 2.30).

    Here are timings of my application:
    ;-- using position
    Real time: 135.09164 sec.
    Run time: 132.84 sec.
    Space: 33_810_380 Bytes
    GC: 64, GC time: 0.81 sec.

    ;-- using ext::vector-position
    Real time: 46.52058 sec.
    Run time: 44.54 sec.
    Space: 57_954_024 Bytes
    GC: 110, GC time: 1.53 sec.
    [There's more GC because the app even did an extra analysis
    in this testcase, which the slow version did not].

    It's only using this fast implementation of position that I
    can achieve my goal of beating perl 10 times speedwise.