Re: [Sbcl-devel] on an unsettling Lisp versus Python 'competition' or why isn't Lisp doing better h

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

hi,

i've optimized it a bit which i think is in par with the python speed
if not faster now. two things are attached to the mail: one is the
optimized split-sequence and the other is the diff to sbcl HEAD. this
latter diff contains the previously posted base-char changes to
read-line and an optimization to string-trim. (the big change in
stream.lisp is only the make-result-string macrolet and the (declare
(type index len index)) type declaration).

i was working with a smaller version of the file to keep it in the
disk cache and avoid swapping. it went down from 9.927 to 2.597.

the final form of the defun:

(defun parse-text (&optional (filename "/home/ati/fake-data.txt"))
  (declare (optimize speed (debug 0))
           (inline split-sequence:split-sequence
                   string-trim))
  (with-open-file (in filename
                      :element-type 'base-char
                      :external-format :ascii
                      :direction :input
                      :if-does-not-exist :error)
    (let ((ht (make-hash-table :test 'equal)))
      (loop
         for line of-type simple-base-string = (read-line in nil)
         while line do
           (let ((fields (split-sequence:split-sequence #\~ line)))
             (when (= (length fields) 3)
               (let ((id (string-trim " " (the simple-base-string
(first fields))))
                     (attribute (string-trim " " (the
simple-base-string (second fields))))
                     (value (string-trim " " (the simple-base-string
(third fields)))))
                 (when (not (gethash id ht))
                   (setf (gethash id ht) (make-hash-table :test 'equal)))
                 (let ((fields-ht (gethash id ht)))
                   (setf (gethash attribute fields-ht) value))))))
      ;;(print (hash-table-count ht))
      (values))))

i've recorded the interesting steps:

CL-USER> (progn
           (sb-ext:gc :full t)
           (time (parse-text)))

*** unoptimized:

Evaluation took:
  9.927 seconds of real time
  9.056566 seconds of user run time
  0.856054 seconds of system run time
  [Run times include 2.352 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  1,633,159,856 bytes consed.

*** after applying the patch for read-line to return base-string's:

difference is below measuring error (but it's important for the
split-sequence inlining)

*** after adding (declare (optimize speed (debug 0)))

difference is below measuring error

*** after optimizing and inlining split-sequence (it even had apply
#'position calls!)

Evaluation took:
  4.916 seconds of real time
  4.504282 seconds of user run time
  0.408026 seconds of system run time
  [Run times include 1.044 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  796,540,592 bytes consed.

*** after adding optimization to string-trim that avoids calling
subseq if there's nothing to be trimmed (note: this is explicitly
allowed by the spec):

Evaluation took:
  3.372 seconds of real time
  2.940184 seconds of user run time
  0.428027 seconds of system run time
  [Run times include 1.048 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  728,696,000 bytes consed.

*** after inlining string-trim:

time difference is below measuring error, but consing went down a little.

Evaluation took:
  3.28 seconds of real time
  2.864179 seconds of user run time
  0.396024 seconds of system run time
  [Run times include 1.024 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  718,985,664 bytes consed.

*** annotating (the simple-base-string ...) for the args of the
string-trim calls:

Evaluation took:
  3.107 seconds of real time
  2.740172 seconds of user run time
  0.360022 seconds of system run time
  [Run times include 1.048 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  718,925,760 bytes consed.

*** at this point the profiling looks like this (sorry, it's
unreadable without fixed font, but plain text mails are preferred):

Rank Name                                                    Self% Cumul% Total%
1    READ-LINE                                               39.34  53.08  39.34
     Callers
       PARSE-TEXT                                                   53.08
     Calls
       SB-INT:FAST-READ-CHAR-REFILL                                  9.00
       SB-KERNEL:%SHRINK-VECTOR                                      3.32
       SB-KERNEL:UB32-BASH-COPY                                      1.42
2    (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24)        24.64  24.64  63.98
3    SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII               7.58   9.00  71.56
     Callers
       SB-INT:FAST-READ-CHAR-REFILL                                  9.00
     Calls
       SB-IMPL::REFILL-INPUT-BUFFER                                  1.42
4    PARSE-TEXT                                               6.16  93.84  77.73
     Callers
       NIL                                                          93.84
     Calls
       READ-LINE                                                    53.08
       SB-IMPL::GETHASH3                                             5.21
       SB-VM::GENERIC-+                                              4.74
       MAKE-HASH-TABLE                                               3.32
       SB-KERNEL:UB8-BASH-COPY                                       2.84
       SB-KERNEL:%PUTHASH                                            0.95
5    SB-VM::GENERIC-+                                         4.74   4.74  82.46
6    SB-KERNEL:%SHRINK-VECTOR                                 3.32   3.32  85.78
7    SB-KERNEL:UB8-BASH-COPY                                  2.84   2.84  88.63
8    SB-KERNEL:%SP-STRING-COMPARE                             2.37   2.37  91.00
9    MAKE-HASH-TABLE                                          2.37   3.32  93.36
10   SB-KERNEL:STRING=*                                       1.42   3.79  94.79
11   SB-KERNEL:UB32-BASH-COPY                                 1.42   1.42  96.21
12   (FLET #:BODY-FUN-[GETHASH3]1076)                         0.95   5.21  97.16
13   (FLET SB-IMPL::TRICK)                                    0.47   0.47  97.63
14   SB-IMPL::REFILL-INPUT-BUFFER                             0.47   1.42  98.10
15   SB-IMPL::CEIL-POWER-OF-TWO                               0.47   0.47  98.58
16   (FLET #:CLEANUP-FUN-[%PUTHASH]1409)                      0.47   0.47  99.05
17   SB-KERNEL:%PUTHASH                                       0.47   0.95  99.53
18   SB-IMPL::%MAKE-HASH-TABLE                                0.47   0.47 100.00

*** after rising +ansi-stream-in-buffer-length+ to 4096 and recompiling sbcl

note that consing fall down drastically but due to the problem with
REPLACE that Christoph mentioned it got slower. but hopefully Juho has
something to say here.

Evaluation took:
  3.59 seconds of real time
  3.272205 seconds of user run time
  0.316019 seconds of system run time
  [Run times include 0.528 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  260,260,240 bytes consed.

Rank Name                                                    Self% Cumul% Total%
1    REPLACE                                                 16.42  49.25  16.42
     Callers
       READ-LINE                                                    49.25
       REPLACE                                                       0.37
     Calls
       SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                           10.07
       SB-KERNEL:HAIRY-DATA-VECTOR-SET                               9.70
       SB-KERNEL:HAIRY-DATA-VECTOR-REF                               7.84
       SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                            2.61
       SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                            2.24
       LENGTH                                                        0.37
       REPLACE                                                       0.37
2    SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII              11.57  16.42  27.99
     Callers
       SB-INT:FAST-READ-CHAR-REFILL                                 16.42
     Calls
       SB-IMPL::REFILL-INPUT-BUFFER                                  4.85
3    (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24)        10.82  10.82  38.81
4    READ-LINE                                               10.07  76.12  48.88
     Callers
       PARSE-TEXT                                                   76.12
     Calls
       REPLACE                                                      49.25
       SB-INT:FAST-READ-CHAR-REFILL                                 16.42
       SB-KERNEL:%SHRINK-VECTOR                                      0.37
5    SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                      10.07  10.07  58.96
     Callers
       REPLACE                                                      10.07
6    SB-KERNEL:HAIRY-DATA-VECTOR-SET                          9.70   9.70  68.66
7    PARSE-TEXT                                               7.84  95.90  76.49
8    SB-KERNEL:HAIRY-DATA-VECTOR-REF                          7.84   7.84  84.33
9    SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                       2.61   2.61  86.94
10   SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                       2.24   2.24  89.18
11   SB-KERNEL:UB8-BASH-COPY                                  1.87   1.87  91.04
12   SB-KERNEL:STRING=*                                       1.49   2.24  92.54
13   SB-KERNEL:%PUTHASH                                       0.75   1.12  93.28
14   SB-KERNEL:%SP-STRING-COMPARE                             0.75   0.75  94.03

*** after getting rid of the REPLACE problem with an ugly hack in
ansi-stream-read-line that helps propagating the type to REPLACE:

Evaluation took:
  2.597 seconds of real time
  2.356147 seconds of user run time
  0.240015 seconds of system run time
  [Run times include 0.532 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  260,262,768 bytes consed.

Rank Name                                                    Self% Cumul% Total%
1    PARSE-TEXT                                              16.40  96.83  16.40
2    SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII              14.81  18.52  31.22
     Callers
       SB-INT:FAST-READ-CHAR-REFILL                                 18.52
     Calls
       SB-IMPL::REFILL-INPUT-BUFFER                                  3.70
3    READ-LINE                                               13.76  61.90  44.97
     Callers
       PARSE-TEXT                                                   61.90
     Calls
       REPLACE                                                      29.63
       SB-INT:FAST-READ-CHAR-REFILL                                 18.52
4    (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24)        13.76  13.76  58.73
     Callers
       SB-UNIX::CALL-WITHOUT-INTERRUPTS                             13.76
5    REPLACE                                                  8.47  29.63  67.20
     Callers
       READ-LINE                                                    29.63
       REPLACE                                                       0.53
     Calls
       SB-KERNEL:HAIRY-DATA-VECTOR-REF                               7.41
       SB-KERNEL:HAIRY-DATA-VECTOR-SET                               5.82
       SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                            3.70
       SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                            3.70
       REPLACE                                                       0.53
       LENGTH                                                        0.53
6    SB-KERNEL:HAIRY-DATA-VECTOR-REF                          7.41   7.41  74.60
7    SB-KERNEL:HAIRY-DATA-VECTOR-SET                          5.82   5.82  80.42
8    SB-KERNEL:UB8-BASH-COPY                                  4.23   4.23  84.66
9    SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                       3.70   3.70  88.36
10   SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                       3.70   3.70  92.06
11   (FLET #:BODY-FUN-[%PUTHASH]1355)                         1.59   2.12  93.65
12   SB-KERNEL:%SP-STRING-COMPARE                             1.06   1.06  94.71
13   (FLET #:BODY-FUN-[GETHASH3]1076)                         1.06   2.65  95.77
14   LENGTH                                                   0.53   0.53  96.30
15   SB-KERNEL::%MAKE-INSTANCE-WITH-LAYOUT                    0.53   0.53  96.83
16   SB-IMPL::REFILL-INPUT-BUFFER                             0.53   3.70  97.35
17   SB-KERNEL:%UNARY-TRUNCATE                                0.53   0.53  97.88
18   SB-THREAD::MAKE-SPINLOCK                                 0.53   0.53  98.41
19   (FLET SB-IMPL::TRICK)                                    0.53   0.53  98.94
20   SB-KERNEL:%SXHASH-SIMPLE-STRING                          0.53   1.06  99.47

*** and an interesting last piece: using (ppcre:split "~" line
:sharedp t), adding a (declare (type simple-base-string
target-string)) to split and avoiding apply #'subseq, it's hardly
slower:

Evaluation took:
  3.028 seconds of real time
  2.772173 seconds of user run time
  0.256016 seconds of system run time
  [Run times include 0.556 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  269,862,768 bytes consed.

-- 
 attila

Re: [Sbcl-devel] on an unsettling Lisp versus Python 'competition' or why isn't Lisp doing better h

Common Lisp compiler and runtime

Re: [Sbcl-devel] on an unsettling Lisp versus Python 'competition' or why isn't Lisp doing better here...?