From: Ingvar <in...@he...> - 2007-12-05 19:02:30
|
James Y Knight writes:
> On Dec 4, 2007, at 8:35 PM, Brian Downing wrote:
[ SNIP ]
> > I would personally really like to see BASE-CHAR/STRING be able to hold
> > iso-8859-1. I've had several cases where I wanted to parse "plain ascii
> > with other random binary garbage I didn't care about", and doing that
> > in SBCL as it stands without taking the unicode hit is pretty painful.
>
> I like that it only holds ASCII, in that it forces the programmer to
> think, and use strings for what they're meant for: text. If you want
> to store random bytes, you ought to be using a byte array. And if you
> want to store text, it's pretty rare that you actually really only
> want LATIN-1 (or if you do want that, it's pretty rare that you
> *should* want that)

For me, essentially every text-processing task I've done in the last 20 years fits quite well inside "just LATIN-1", and frequently the source data is encoded in Latin-1 anyway.

That said, I can certainly see the logic of "BASE-CHAR can only take ASCII", and that probably covers 90% of the times I use strings anyway (for "random binary data" I tend to use arrays of (UNSIGNED-BYTE 8)).

//Ingvar |
From: Stefan L. <lan...@gm...> - 2007-11-22 11:02:01
|
On Thursday, 22 November 2007, Bruno Daniel wrote:
> Juho Snellman writes:
> > As for the slowness issue, since everybody seems to be making
> > wild guesses about the reasons without actually doing any
> > profiling, I'll throw in one more.
>
> Here's the result SBCL's statistical profiler shows on my 64 bit
> machine (The statistics are reproducible up to fluctuations of
> about 1%):

Since SUBSEQ shows up very high on the list: I've improved one or two entries on the Great Computer Language Shootout (they are gone, alioth lost a few days' data around that time) and along the way found that some sequence functions, including SUBSEQ, can be improved by an order of magnitude or more for simple array types.

In my experience this is the main reason why direct ports from Ruby/Python code perform badly under SBCL.

Stefan |
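As an illustration of the kind of specialization described above, here is a minimal sketch of a SUBSEQ variant restricted to simple strings. SIMPLE-STRING-SUBSEQ is a hypothetical name, not an SBCL function, and the sketch makes no claim about how much faster (if at all) it is on any given SBCL version; it only shows where the type information goes.

```lisp
;; Hypothetical helper, for illustration only: a SUBSEQ restricted to
;; simple strings, so the compiler sees the argument types up front.
(defun simple-string-subseq (string start end)
  "Return a fresh string holding the characters of STRING from START below END."
  (declare (type simple-string string)
           (type fixnum start end))
  (let ((result (make-string (- end start)
                             ;; keep the element type of the source string
                             :element-type (array-element-type string))))
    ;; REPLACE returns its first argument, i.e. the freshly made RESULT
    (replace result string :start2 start :end2 end)))
```

Unlike the standard SUBSEQ, this sketch requires END to be supplied and accepts nothing but simple strings, which is exactly the generality it trades away.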
From: Attila L. <att...@gm...> - 2007-11-22 14:19:30
Attachments:
split-sequence.lisp
sbcl.diff
|
hi,

i've optimized it a bit which i think is on par with the python speed if not faster now. two things are attached to the mail: one is the optimized split-sequence and the other is the diff to sbcl HEAD. this latter diff contains the previously posted base-char changes to read-line and an optimization to string-trim. (the big change in stream.lisp is only the make-result-string macrolet and the (declare (type index len index)) type declaration).

i was working with a smaller version of the file to keep it in the disk cache and avoid swapping. it went down from 9.927 to 2.597.

the final form of the defun:

(defun parse-text (&optional (filename "/home/ati/fake-data.txt"))
  (declare (optimize speed (debug 0))
           (inline split-sequence:split-sequence
                   string-trim))
  (with-open-file (in filename
                      :element-type 'base-char
                      :external-format :ascii
                      :direction :input
                      :if-does-not-exist :error)
    (let ((ht (make-hash-table :test 'equal)))
      (loop
         for line of-type simple-base-string = (read-line in nil)
         while line do
           (let ((fields (split-sequence:split-sequence #\~ line)))
             (when (= (length fields) 3)
               (let ((id (string-trim " " (the simple-base-string (first fields))))
                     (attribute (string-trim " " (the simple-base-string (second fields))))
                     (value (string-trim " " (the simple-base-string (third fields)))))
                 (when (not (gethash id ht))
                   (setf (gethash id ht) (make-hash-table :test 'equal)))
                 (let ((fields-ht (gethash id ht)))
                   (setf (gethash attribute fields-ht) value))))))
      ;;(print (hash-table-count ht))
      (values))))

i've recorded the interesting steps:

CL-USER> (progn
           (sb-ext:gc :full t)
           (time (parse-text)))

*** unoptimized:

Evaluation took:
  9.927 seconds of real time
  9.056566 seconds of user run time
  0.856054 seconds of system run time
  [Run times include 2.352 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  1,633,159,856 bytes consed.

*** after applying the patch for read-line to return base-string's:

difference is below measuring error (but it's important for the split-sequence inlining)

*** after adding (declare (optimize speed (debug 0)))

difference is below measuring error

*** after optimizing and inlining split-sequence (it even had apply #'position calls!)

Evaluation took:
  4.916 seconds of real time
  4.504282 seconds of user run time
  0.408026 seconds of system run time
  [Run times include 1.044 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  796,540,592 bytes consed.

*** after adding optimization to string-trim that avoids calling subseq if there's nothing to be trimmed (note: this is explicitly allowed by the spec):

Evaluation took:
  3.372 seconds of real time
  2.940184 seconds of user run time
  0.428027 seconds of system run time
  [Run times include 1.048 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  728,696,000 bytes consed.

*** after inlining string-trim:

time difference is below measuring error, but consing went down a little.

Evaluation took:
  3.28 seconds of real time
  2.864179 seconds of user run time
  0.396024 seconds of system run time
  [Run times include 1.024 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  718,985,664 bytes consed.

*** annotating (the simple-base-string ...) for the args of the string-trim calls:

Evaluation took:
  3.107 seconds of real time
  2.740172 seconds of user run time
  0.360022 seconds of system run time
  [Run times include 1.048 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  718,925,760 bytes consed.

*** at this point the profiling looks like this (sorry, it's unreadable without fixed font, but plain text mails are preferred):

Rank Name                                               Self%  Cumul%  Total%
1    READ-LINE                                          39.34   53.08   39.34
       Callers
         PARSE-TEXT                                             53.08
       Calls
         SB-INT:FAST-READ-CHAR-REFILL                            9.00
         SB-KERNEL:%SHRINK-VECTOR                                3.32
         SB-KERNEL:UB32-BASH-COPY                                1.42
2    (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24)   24.64   24.64   63.98
3    SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII          7.58    9.00   71.56
       Callers
         SB-INT:FAST-READ-CHAR-REFILL                            9.00
       Calls
         SB-IMPL::REFILL-INPUT-BUFFER                            1.42
4    PARSE-TEXT                                          6.16   93.84   77.73
       Callers
         NIL                                                    93.84
       Calls
         READ-LINE                                              53.08
         SB-IMPL::GETHASH3                                       5.21
         SB-VM::GENERIC-+                                        4.74
         MAKE-HASH-TABLE                                         3.32
         SB-KERNEL:UB8-BASH-COPY                                 2.84
         SB-KERNEL:%PUTHASH                                      0.95
5    SB-VM::GENERIC-+                                    4.74    4.74   82.46
6    SB-KERNEL:%SHRINK-VECTOR                            3.32    3.32   85.78
7    SB-KERNEL:UB8-BASH-COPY                             2.84    2.84   88.63
8    SB-KERNEL:%SP-STRING-COMPARE                        2.37    2.37   91.00
9    MAKE-HASH-TABLE                                     2.37    3.32   93.36
10   SB-KERNEL:STRING=*                                  1.42    3.79   94.79
11   SB-KERNEL:UB32-BASH-COPY                            1.42    1.42   96.21
12   (FLET #:BODY-FUN-[GETHASH3]1076)                    0.95    5.21   97.16
13   (FLET SB-IMPL::TRICK)                               0.47    0.47   97.63
14   SB-IMPL::REFILL-INPUT-BUFFER                        0.47    1.42   98.10
15   SB-IMPL::CEIL-POWER-OF-TWO                          0.47    0.47   98.58
16   (FLET #:CLEANUP-FUN-[%PUTHASH]1409)                 0.47    0.47   99.05
17   SB-KERNEL:%PUTHASH                                  0.47    0.95   99.53
18   SB-IMPL::%MAKE-HASH-TABLE                           0.47    0.47  100.00

*** after raising +ansi-stream-in-buffer-length+ to 4096 and recompiling sbcl

note that consing fell drastically but due to the problem with REPLACE that Christoph mentioned it got slower. but hopefully Juho has something to say here.

Evaluation took:
  3.59 seconds of real time
  3.272205 seconds of user run time
  0.316019 seconds of system run time
  [Run times include 0.528 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  260,260,240 bytes consed.

Rank Name                                               Self%  Cumul%  Total%
1    REPLACE                                            16.42   49.25   16.42
       Callers
         READ-LINE                                              49.25
         REPLACE                                                 0.37
       Calls
         SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                     10.07
         SB-KERNEL:HAIRY-DATA-VECTOR-SET                         9.70
         SB-KERNEL:HAIRY-DATA-VECTOR-REF                         7.84
         SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                      2.61
         SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                      2.24
         LENGTH                                                  0.37
         REPLACE                                                 0.37
2    SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII         11.57   16.42   27.99
       Callers
         SB-INT:FAST-READ-CHAR-REFILL                           16.42
       Calls
         SB-IMPL::REFILL-INPUT-BUFFER                            4.85
3    (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24)   10.82   10.82   38.81
4    READ-LINE                                          10.07   76.12   48.88
       Callers
         PARSE-TEXT                                             76.12
       Calls
         REPLACE                                                49.25
         SB-INT:FAST-READ-CHAR-REFILL                           16.42
         SB-KERNEL:%SHRINK-VECTOR                                0.37
5    SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                 10.07   10.07   58.96
       Callers
         REPLACE                                                10.07
6    SB-KERNEL:HAIRY-DATA-VECTOR-SET                     9.70    9.70   68.66
7    PARSE-TEXT                                          7.84   95.90   76.49
8    SB-KERNEL:HAIRY-DATA-VECTOR-REF                     7.84    7.84   84.33
9    SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                  2.61    2.61   86.94
10   SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                  2.24    2.24   89.18
11   SB-KERNEL:UB8-BASH-COPY                             1.87    1.87   91.04
12   SB-KERNEL:STRING=*                                  1.49    2.24   92.54
13   SB-KERNEL:%PUTHASH                                  0.75    1.12   93.28
14   SB-KERNEL:%SP-STRING-COMPARE                        0.75    0.75   94.03

*** after getting rid of the REPLACE problem with an ugly hack in ansi-stream-read-line that helps propagating the type to REPLACE:

Evaluation took:
  2.597 seconds of real time
  2.356147 seconds of user run time
  0.240015 seconds of system run time
  [Run times include 0.532 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  260,262,768 bytes consed.

Rank Name                                               Self%  Cumul%  Total%
1    PARSE-TEXT                                         16.40   96.83   16.40
2    SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII         14.81   18.52   31.22
       Callers
         SB-INT:FAST-READ-CHAR-REFILL                           18.52
       Calls
         SB-IMPL::REFILL-INPUT-BUFFER                            3.70
3    READ-LINE                                          13.76   61.90   44.97
       Callers
         PARSE-TEXT                                             61.90
       Calls
         REPLACE                                                29.63
         SB-INT:FAST-READ-CHAR-REFILL                           18.52
4    (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24)   13.76   13.76   58.73
       Callers
         SB-UNIX::CALL-WITHOUT-INTERRUPTS                       13.76
5    REPLACE                                             8.47   29.63   67.20
       Callers
         READ-LINE                                              29.63
         REPLACE                                                 0.53
       Calls
         SB-KERNEL:HAIRY-DATA-VECTOR-REF                         7.41
         SB-KERNEL:HAIRY-DATA-VECTOR-SET                         5.82
         SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                      3.70
         SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                      3.70
         REPLACE                                                 0.53
         LENGTH                                                  0.53
6    SB-KERNEL:HAIRY-DATA-VECTOR-REF                     7.41    7.41   74.60
7    SB-KERNEL:HAIRY-DATA-VECTOR-SET                     5.82    5.82   80.42
8    SB-KERNEL:UB8-BASH-COPY                             4.23    4.23   84.66
9    SB-IMPL::OPTIMIZED-DATA-VECTOR-REF                  3.70    3.70   88.36
10   SB-IMPL::OPTIMIZED-DATA-VECTOR-SET                  3.70    3.70   92.06
11   (FLET #:BODY-FUN-[%PUTHASH]1355)                    1.59    2.12   93.65
12   SB-KERNEL:%SP-STRING-COMPARE                        1.06    1.06   94.71
13   (FLET #:BODY-FUN-[GETHASH3]1076)                    1.06    2.65   95.77
14   LENGTH                                              0.53    0.53   96.30
15   SB-KERNEL::%MAKE-INSTANCE-WITH-LAYOUT               0.53    0.53   96.83
16   SB-IMPL::REFILL-INPUT-BUFFER                        0.53    3.70   97.35
17   SB-KERNEL:%UNARY-TRUNCATE                           0.53    0.53   97.88
18   SB-THREAD::MAKE-SPINLOCK                            0.53    0.53   98.41
19   (FLET SB-IMPL::TRICK)                               0.53    0.53   98.94
20   SB-KERNEL:%SXHASH-SIMPLE-STRING                     0.53    1.06   99.47

*** and an interesting last piece: using (ppcre:split "~" line :sharedp t), adding a (declare (type simple-base-string target-string)) to split and avoiding apply #'subseq, it's hardly slower:

Evaluation took:
  3.028 seconds of real time
  2.772173 seconds of user run time
  0.256016 seconds of system run time
  [Run times include 0.556 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  269,862,768 bytes consed.

--
attila |
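Two of the steps in the message above, a splitter specialized for a single delimiter character and a trim that returns its argument untouched when nothing needs trimming, can be sketched in portable Common Lisp. SPLIT-ON-CHAR and TRIM-SPACES are hypothetical names chosen for this illustration; they are not the code from the attached split-sequence.lisp or sbcl.diff, which are not reproduced in this thread.

```lisp
;; Hypothetical helpers, for illustration only.

(defun split-on-char (delimiter line)
  "Split LINE at every DELIMITER, returning a list of fresh substrings."
  (declare (type character delimiter)
           (type simple-string line))
  (loop with start of-type fixnum = 0
        for pos = (position delimiter line :start start)
        collect (subseq line start pos)   ; POS is NIL for the last field
        while pos
        do (setf start (1+ pos))))

(defun trim-spaces (string)
  "Like (STRING-TRIM \" \" STRING), but return STRING itself when
nothing needs trimming, so that no SUBSEQ call is made in that case."
  (declare (type simple-string string))
  (let* ((len (length string))
         ;; index of the first non-space character, or LEN if all spaces
         (start (or (position #\Space string :test #'char/=) len))
         ;; one past the last non-space character
         (end (if (= start len)
                  len
                  (1+ (position #\Space string :test #'char/= :from-end t)))))
    (if (and (zerop start) (= end len))
        string
        (subseq string start end))))
```

For a line such as "id1 ~ color ~ red", mapping TRIM-SPACES over the result of SPLIT-ON-CHAR yields the three fields without surrounding spaces; a field that was already clean is returned as the same object rather than copied.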
From: James Y K. <fo...@fu...> - 2007-12-05 18:03:48
|
On Dec 4, 2007, at 8:35 PM, Brian Downing wrote: > On Wed, Dec 05, 2007 at 03:25:07AM +0200, Juho Snellman wrote: >> (There's also the slight problem that some of the speed increase of >> this hack comes from completely ignoring external formats. This would >> be ok if base-strings were defined to contain iso-8859-1, but they're >> actually defined to contain just ascii. Ok for a quick hack, less >> good >> for something included with SBCL. There's always the option of >> defining base-char to map to 8859-1 instead, but that's the >> discussion >> we've had before. I haven't measured whether this is an effect that >> actually matters.) > > I would personally really like to see BASE-CHAR/STRING be able to hold > iso-8859-1. I've had several cases where I wanted to parse "plain > ascii > with other random binary garbage I didn't care about", and doing that > in SBCL as it stands without taking the unicode hit is pretty painful. I like that it only holds ASCII, in that it forces the programmer to think, and use strings for what they're meant for: text. If you want to store random bytes, you ought to be using a byte array. And if you want to store text, it's pretty rare that you actually really only want LATIN-1 (or if you do want that, it's pretty rare that you *should* want that) James |
From: Brian D. <bd-...@la...> - 2007-12-05 18:50:21
|
On Wed, Dec 05, 2007 at 01:03:22PM -0500, James Y Knight wrote:
> I like that it only holds ASCII, in that it forces the programmer to
> think, and use strings for what they're meant for: text. If you want
> to store random bytes, you ought to be using a byte array. And if you
> want to store text, it's pretty rare that you actually really only
> want LATIN-1 (or if you do want that, it's pretty rare that you
> *should* want that)

I think this is a decent theoretical design. Unfortunately, I think it pretty much sucks from a practical standpoint.

Let me use my earlier example. I was parsing RCS files. RCS is a mostly-text-based grammar with some binary sections (i.e. the actual file data). There is no way to know where the binary parts are without parsing the text.

For performance and memory-consumption reasons, I could not deal with using 4x the memory to load an RCS file into an SBCL unicode string. And since the file can contain high-byte characters, I couldn't load it into a base-string as they exist today. Further, I couldn't load the file a "byte at a time", reading the right kind of element (character or byte), as that was very slow.

So, I wound up slurping the whole thing into a byte array, as you recommend above. I think this sucks. Here's why:

I had to rewrite my lexer to work on a byte array, not a string. This means:

* The code is very clumsy, and not at all like what you'd expect for string operations.
* I couldn't use any of the nice built-in CL string functions.
* I couldn't use any of the nice external CL libraries that deal in strings. (*cough* CL-PPCRE *cough*)

The fact is the whole project would have been a lot easier had I been able to reason about the file as a string, efficiently.

Frankly, it's very common to have arbitrary binary data with large amounts of ASCII text. Making this a pain in the ass to deal with efficiently in SBCL doesn't seem like a good practical decision.

Add more interesting games, like mmapping a file into memory and sticking an SBCL base-string header in front of it, and I think you really want to be able to handle 8-bit strings. -bcd |
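To make the clumsiness described above concrete, here is a minimal sketch of looking for an ASCII keyword inside an octet vector. ASCII-OCTETS and FIND-ASCII-TOKEN are hypothetical helpers, not part of SBCL or of the RCS parser mentioned in the message, and the sketch assumes an implementation (such as SBCL) where CHAR-CODE of an ASCII character is its ASCII code.

```lisp
;; Hypothetical helpers, for illustration only.
;; Assumption: CHAR-CODE returns the ASCII code for ASCII characters.

(defun ascii-octets (string)
  "Convert an ASCII-only STRING to a vector of (UNSIGNED-BYTE 8)."
  (map '(simple-array (unsigned-byte 8) (*)) #'char-code string))

(defun find-ascii-token (token octets &key (start 0))
  "Return the index of the first occurrence of TOKEN in OCTETS at or
after START, or NIL if there is none."
  (search (ascii-octets token) octets :start2 start))
```

Every string literal has to be converted before it can be compared against the data, and string-oriented tools have no way to operate on the octet vector directly, which is the practical cost the message complains about.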
From: Nathan F. <fr...@gm...> - 2007-11-22 22:34:07
|
On Nov 22, 2007 6:01 AM, Stefan Lang <lan...@gm...> wrote: > Since SUBSEQ shows up very high on the list: > I've improved one or two entries on the Great Computer > Language Shootout (they are gone, alioth lost a few days > data around that time) and on the way encountered that some > sequence functions, including SUBSEQ, can be improved by an > order of magnitude or more for simple array types. > > In my experience this is the main reason why direct > ports from Ruby/Python code perform bad under SBCL. For better or for worse, SBCL's philosophy is that the general sequence functions are, well, general--with all the type dispatching and such that implies. If you want speed, then you need to declare types at the call site. This is in contrast to many "scripting" languages, where the library functions are going to be pretty speedy. Thinking out loud: would it be worthwhile to specialize things like SUBSEQ--particularly on simple arrays--in the generic library code, similar to the way Juho optimized array access? Doing this sort of optimization for REPLACE/MISMATCH/etc. would probably require too much space and handling all the keyword arguments in functions like FIND or POSITION might be tricky, but the benefits to scripting-language-esque code might be worth it. Failing that, we might try being a little more careful in optimizing things like string functions--I know that the core function for string comparison can be sped up quite a bit by doing type dispatch on the type of strings prior to actually doing the comparison to eliminate jumps in the inner loops, for example. -Nathan |
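The last idea above, dispatching once on the concrete string types so that the inner comparison loop runs on known representations, might be sketched as follows. DISPATCHED-STRING= is a hypothetical name and this is not SBCL's internal string comparison; combinations the sketch does not specialize simply fall back to the standard STRING=.

```lisp
;; Hypothetical function, for illustration only: decide the concrete
;; string representation once, outside the loop, then compare in a loop
;; whose element accesses are on a declared type.
(defun dispatched-string= (a b)
  (declare (type simple-string a b))
  (macrolet ((specialized (type)
               ;; Expands into a comparison loop with both strings
               ;; declared to be of the concrete TYPE.
               `(locally (declare (type ,type a b))
                  (and (= (length a) (length b))
                       (loop for i of-type fixnum from 0 below (length a)
                             always (char= (schar a i) (schar b i)))))))
    (typecase a
      (simple-base-string
       (typecase b
         (simple-base-string (specialized simple-base-string))
         (t (string= a b))))
      ((simple-array character (*))
       (typecase b
         ((simple-array character (*))
          (specialized (simple-array character (*))))
         (t (string= a b))))
      (t (string= a b)))))
```

The cost of this approach is code size: each specialized branch carries its own copy of the loop, which is the space concern raised in the message for functions with many argument combinations.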
From: Nikodemus S. <nik...@ra...> - 2007-11-23 12:08:41
|
On Nov 22, 2007 10:34 PM, Nathan Froyd <fr...@gm...> wrote: > Thinking out loud: would it be worthwhile to specialize things like > SUBSEQ--particularly on simple arrays--in the generic library code, > similar to the way Juho optimized array access? Doing this sort of > optimization for REPLACE/MISMATCH/etc. would probably require too much > space and handling all the keyword arguments in functions like FIND or > POSITION might be tricky, but the benefits to scripting-language-esque > code might be worth it. If we convert first to %REPLACE &co with required arguments only, then the keyword handling doesn't take any space... Cheers, -- Nikodemus |
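One reading of the suggestion above is a two-layer design: a thin keyword-taking front end that resolves defaults, and a worker that takes required arguments only, so that any specialized copies of the worker never contain keyword parsing. The names %COPY-RANGE and COPY-RANGE below are hypothetical and this is not SBCL's actual REPLACE implementation; in particular, the sketch does not handle the case where TARGET and SOURCE are the same vector with overlapping regions.

```lisp
;; Hypothetical functions, for illustration only.

(defun %copy-range (target source start1 end1 start2 end2)
  "Worker with required arguments only: copy elements of SOURCE from
START2 below END2 into TARGET from START1 below END1."
  (declare (type vector target source)
           (type fixnum start1 end1 start2 end2))
  (let ((count (min (- end1 start1) (- end2 start2))))
    (dotimes (i count target)
      (setf (aref target (+ start1 i))
            (aref source (+ start2 i))))))

;; Front end: only this small function deals with keywords and defaults.
(declaim (inline copy-range))
(defun copy-range (target source &key (start1 0) end1 (start2 0) end2)
  (%copy-range target source
               start1 (or end1 (length target))
               start2 (or end2 (length source))))
```

With this split, type-specialized variants would be made of %COPY-RANGE only, and the keyword handling exists once in the front end.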
From: Gary K. <gw...@me...> - 2007-11-25 20:33:31
|
Hi Attila,

Thanks for all this work. Interesting notes.

On Nov 22, 2007, at 9:19 AM, Attila Lendvai wrote:
[ SNIP ]

--
Gary Warren King, metabang.com
Cell: (413) 559 8738
Fax: (206) 338-4052
gwkkwg on Skype * garethsan on AIM |
From: David J. N. <dav...@gm...> - 2007-11-30 23:03:41
|
Hi Attila,

Thanks for all of your work! We're very interested in improving the performance of the program that Gary posted, and have tried some comparisons with and without your changes, and using Python. I used SBCL 1.0.12.7 from the git repo, and the fake-data.tgz file that Gary provided. Here's what I found:

1. Using the code as Gary provided:

Evaluation took:
  22.975 seconds of real time
  20.319805 seconds of user run time
  2.187422 seconds of system run time
  [Run times include 7.712 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  4,192,713,792 bytes consed.

2. Applying your sbcl.diff, using your split-sequence, and using your version of parse-text, I got:

Evaluation took:
  12.328 seconds of real time
  11.457298 seconds of user run time
  0.800819 seconds of system run time
  [Run times include 2.122 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  911,974,696 bytes consed

3. Python: 4.188586 seconds

I'm wondering if you or anyone else had any thoughts on the following questions:

a. While this is a /dramatic/ improvement:

   - It's still much slower than Python, and you mentioned that your results were comparable to Python. Any thoughts why? Did you get the same results on Gary's fake-data.tgz?

   - The reduction in consing in the example was dramatic, but didn't compare to your results. Again, any thoughts why? Did you get the same results on Gary's fake-data.tgz?

b. If I run (sb-ext:gc :full t) after each run, I can repeatedly call PARSE-TEXT with no issues; if I don't make the gc call, I get dropped into the debugger. Is this to be expected?

c. Juho had mentioned that tuning the garbage collector was discussed on this list. Would this address point b., and could someone point me to the appropriate post? I found this post http://sourceforge.net/mailarchive/message.php?msg_id=l37iskysld.fsf%40kyle.netcamp.se but am not sure how to directly apply the discussion.

d.
You wrote: > >*** and an interesting last piece: using (ppcre:split "~" line > >:sharedp t), adding a (declare (type simple-base-string > >target-string)) to split and avoiding apply #'subseq, it's hardly > >slower: To clarify, did you put (declare (type simple-base-string target-string)) in api.lisp? Could you explain how you: "avoiding apply #'subseq"? Again, many thanks for your work on this, and any answers you can provide! Cheers, David On Sun, Nov 25, 2007 at 03:33:17PM -0500, Gary King wrote: > Hi Attila, > > Thanks for all this work. Interesting notes. > > On Nov 22, 2007, at 9:19 AM, Attila Lendvai wrote: > > >hi, > > > >i've optimized it a bit which i think is in par with the python speed > >if not faster now. two things are attached to the mail: one is the > >optimized split-sequence and the other is the diff to sbcl HEAD. this > >latter diff contains the previously posted base-char changes to > >read-line and an optimization to string-trim. (the big change in > >stream.lisp is only the make-result-string macrolet and the (declare > >(type index len index)) type declaration). > > > >i was working with a smaller version of the file to keep it in the > >disk cache and avoid swapping. it went down from 9.927 to 2.597. 
> > > >the final form of the defun: > > > >(defun parse-text (&optional (filename "/home/ati/fake-data.txt")) > > (declare (optimize speed (debug 0)) > > (inline split-sequence:split-sequence > > string-trim)) > > (with-open-file (in filename > > :element-type 'base-char > > :external-format :ascii > > :direction :input > > :if-does-not-exist :error) > > (let ((ht (make-hash-table :test 'equal))) > > (loop > > for line of-type simple-base-string = (read-line in nil) > > while line do > > (let ((fields (split-sequence:split-sequence #\~ line))) > > (when (= (length fields) 3) > > (let ((id (string-trim " " (the simple-base-string > >(first fields)))) > > (attribute (string-trim " " (the > >simple-base-string (second fields)))) > > (value (string-trim " " (the simple-base-string > >(third fields))))) > > (when (not (gethash id ht)) > > (setf (gethash id ht) (make-hash-table :test > >'equal))) > > (let ((fields-ht (gethash id ht))) > > (setf (gethash attribute fields-ht) value)))))) > > ;;(print (hash-table-count ht)) > > (values)))) > > > > > >i've recorded the interesting steps: > > > >CL-USER> (progn > > (sb-ext:gc :full t) > > (time (parse-text))) > > > >*** unoptimized: > > > >Evaluation took: > > 9.927 seconds of real time > > 9.056566 seconds of user run time > > 0.856054 seconds of system run time > > [Run times include 2.352 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 1,633,159,856 bytes consed. > > > > > > > > > >*** after applying the patch for read-line to return base-string's: > > > >difference is below measuring error (but it's important for the > >split-sequence inlining) > > > > > > > > > >*** after adding (declare (optimize speed (debug 0))) > > > >difference is below measuring error > > > > > > > > > >*** after optimizing and inlining split-sequence (it even had apply > >#'position calls!) 
> > > >Evaluation took: > > 4.916 seconds of real time > > 4.504282 seconds of user run time > > 0.408026 seconds of system run time > > [Run times include 1.044 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 796,540,592 bytes consed. > > > > > > > > > >*** after adding optimization to string-trim that avoids calling > >subseq if there's nothing to be trimmed (note: this is explicitly > >allowed by the spec): > > > >Evaluation took: > > 3.372 seconds of real time > > 2.940184 seconds of user run time > > 0.428027 seconds of system run time > > [Run times include 1.048 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 728,696,000 bytes consed. > > > > > > > > > >*** after inlining string-trim: > > > >time difference is below measuring error, but consing went down a > >little. > > > >Evaluation took: > > 3.28 seconds of real time > > 2.864179 seconds of user run time > > 0.396024 seconds of system run time > > [Run times include 1.024 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 718,985,664 bytes consed. > > > > > > > > > >*** annotating (the simple-base-string ...) for the args of the > >string-trim calls: > > > >Evaluation took: > > 3.107 seconds of real time > > 2.740172 seconds of user run time > > 0.360022 seconds of system run time > > [Run times include 1.048 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 718,925,760 bytes consed. 
> > > > > > > > > > > >*** at this point the profiling looks like this (sorry, it's > >unreadable without fixed font, but plain text mails are preferred): > > > >Rank Name Self% > >Cumul% Total% > >1 READ-LINE 39.34 > >53.08 39.34 > > Callers > > PARSE-TEXT > >53.08 > > Calls > > SB-INT:FAST-READ-CHAR- > >REFILL 9.00 > > SB-KERNEL:%SHRINK- > >VECTOR 3.32 > > SB-KERNEL:UB32-BASH- > >COPY 1.42 > >2 (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24) 24.64 > >24.64 63.98 > >3 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII > >7.58 9.00 71.56 > > Callers > > SB-INT:FAST-READ-CHAR- > >REFILL 9.00 > > Calls > > SB-IMPL::REFILL-INPUT- > >BUFFER 1.42 > >4 PARSE-TEXT 6.16 > >93.84 77.73 > > Callers > > NIL > >93.84 > > Calls > > READ-LINE > >53.08 > > SB- > >IMPL::GETHASH3 5.21 > > SB-VM::GENERIC- > >+ 4.74 > > MAKE-HASH- > >TABLE 3.32 > > SB-KERNEL:UB8-BASH- > >COPY 2.84 > > SB-KERNEL:% > >PUTHASH 0.95 > >5 SB-VM::GENERIC-+ > >4.74 4.74 82.46 > >6 SB-KERNEL:%SHRINK-VECTOR > >3.32 3.32 85.78 > >7 SB-KERNEL:UB8-BASH-COPY > >2.84 2.84 88.63 > >8 SB-KERNEL:%SP-STRING-COMPARE > >2.37 2.37 91.00 > >9 MAKE-HASH-TABLE > >2.37 3.32 93.36 > >10 SB-KERNEL:STRING=* > >1.42 3.79 94.79 > >11 SB-KERNEL:UB32-BASH-COPY > >1.42 1.42 96.21 > >12 (FLET #:BODY-FUN-[GETHASH3]1076) > >0.95 5.21 97.16 > >13 (FLET SB-IMPL::TRICK) > >0.47 0.47 97.63 > >14 SB-IMPL::REFILL-INPUT-BUFFER > >0.47 1.42 98.10 > >15 SB-IMPL::CEIL-POWER-OF-TWO > >0.47 0.47 98.58 > >16 (FLET #:CLEANUP-FUN-[%PUTHASH]1409) > >0.47 0.47 99.05 > >17 SB-KERNEL:%PUTHASH > >0.47 0.95 99.53 > >18 SB-IMPL::%MAKE-HASH-TABLE > >0.47 0.47 100.00 > > > > > > > > > > > >*** after rising +ansi-stream-in-buffer-length+ to 4096 and > >recompiling sbcl > > > >note that consing fall down drastically but due to the problem with > >REPLACE that Christoph mentioned it got slower. but hopefully Juho has > >something to say here. 
> > > >Evaluation took: > > 3.59 seconds of real time > > 3.272205 seconds of user run time > > 0.316019 seconds of system run time > > [Run times include 0.528 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 260,260,240 bytes consed. > > > >Rank Name Self% > >Cumul% Total% > >1 REPLACE 16.42 > >49.25 16.42 > > Callers > > READ-LINE > >49.25 > > > >REPLACE 0.37 > > Calls > > SB-IMPL::OPTIMIZED-DATA-VECTOR-SET > >10.07 > > SB-KERNEL:HAIRY-DATA-VECTOR- > >SET 9.70 > > SB-KERNEL:HAIRY-DATA-VECTOR- > >REF 7.84 > > SB-IMPL::OPTIMIZED-DATA-VECTOR- > >REF 2.61 > > SB-IMPL::OPTIMIZED-DATA-VECTOR- > >REF 2.24 > > > >LENGTH 0.37 > > > >REPLACE 0.37 > >2 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 11.57 > >16.42 27.99 > > Callers > > SB-INT:FAST-READ-CHAR-REFILL > >16.42 > > Calls > > SB-IMPL::REFILL-INPUT- > >BUFFER 4.85 > >3 (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24) 10.82 > >10.82 38.81 > >4 READ-LINE 10.07 > >76.12 48.88 > > Callers > > PARSE-TEXT > >76.12 > > Calls > > REPLACE > >49.25 > > SB-INT:FAST-READ-CHAR-REFILL > >16.42 > > SB-KERNEL:%SHRINK- > >VECTOR 0.37 > >5 SB-IMPL::OPTIMIZED-DATA-VECTOR-SET 10.07 > >10.07 58.96 > > Callers > > REPLACE > >10.07 > >6 SB-KERNEL:HAIRY-DATA-VECTOR-SET > >9.70 9.70 68.66 > >7 PARSE-TEXT 7.84 > >95.90 76.49 > >8 SB-KERNEL:HAIRY-DATA-VECTOR-REF > >7.84 7.84 84.33 > >9 SB-IMPL::OPTIMIZED-DATA-VECTOR-REF > >2.61 2.61 86.94 > >10 SB-IMPL::OPTIMIZED-DATA-VECTOR-REF > >2.24 2.24 89.18 > >11 SB-KERNEL:UB8-BASH-COPY > >1.87 1.87 91.04 > >12 SB-KERNEL:STRING=* > >1.49 2.24 92.54 > >13 SB-KERNEL:%PUTHASH > >0.75 1.12 93.28 > >14 SB-KERNEL:%SP-STRING-COMPARE > >0.75 0.75 94.03 > > > > > > > > > >*** after getting rid of the REPLACE problem with an ugly hack in > >ansi-stream-read-line that helps propagating the type to REPLACE: > > > >Evaluation took: > > 2.597 seconds of real time > > 2.356147 seconds of user run time > > 0.240015 seconds of system run time > > [Run times include 0.532 seconds GC run time.] 
> > 0 calls to %EVAL > > 0 page faults and > > 260,262,768 bytes consed. > > > >Rank Name Self% > >Cumul% Total% > >1 PARSE-TEXT 16.40 > >96.83 16.40 > >2 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 14.81 > >18.52 31.22 > > Callers > > SB-INT:FAST-READ-CHAR-REFILL > >18.52 > > Calls > > SB-IMPL::REFILL-INPUT- > >BUFFER 3.70 > >3 READ-LINE 13.76 > >61.90 44.97 > > Callers > > PARSE-TEXT > >61.90 > > Calls > > REPLACE > >29.63 > > SB-INT:FAST-READ-CHAR-REFILL > >18.52 > >4 (FLET #:CLEANUP-FUN-[CALL-WITHOUT-INTERRUPTS]24) 13.76 > >13.76 58.73 > > Callers > > SB-UNIX::CALL-WITHOUT-INTERRUPTS > >13.76 > >5 REPLACE 8.47 > >29.63 67.20 > > Callers > > READ-LINE > >29.63 > > > >REPLACE 0.53 > > Calls > > SB-KERNEL:HAIRY-DATA-VECTOR- > >REF 7.41 > > SB-KERNEL:HAIRY-DATA-VECTOR- > >SET 5.82 > > SB-IMPL::OPTIMIZED-DATA-VECTOR- > >SET 3.70 > > SB-IMPL::OPTIMIZED-DATA-VECTOR- > >REF 3.70 > > > >REPLACE 0.53 > > > >LENGTH 0.53 > >6 SB-KERNEL:HAIRY-DATA-VECTOR-REF > >7.41 7.41 74.60 > >7 SB-KERNEL:HAIRY-DATA-VECTOR-SET > >5.82 5.82 80.42 > >8 SB-KERNEL:UB8-BASH-COPY > >4.23 4.23 84.66 > >9 SB-IMPL::OPTIMIZED-DATA-VECTOR-REF > >3.70 3.70 88.36 > >10 SB-IMPL::OPTIMIZED-DATA-VECTOR-SET > >3.70 3.70 92.06 > >11 (FLET #:BODY-FUN-[%PUTHASH]1355) > >1.59 2.12 93.65 > >12 SB-KERNEL:%SP-STRING-COMPARE > >1.06 1.06 94.71 > >13 (FLET #:BODY-FUN-[GETHASH3]1076) > >1.06 2.65 95.77 > >14 LENGTH > >0.53 0.53 96.30 > >15 SB-KERNEL::%MAKE-INSTANCE-WITH-LAYOUT > >0.53 0.53 96.83 > >16 SB-IMPL::REFILL-INPUT-BUFFER > >0.53 3.70 97.35 > >17 SB-KERNEL:%UNARY-TRUNCATE > >0.53 0.53 97.88 > >18 SB-THREAD::MAKE-SPINLOCK > >0.53 0.53 98.41 > >19 (FLET SB-IMPL::TRICK) > >0.53 0.53 98.94 > >20 SB-KERNEL:%SXHASH-SIMPLE-STRING > >0.53 1.06 99.47 > > > > > > > > > >*** and an interesting last piece: using (ppcre:split "~" line > >:sharedp t), adding a (declare (type simple-base-string > >target-string)) to split and avoiding apply #'subseq, it's hardly > >slower: > > > >Evaluation took: > > 3.028 seconds of real 
time > > 2.772173 seconds of user run time > > 0.256016 seconds of system run time > > [Run times include 0.556 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 269,862,768 bytes consed. > > > >-- > > attila<split- > >sequence.lisp><sbcl.diff>--------------------------------------------- > >---------------------------- > >This SF.net email is sponsored by: Microsoft > >Defy all challenges. Microsoft(R) Visual Studio 2005. > >http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >_______________________________________________ > >Sbcl-devel mailing list > >Sbc...@li... > >https://lists.sourceforge.net/lists/listinfo/sbcl-devel > > -- > Gary Warren King, metabang.com > Cell: (413) 559 8738 > Fax: (206) 338-4052 > gwkkwg on Skype * garethsan on AIM > > > > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2005. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Sbcl-devel mailing list > Sbc...@li... > https://lists.sourceforge.net/lists/listinfo/sbcl-devel |
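The string-trim step quoted above (skip SUBSEQ when there is nothing to trim, which the standard permits) can be sketched in portable Common Lisp roughly as follows. This is only an illustration of the idea, not the patch that was sent to the list; the name TRIM-SPACES-SHARING is made up here, and it handles only the space character.

```lisp
;; Hypothetical helper, not SBCL's STRING-TRIM: returns STRING itself
;; (no consing) when it has no leading or trailing spaces.
(defun trim-spaces-sharing (string)
  (declare (type simple-string string))
  (flet ((spacep (c) (char= c #\Space)))
    (let* ((len (length string))
           ;; index of the first non-space, or LEN if there is none
           (start (or (position-if-not #'spacep string) len))
           ;; one past the last non-space, or LEN if the string is all spaces
           (end (if (= start len)
                    len
                    (1+ (position-if-not #'spacep string :from-end t)))))
      (if (and (zerop start) (= end len))
          string                        ; nothing to trim: share the argument
          (subseq string start end))))) ; otherwise cons a fresh string
```

Callers must then treat the result as possibly sharing structure with the argument, which is the same caveat the standard attaches to STRING-TRIM.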
From: Attila L. <att...@gm...> - 2007-12-01 15:09:40
|
> Hi Attila,
>
> Thanks for all of your work!

i'm glad it helps. i was just pasting the results into a file and at
the end i thought it might be interesting to others.

> I'm wondering if you or anyone else had any thoughts on the following
> questions:
>
> a. While this is a /dramatic/ improvement:
>    - It's still much slower than Python, and you mentioned that your
>      results were comparable to Python. Any thoughts why? Did you get
>      the same results on Gary's fake-data.tgz?

i was using the fake-data file, but chopped off the end at around
40 MB, iirc. i didn't want swapping and disk reading to influence the
measurements.

at the end it was about 1/4th of the initial time, so that 23 secs for
you should have gone down to 5.75. it was on linux x86_64.

but please note that as i've mentioned, applying that
simple-base-string patch to sbcl disables optimizations for a REPLACE
call in ansi-stream-read-line. you can read that in the compiler notes
when C-cC-c'ing ansi-stream-read-line.

it's relatively easy to rewrite the code to reenable it by adding an
ecase for the stream element type and adding two distinct calls to
REPLACE. that turned the 3x speedup for me into a 4x speedup. but
that's a kludge, so it's not included in the patch i've sent. (see
Christophe's comment on a possible proper solution)

>    - The reduction in consing in the example, was dramatic, but didn't
>      compare to your results. Again, any thoughts why? Did you get
>      the same results on Gary's fake-data.tgz?
>
> b. If I run (sb-ext:gc :full t) after each run, I can repeatedly call
> PARSE-TEXT with no issues, if I don't make the gc call, I get dropped
> into the debugger, is this to be expected?

that's a limitation of the current copying gc in sbcl, but i don't
know much more about the details.

> d. You wrote:
> > >*** and an interesting last piece: using (ppcre:split "~" line
> > >:sharedp t), adding a (declare (type simple-base-string
> > >target-string)) to split and avoiding apply #'subseq, it's hardly
> > >slower:
> To clarify, did you put
> (declare (type simple-base-string target-string))
> in api.lisp?

yes

> Could you explain how you: "avoiding apply #'subseq"?

i've replaced the (funcall substr-fn ...) with a simple (subseq ...)
and C-cC-c'd the function (as opposed to my bogus comment about
#'apply i wrote from memory). using funcall/apply hinders the
application of numerous compiler optimizations.

the key thing is using the profiler and checking the compiler notes
about failed optimizations when the (optimize (speed 3)) declaration
is in effect.

i was using sb-sprof with Juho's brilliant slime-profile-browser:

http://jsnell.iki.fi/blog/archive/2006-11-19-sb-sprof.html

hth,

-- 
 attila
|
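A small sketch of the funcall-versus-direct-call point made above. Both function names are hypothetical and are not taken from CL-PPCRE's api.lisp; the assumption is only that a direct call with declared argument types gives the compiler more to work with than a call through an unknown function object.

```lisp
;; Hypothetical: the callee is only known at run time, so the compiler
;; has to emit a generic full call and cannot specialize on the types.
(defun field-via-funcall (substr-fn string start end)
  (funcall substr-fn string start end))

;; Hypothetical: SUBSEQ is called directly and the argument types are
;; declared, so a compiler is free to pick a specialized implementation.
(defun field-via-subseq (string start end)
  (declare (type simple-base-string string)
           (type fixnum start end)
           (optimize speed))
  (subseq string start end))
```

Compiling the second definition under (optimize speed) is also where SBCL prints its notes about optimizations it could not apply, which is the feedback loop described in the message.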
From: Christophe R. <cs...@ca...> - 2007-12-05 21:22:48
|
Brian Downing <bd-...@la...> writes:

> * I couldn't use any of the nice external CL libraries that deal in
> strings. (*cough* CL-PPCRE *cough*)

So, as the major culprit for making the theoretical decision to have
base-strings handle ASCII only, I'll ask: how hard would it be to
adapt CL-PPCRE or similar libraries to work on octet vectors, using
the same programmer-friendly stringy input format that currently
exists?

The other point, parsing file contents where the encoding is not a
priori known, sounds like an excellent argument for implementing (setf
stream-external-format) and bivalent streams, but not so much for
stuffing arbitrary binary data into a sequence of characters.

I'll admit, I have a strong bias against pandering to a particular
interpretation of what happens to be convenient today when that is the
only argument that is advanced for doing something; if people could
convince me that latin-1 is the Right Thing in all circumstances for
BASE-CHAR to represent, then I would not argue against it. (Duh. :-)

Cheers,

Christophe
|
From: Brian D. <bd-...@la...> - 2007-12-05 22:24:27
|
On Wed, Dec 05, 2007 at 09:22:37PM +0000, Christophe Rhodes wrote:
> Brian Downing <bd-...@la...> writes:
> > * I couldn't use any of the nice external CL libraries that deal in
> > strings. (*cough* CL-PPCRE *cough*)
>
> So, as the major culprit for making the theoretical decision to have
> base-strings handle ASCII only, I'll ask: how hard would it be to
> adapt CL-PPCRE or similar libraries to work on octet vectors, using
> the same programmer-friendly stringy input format that currently
> exists?

I dunno, but I do know that I didn't want to. :)

> The other point, parsing file contents where the encoding is not a
> priori known, sounds like an excellent argument for implementing (setf
> stream-external-format) and bivalent streams, but not so much for
> stuffing arbitrary binary data into a sequence of characters.

The thing is what I really want is a "bivalent array." I wanted it
all, in memory, ready to be accessed in either form. Calling stream
accessors per character was way too slow. A latin-1 base-string is as
close as I could figure to getting that and still being able to do
string operations on it.

> I'll admit, I have a strong bias against pandering to a particular
> interpretation of what happens to be convenient today when that is the
> only argument that is advanced for doing something; if people could
> convince me that latin-1 is the Right Thing in all circumstances for
> BASE-CHAR to represent, then I would not argue against it. (Duh. :-)

Well, latin-1 is already a higher-class citizen than any other
encoding, as the first 256 Unicode code points are latin-1.

Also, it's not as if the first 128 are any more stable. A lot of
Japanese code pages have #\¥ at #x5C, for instance. And then there's
EBCDIC...

So I'm not sure assuming latin-1 for BASE-CHAR is any less sane than
assuming ASCII for BASE-CHAR. And it's quite a bit more convenient for
things like I just described.

-bcd
|
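The remark above that the first 256 Unicode code points are latin-1 is what makes an octet-to-character mapping trivial. A rough sketch, assuming an implementation whose character codes are Unicode code points (true of SBCL built with Unicode support); the function name is hypothetical and this is not an existing SBCL facility. It copies the data, so it is not the zero-copy "bivalent array" asked for, only the conversion step.

```lisp
;; Hypothetical helper: decode a slice of an octet vector as latin-1 by
;; mapping each octet straight to the character with the same code.
(defun octets-to-latin-1-string (octets &key (start 0) (end (length octets)))
  (declare (type (simple-array (unsigned-byte 8) (*)) octets))
  (let ((string (make-string (- end start))))
    (loop for i from start below end
          for j from 0
          do (setf (char string j) (code-char (aref octets i))))
    string))
```

Keeping the data as an (UNSIGNED-BYTE 8) vector and decoding only the slices that are really text keeps the binary parts out of character strings altogether.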
From: David J. N. <dav...@gm...> - 2007-12-03 22:12:05
|
Hi Juho,

Many thanks for taking the time to craft a solution - it runs
extremely fast - much faster than the Python script and dramatically
reduced the number of bytes consed:

Evaluation took:
  1.937 seconds of real time
  1.791011 seconds of user run time
  0.119532 seconds of system run time
  [Run times include 0.432 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  103,786,536 bytes consed.

However, on the third attempt to run it I get

System call error 12 (Cannot allocate memory)
   [Condition of type SB-POSIX:SYSCALL-ERROR]

is there a way around this?

> Ok. May I ask why you're interested in this?

Sure, for two reasons:

1. We have a production app that processes large quantities of textual
and binary data. As part of the textual side, we're trying to tune
some learning algorithms, and want to work in CL.

2. Reading text files is a common task, and we're assuming that if
we're running into problems, it's likely someone else is also, and
this could be a barrier to them adopting CL.

> Reading text files and manipulating strings are really the bread and
> butter features of scripting languages, and will have been optimized
> accordingly. So it's somewhat unreasonable to expect SBCL to be
> exactly as fast in the generic case.

OK, but:

LispWorks performs quite nicely compared to the Python program's
4.246053 seconds:

LispWorks against a text file w/out unicode using original code
as posted by Gary:
User time    =        6.106
System time  =        0.202
Elapsed time =        6.349
Allocation   = 756144540 bytes
0 Page faults
Timing the evaluation of (PARSE-TEXT "fake-data.txt")

LispWorks against a text file w/out unicode using original code
as posted by Gary using cl-ppcre:split rather than split-sequence:
split-sequence:
User time    =        6.426
System time  =        0.240
Elapsed time =        6.716
Allocation   = 602020240 bytes
0 Page faults
Timing the evaluation of (PARSE-TEXT-USING-SPLIT "fake-data.txt")

SBCL 1.0.12.14, with --dynamic-space-size 1500, against a text file
w/out unicode using original code as posted by Gary:
TEXT-PARSER> (time (parse-text "fake-data.txt"))
Evaluation took:
  19.346 seconds of real time
  16.760124 seconds of user run time
  2.497303 seconds of system run time
  [Run times include 8.315 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  4,183,290,560 bytes consed.

SBCL 1.0.12.14 with --dynamic-space-size 1500, against a text file
w/out unicode using original code as posted by Gary using
cl-ppcre:split rather than split-sequence:
Evaluation took:
  16.284 seconds of real time
  13.648916 seconds of user run time
  2.546111 seconds of system run time
  [Run times include 8.323 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  3,356,522,032 bytes consed.

And, then there's the gc issue - the original sbcl program can't be
run twice without dropping into the gc ... so thanks for the link
about gc characteristics and tuning:

> http://thread.gmane.org/gmane.lisp.steel-bank.devel/9879/focus=9883

unfortunately, after doing
  (define-alien-variable gencgc-oldest-gen-to-gc char)
  (setf gencgc-oldest-gen-to-gc 0)
at the top level, the program drops into the ldb before
completing even one run.

BTW, Attila's solution can be run repeatedly without memory issues, I
assume due to the "non-unicode" savings.

My question to the sbcl developers is: Is there a solution to reading
a plain (non-unicode) text file into a data structure, that can be
incorporated into sbcl, maybe something along the lines of what Attila
provided, that will result in speed and memory usage on par with
Python, or LispWorks, or in the best case Juho's program?

Again, many thanks for the help - we hope the solutions being offered
here are useful to others!

Cheers,
David

On Sun, Dec 02, 2007 at 12:58:52AM +0200, Juho Snellman wrote:
> "David J. Neu" <dav...@gm...> writes:
> > Hi Attila,
> >
> > Thanks for all of your work!
> >
> > We're very interested in improving the performance of the program that
> > Gary posted, and have tried some comparisons with and without your
> > changes, and using Python.
>
> Ok. May I ask why you're interested in this? Reading text files and
> manipulating strings are really the bread and butter features of
> scripting languages, and will have been optimized accordingly. So it's
> somewhat unreasonable to expect SBCL to be exactly as fast in the
> generic case.
>
> That said, I've attached a version of the benchmark program that runs
> in 2.5s versus 4.3s for the Python version.
>
> ;;;; Utilities to mmap a file directly into an SBCL base string
>
> (defmacro with-mmaped-base-string ((string file) &body body)
>   (let ((handle-var (gensym))
>         (stream-var (gensym)))
>     `(with-open-file (,stream-var ,file)
>        (let ((,handle-var (mmap-as-base-string ,stream-var)))
>          (unwind-protect
>               (let ((,string (mmap-handle-string ,handle-var)))
>                 ,@body)
>            (mmap-close ,handle-var))))))
>
> (defstruct mmap-handle
>   (string (coerce "" 'base-string) :type simple-base-string)
>   fd
>   address
>   length)
>
> (defun mmap-close (handle)
>   (sb-posix:munmap (mmap-handle-address handle)
>                    (mmap-handle-length handle)))
>
> (defun mmap-as-base-string (stream)
>   (declare (optimize debug)
>            (notinline sb-posix::mmap))
>   (with-open-file (devnull "/dev/null")
>     (let* ((length (file-length stream))
>            (sap1 (sb-posix:mmap nil
>                                 (+ length 4096)
>                                 (logior sb-posix:prot-read sb-posix:prot-write)
>                                 sb-posix:map-private
>                                 (sb-impl::fd-stream-fd stream)
>                                 0))
>            (sap2 (sb-posix:mmap (sb-sys:sap+ sap1 4096)
>                                 length
>                                 sb-posix:prot-read
>                                 (logior sb-posix:map-private sb-posix:map-fixed)
>                                 (sb-impl::fd-stream-fd stream)
>                                 0))
>            (handle (make-mmap-handle :address sap1
>                                      :length length)))
>       ;; simple-base-string header word
>       (setf (sb-sys:sap-ref-word sap2 (- (* 2 sb-vm:n-word-bytes)))
>             sb-vm:simple-base-string-widetag)
>       ;; simple-base-string length word (as fixnum)
>       (setf (sb-sys:sap-ref-word sap2 (- (* 1 sb-vm:n-word-bytes)))
>             (ash length sb-vm:n-fixnum-tag-bits))
>       (setf (mmap-handle-string handle)
>             (sb-kernel:%make-lisp-obj
>              (logior sb-vm:other-pointer-lowtag
>                      (- (sb-sys:sap-int sap2) (* 2 sb-vm:n-word-bytes)))))
>       handle)))
>
> ;;;; Implement the benchmark
>
> (defun split-and-trim-sequence (delimiter string start-of-line end-of-line trim)
>   (declare (type simple-base-string string)
>            (type fixnum start-of-line end-of-line)
>            (optimize speed))
>   (loop for start = start-of-line then (1+ end)
>         for end = (position delimiter string :start start :end end-of-line)
>         for end-pos = (or end end-of-line)
>         for length = (- end-pos start)
>         do (loop while (< start end-pos)
>                  while (eql (aref string start) trim)
>                  do (incf start))
>         do (loop until (eql end-pos start)
>                  while (eql (aref string (1- end-pos)) trim)
>                  do (decf end-pos))
>         collect (if (< length 64)
>                     ;; A displaced array takes around 64 bytes. For strings
>                     ;; shorter than 64 base-chars we might as well just
>                     ;; make a simple string.
>                     (subseq string start end-pos)
>                     (make-array length
>                                 :element-type 'base-char
>                                 :displaced-to string
>                                 :displaced-index-offset start))
>         while (and end (< end end-of-line))))
>
> (defun parse-text (filename)
>   (declare (optimize speed))
>   (with-mmaped-base-string (string filename)
>     ;; Note that the contents of the hash-table won't be valid outside
>     ;; the dynamic scope of WITH-MMAPED-BASE-STRING, since we're
>     ;; displacing arrays to STRING.
>     (let ((ht (make-hash-table :test 'equal)))
>       (loop for start = 0 then (1+ end)
>             for end = (position #\Newline string :start start)
>             for end-pos = (or end (length string))
>             do (let ((fields (split-and-trim-sequence #\~ string start end-pos
>                                                       #\space)))
>                  (when (= (length (the list fields)) 3)
>                    (destructuring-bind (id attribute value) fields
>                      (when (not (gethash id ht))
>                        (setf (gethash id ht) (make-hash-table :test 'equal)))
>                      (let ((fields-ht (gethash id ht)))
>                        (setf (gethash attribute fields-ht) value)))))
>             while end)
>       (print (hash-table-count ht))
>       nil)))
>
> (You might object that the program is more complex than the original
> code. But if the factor of 2 difference in runtimes between the
> earlier Lisp solution and Python is an issue for you, surely a little
> extra complexity is a price worth paying for something that's a factor
> of 2 faster than the Python program.)
>
> > b. If I run (sb-ext:gc :full t) after each run, I can repeatedly call
> > PARSE-TEXT with no issues, if I don't make the gc call, I get dropped
> > into the debugger, is this to be expected?
>
> Yes.
>
> > c. Juho had mentioned that tuning the garbage collector was discussed
> > on this list, would this address point b., and could someone point me
> > to the appropriate post?
> >
> > I found this post
> > http://sourceforge.net/mailarchive/message.php?msg_id=l37iskysld.fsf%40kyle.netcamp.se
> > but am not sure how to directly apply the discussion.
>
> That is a completely unrelated discussion. See for example:
>
> http://thread.gmane.org/gmane.lisp.steel-bank.devel/9879/focus=9883
>
> --
> Juho Snellman
|
From: John W. <jo...@ne...> - 2007-12-04 03:18:34
|
This might not be apropos, but I just came across this tech note from
Allegro where they are doing a very similar thing with CL: using it to
parse strings from gigantic text files. The way they combatted the
performance issue was to custom-build a non-consing version of
read-line. I wonder if such a thing would improve SBCL's performance
in this example?

http://www.franz.com/support/tech_corner/cons-tricks-121306.lhtml

John

On Dec 3, 2007, at 6:11 PM, David J. Neu wrote:

> Many thanks for taking the time to craft a solution - it runs
> extremely fast - much faster than the Python script and dramatically
> reduced the number of bytes consed:
>
> Evaluation took:
>   1.937 seconds of real time
>   1.791011 seconds of user run time
>   0.119532 seconds of system run time
>   [Run times include 0.432 seconds GC run time.]
>   0 calls to %EVAL
>   0 page faults and
>   103,786,536 bytes consed.
|
From: David J. N. <dav...@gm...> - 2007-12-04 12:21:45
|
Yes, this would be very nice, and there have also been many
suggestions in this thread about "improving" read-line itself, such
as:

- having its return type depend on its stream's element-type, so e.g.
  that it would in this example return a base-string

- increasing +ANSI-STREAM-IN-BUFFER-LENGTH+

Cheers,
David

On Mon, Dec 03, 2007 at 11:17:58PM -0400, John Wiegley wrote:
> This might not be apropos, but I just came across this tech note from
> Allegro where they are doing a very similar thing with CL: using it to
> parse strings from gigantic text files. The way they combatted the
> performance issue was to custom-build a non-consing version of
> read-line. I wonder if such a thing would improve SBCL's performance
> in this example?
>
> http://www.franz.com/support/tech_corner/cons-tricks-121306.lhtml
>
> John
>
> On Dec 3, 2007, at 6:11 PM, David J. Neu wrote:
>
> > Many thanks for taking the time to craft a solution - it runs
> > extremely fast - much faster than the Python script and dramatically
> > reduced the number of bytes consed:
> >
> > Evaluation took:
> >   1.937 seconds of real time
> >   1.791011 seconds of user run time
> >   0.119532 seconds of system run time
> >   [Run times include 0.432 seconds GC run time.]
> >   0 calls to %EVAL
> >   0 page faults and
> >   103,786,536 bytes consed.
|
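To make the "non-consing read-line" idea concrete, here is a minimal portable sketch that refills one caller-supplied buffer instead of allocating a fresh string per line. It is neither the Franz implementation nor anything in SBCL; READ-LINE-INTO is a made-up name, and because it goes through READ-CHAR per character it illustrates the buffer reuse only, not the speed.

```lisp
;; Hypothetical helper. BUFFER must be an adjustable string with a fill
;; pointer. Returns BUFFER holding the next line (without the newline),
;; or NIL at end of file when no characters were read.
(defun read-line-into (buffer stream)
  (setf (fill-pointer buffer) 0)
  (loop for char = (read-char stream nil nil)
        do (cond ((null char)
                  (return (if (zerop (fill-pointer buffer)) nil buffer)))
                 ((char= char #\Newline)
                  (return buffer))
                 (t
                  (vector-push-extend char buffer)))))
```

A caller would create the buffer once, for example with (make-array 256 :element-type 'character :adjustable t :fill-pointer 0), and must copy any field it wants to keep before the next call overwrites it.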
From: Juho S. <js...@ik...> - 2007-12-04 16:15:20
|
"David J. Neu" <dav...@gm...> writes: > However, on the third attempt to run it I get > > System call error 12 (Cannot allocate memory) > [Condition of type SB-POSIX:SYSCALL-ERROR] > > is there a way around this? Hmm... looks like an off-by-4096, the last mmaped page was never freed -> address space becomes fragmented. I wonder why it didn't fail on Linux. Maybe the following would help? (defvar +page-size+ (sb-posix:getpagesize)) (defun mmap-as-base-string (stream) (declare (optimize debug) (notinline sb-posix::mmap)) (with-open-file (devnull "/dev/null") (let* ((file-length (file-length stream)) (map-length (+ file-length +page-size+)) (sap1 (sb-posix:mmap nil map-length (logior sb-posix:prot-read sb-posix:prot-write) sb-posix:map-private (sb-impl::fd-stream-fd stream) 0)) (sap2 (sb-posix:mmap (sb-sys:sap+ sap1 +page-size+) file-length sb-posix:prot-read (logior sb-posix:map-private sb-posix:map-fixed) (sb-impl::fd-stream-fd stream) 0)) (handle (make-mmap-handle :address sap1 :length map-length))) ;; simple-base-string header word (setf (sb-sys:sap-ref-word sap2 (- (* 2 sb-vm:n-word-bytes))) sb-vm:simple-base-string-widetag) ;; simple-base-string length word (as fixnum) (setf (sb-sys:sap-ref-word sap2 (- (* 1 sb-vm:n-word-bytes))) (ash file-length sb-vm:n-fixnum-tag-bits)) (setf (mmap-handle-string handle) (sb-kernel:%make-lisp-obj (logior sb-vm:other-pointer-lowtag (- (sb-sys:sap-int sap2) (* 2 sb-vm:n-word-bytes))))) handle))) > > Ok. May I ask why you're interested in this? > Sure, for two reasons: > > 1. We have a production app that processes large quantities of texual > and binary data. As part of textual side, we're trying to tune some > learning algorithms, and want to work in CL. It's not obvious to me that Gary's benchmark accurately models most processes I've seen that process large quantities of text data. Usually they for example don't hang on to all of the data for the duration of the whole process. 
> > Reading text files and manipulating strings are really the bread and > > butter features of scripting languages, and will have been optimized > > accordingly. So it's somewhat unreasonable to expect SBCL to be > > exactly as fast in the generic case. > OK, but: > > LispWorks performs quite nicely compare to the Python program's > 4.246053 seconds: > > LispWorks against a text file w/out unicode using original code > as posted by Gary: > User time = 6.106 That's still basically crappy performance. Obviously not nearly as crappy as SBCL, but it's still 2x slower than an idiomatically written Python program would be. > > http://thread.gmane.org/gmane.lisp.steel-bank.devel/9879/focus=9883 > > unfortunately, after doing > (define-alien-variable gencgc-oldest-gen-to-gc char) > (setf gencgc-oldest-gen-to-gc 0) > at the top level, the program drops into the ldb before > completing even one run. Sure. You're trying to stuff 230*4 (unicode)*2 (copying gc)+a bit (non-strings) of data into 1500MB, which can reasonably be expected to fail. > My question to the sbcl developers is: Is there a solution to reading > a plain (non-unicode) text file into a data structure, that can be > incorporated into sbcl, maybe something along the lines of what Attila > provided I'm going to be committing a nicer implementation of READ-LINE once I get some machine with my ssl keys back on the net. This is basically just fixing the issues that the current implementation has with long lines (this reduces the amount of time for reading in the input file from 5.5s to 3.5s). I don't think doing any base-string specific handling in READ-LINE would make much sense as long as the fd-stream character input buffer is implemented as a (SIMPLE-ARRAY CHARACTER). The simple solution would be to just wrap the mmap hack into a couple of gray streams for base-char or (unsigned-byte 8) input. Should be trivial, it's only one more data copy than that direct solution, but it would hide all the ugliness. 
I would rather see this in a user library though, since once fu-streams have been implemented (ha, ha), those gray streams would basically be obsolete. > that will result in speed and memory usage on par with Python Unlikely to ever happen. > or LispWorks, Somewhat improbable in the near future on that exact program out of the box. With slight rewriting and using a gray stream like above, sure, right now. > or in the best case Juho's program? That program itself is obviously not going to go into SBCL :-) I'm not wild about adding support for mmaping stuff as lisp arrays in SBCL either, though it has been proposed before. -- Juho Snellman |
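The back-of-the-envelope estimate above (230*4*2 against a 1500MB dynamic space) can be written out as a tiny calculation. The function name is hypothetical, and reading the 230 as the size of the input text in megabytes is an assumption; the "+a bit" for non-string data is left out.

```lisp
;; Hypothetical helper; all figures are in megabytes.
;; TEXT-MB        size of the text held in memory (assumed: the 230 above)
;; BYTES-PER-CHAR 4 for full CHARACTER strings, 1 for base-strings
;; GC-COPIES      2 because a copying collector needs from- and to-space
(defun estimated-heap-need (text-mb &key (bytes-per-char 4) (gc-copies 2))
  (* text-mb bytes-per-char gc-copies))

(estimated-heap-need 230)                   ; => 1840, more than 1500
(estimated-heap-need 230 :bytes-per-char 1) ; => 460
```

The second figure would be consistent with the earlier observation that the base-string variant can be run repeatedly without memory trouble.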
From: David J. N. <dav...@gm...> - 2007-12-04 18:09:17
|
Hi Juho,

On Tue, Dec 04, 2007 at 06:15:40PM +0200, Juho Snellman wrote:
> "David J. Neu" <dav...@gm...> writes:
> > However, on the third attempt to run it I get
> >
> > System call error 12 (Cannot allocate memory)
> >    [Condition of type SB-POSIX:SYSCALL-ERROR]
> >
> > is there a way around this?
>
> Hmm... looks like an off-by-4096, the last mmaped page was never freed
> -> address space becomes fragmented. I wonder why it didn't fail on
> Linux. Maybe the following would help?
>

Yep, that fixed it - BTW, I'm running FreeBSD 6.2.

> It's not obvious to me that Gary's benchmark accurately models most
> processes I've seen that process large quantities of text
> data. Usually they for example don't hang on to all of the data for
> the duration of the whole process.

I can't say for sure, but it's sure fast when running experiments, or
for keeping a big lookup table in a production app.

> I'm going to be committing a nicer implementation of READ-LINE once I
> get some machine with my ssl keys back on the net. This is basically
> just fixing the issues that the current implementation has with long
> lines (this reduces the amount of time for reading in the input file
> from 5.5s to 3.5s).

Great - thanks!

> I don't think doing any base-string specific handling in READ-LINE
> would make much sense as long as the fd-stream character input buffer
> is implemented as a (SIMPLE-ARRAY CHARACTER).

Hmmm, does that imply that there's no way (using classic functions
like read-line) to pull in a plain old ASCII text file w/out ending up
with the unicode bloat issue?

> The simple solution would be to just wrap the mmap hack into a couple
> of gray streams for base-char or (unsigned-byte 8) input. Should be
> trivial, it's only one more data copy than that direct solution, but
> it would hide all the ugliness. I would rather see this in a user
> library though, since once fu-streams have been implemented (ha, ha),
> those gray streams would basically be obsolete.

Seems like a wonderfully powerful tool to have available.

> > or LispWorks,
>
> Somewhat improbable in the near future on that exact program out of
> the box. With slight rewriting and using a gray stream like above,
> sure, right now.

Can I ask why?

> > or in the best case Juho's program?
>
> That program itself is obviously not going to go into SBCL :-) I'm not
> wild about adding support for mmaping stuff as lisp arrays in SBCL
> either, though it has been proposed before.
>

Again, seems like a wonderfully powerful tool.

Bottom line, I understand that SBCL can't beat every language on every
benchmark, but on a task as common as reading a text file, it seems
that there'd be great benefit in:

- the mods to read-line that you suggested, as well as a way to get
  plain vanilla ascii strings

- the mmaping stuff

- something like Franz has implemented
  http://franz.com/support/tech_corner/cons-tricks-121306.lhtml

Many thanks for your help, we use SBCL here every day in production,
and truly value the improvements that are being made.

Cheers,
David
|
From: Gary K. <gw...@me...> - 2007-12-09 19:21:23
|
Hi Nathan, Can you say more or point me towards a reference for how Python does handle strings? Why can't Python be using destructive operations? thanks, On Nov 21, 2007, at 4:22 PM, Nathan Froyd wrote: >> 1. (Common) Lisp doesn't include any convenient destructive string >> operations or even a buffered read-line approach (Franz has one now >> but it's not standard). Having these around would, I think, be a good >> thing. > > The Python code you posted earlier doesn't use destructive string > operations--indeed, it can't, due to the nature of Python strings. -- Gary Warren King, metabang.com Cell: (413) 559 8738 Fax: (206) 338-4052 gwkkwg on Skype * garethsan on AIM |
From: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX - 2007-12-09 21:49:55
|
On Dec 9, 2007 8:21 PM, Gary King <gw...@me...> wrote: > Hi Nathan, > > Can you say more or point me towards a reference for how Python does > handle strings? Why can't Python be using destructive operations? Because its strings are by definition immutable. bye, Erik. > > thanks, > > On Nov 21, 2007, at 4:22 PM, Nathan Froyd wrote: > > >> 1. (Common) Lisp doesn't include any convenient destructive string > >> operations or even a buffered read-line approach (Franz has one now > >> but it's not standard). Having these around would, I think, be a good > >> thing. > > > > The Python code you posted earlier doesn't use destructive string > > operations--indeed, it can't, due to the nature of Python strings. > > -- > Gary Warren King, metabang.com > Cell: (413) 559 8738 > Fax: (206) 338-4052 > gwkkwg on Skype * garethsan on AIM |
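To make the immutability point concrete, here is a minimal sketch (not taken from the benchmark code discussed in this thread). The function names are hypothetical and chosen only for illustration. It shows that assigning into a Python str fails, that string "modification" therefore always builds a new object, and that a mutable bytearray is the closest built-in if one does want destructive updates.

```python
def try_in_place_upcase(s: str) -> str:
    """Attempt a destructive update on a str; report what happens."""
    try:
        s[0] = s[0].upper()  # str does not support item assignment
    except TypeError as exc:
        return f"TypeError: {exc}"
    return s


def upcase_first_copy(s: str) -> str:
    """Non-destructive version: builds and returns a new string."""
    return s[:1].upper() + s[1:]


def upcase_first_in_place(buf: bytearray) -> None:
    """Destructive version on a mutable byte buffer (ASCII data assumed)."""
    if buf:
        buf[0:1] = buf[0:1].upper()


if __name__ == "__main__":
    original = "hello"
    print(try_in_place_upcase(original))
    print(upcase_first_copy(original), original)

    buf = bytearray(b"hello")
    upcase_first_in_place(buf)
    print(buf)
```

The bytearray variant is roughly the Python counterpart of working on an (UNSIGNED-BYTE 8) array rather than a string, which is the trade-off raised earlier in the thread.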