From: Pete K. <pet...@ka...> - 2005-01-22 02:53:27
|
[disclaimer: I'm still a lisp newbie, please be gentle.] I am interested in continuing my voyage into CL and as part of the process I am in a position to slowly introduce it at my workplace. The first project that I decided to undertake was replacing some log file processing utilities that had been written in Python. I was especially excited to discover that SBCL and cl-ppcre were able to process the logs faster than my python program without actually trying to optimize either programs and while keeping them very simple and easy to understand. The only problem is that READ-LINE in dreadfully slow. It takes about 16s to simply read my log file in SBCL, whereas a simple python program could do the same in about 1s. So although Lisp can process the data faster, getting it the data is another story altogether. After chatting with some folks on IRC, they suggested various ways of improving the speed ranging from doing one large read, switching CL implementations, and such. I did test the speed of the other implementations [1] and although they were faster, compared to my original python program, they are still very slow. I'd really prefer to use READ-LINE versus trying to hand optimize this function as I believe the underlying implementation should be doing the optimization in this case, and I won't have much luck convincing others at work to use lisp if I have to go through hoops for this one utility. That leads me to my question, are there some settings that I am not aware of to tune SBCL and the IO subsystem? Here is my fn: (defun tester (file) (declare (optimize (speed 3) (safety 0) (debug 0))) (with-open-file (in file) (do ((line (read-line in nil :eof) (read-line in nil :eof))) ((eql line :eof) t) ;; do something here ))) I ran strace on both the python program and SBCL to see how many read(2)s were issued and how much each read was gathering (8k chunks for python and 4k chunks for SBCL). At that time I really had no idea where the bottleneck was so I wanted to make sure SBCL was at least using buffered streams and not calling read(2) for every character. This path of reasoning did not yield any insight into the large discrepancy in speeds, so I googled for SBCL profiling and discovered sb-sprof. I'm not quite sure how to interpret the full results [2], but here is the top list of functions consuming most of the time. Does this look normal? Could anyone help interpret the results? Is there some magic switch I can flip to improve my performance? I am using SBCL from CVS as of a day or so ago without unicode support running on a debian 2.4 box. Self Cumul Total Nr Count % Count % Count % Function ------------------------------------------------------------------------ 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET 2 289 20.1 1084 75.2 689 47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 3 241 16.7 324 22.5 930 64.5 ARRAY-DIMENSION 4 220 15.3 1362 94.5 1150 79.8 "hairy arg processor for top level local call READ-LINE" 5 147 10.2 147 10.2 1297 90.0 ARRAY-RANK 6 60 4.2 60 4.2 1357 94.2 "no debug information for frame" 7 37 2.6 37 2.6 1394 96.7 SB-KERNEL:BIT-BASH-COPY In summary, I'm loving my voyage into the Lisp world, but perhaps I'm trying to use it for the wrong application in this case (large amounts of log file processing). Am I trying to make a square peg fit into a circle? Or is there something that I'm missing (very likely)? Thanks for reading this far! Pete [1] Results of my crude test running on various CL implementations. SBCL: CL-USER> (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Evaluation took: 15.455 seconds of real time 13.17 seconds of user run time 0.4 seconds of system run time 0 page faults and 87,576,968 bytes consed. CMUCL: * (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Evaluation took: 9.96 seconds of real time 6.98 seconds of user run time 0.38 seconds of system run time 6,955,855,634 CPU cycles [Run times include 0.53 seconds GC run time] 0 page faults and 92,049,536 bytes consed. Clisp: [5]> (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Real time: 7.922794 sec. Run time: 7.51 sec. Space: 134691576 Bytes GC: 257, GC time: 0.52 sec. [2] Full output of sb-sprof:report. Number of samples: 1441 Sample interval: 0.01 seconds Total sampling time: 14.41 seconds Number of cycles: 3 Cycle Count % Parts ------------------------------------------------------------------------ 0 0.0 <Cycle 1>... 0 0.0 "#'(LAMBDA NIL (LET # #))" [35] 0 0.0 SWANK::HANDLE-REQUEST [37] 0 0.0 SWANK::CALL-WITH-REDIRECTED-IO [36] ------------------------------------------------------------------------ 7 0.5 <Cycle 2>... 7 0.5 SB-KERNEL:SUB-GC [11] ------------------------------------------------------------------------ 289 20.1 <Cycle 3>... 289 20.1 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] ------------------------------------------------------------------------ Callers Cumul. Function Count % Count % Callees ------------------------------------------------------------------------ 3 0.2 SB-THREAD:CONDITION-WAIT [32] 3 0.2 3 0.2 "FLET #:CLEANUP-FUN-397" [12] ------------------------------------------------------------------------ 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) [33] 0 0.0 3 0.2 SB-THREAD:CONDITION-WAIT [32] 3 0.2 "FLET #:CLEANUP-FUN-397" [12] ------------------------------------------------------------------------ 3 0.2 SWANK::READ-FROM-EMACS [34] 0 0.0 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) [33] 3 0.2 SB-THREAD:CONDITION-WAIT [32] ------------------------------------------------------------------------ 2 0.1 SWANK::HANDLE-REQUEST [37] 1 0.1 "#'(LAMBDA NIL (LET # #))" [35] 0 0.0 3 0.2 SWANK::READ-FROM-EMACS [34] 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) [33] ------------------------------------------------------------------------ 0 0.0 0 0.0 "<Cycle 1>" [1] 1 0.1 SWANK::READ-FROM-EMACS [34] 2 0.1 SWANK::READ-FROM-EMACS [34] ------------------------------------------------------------------------ 1 0.1 SB-KERNEL:SUB-GC [11] 1 0.1 1 0.1 "FLET #:CLEANUP-FUN-219" [14] ------------------------------------------------------------------------ 7 0.5 0 0.0 "<Cycle 2>" [2] 1 0.1 "FLET #:CLEANUP-FUN-219" [14] ------------------------------------------------------------------------ 12 0.8 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] 12 0.8 12 0.8 "foreign function __read" [9] ------------------------------------------------------------------------ 14 1.0 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 2 0.1 14 1.0 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] 12 0.8 "foreign function __read" [9] ------------------------------------------------------------------------ 60 4.2 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] 75 5.2 ARRAY-DIMENSION [3] 12 0.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 147 10.2 147 10.2 ARRAY-RANK [5] ------------------------------------------------------------------------ 8 0.6 ARRAY-DIMENSION [3] 445 30.9 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 7 0.5 SB-INT:FAST-READ-CHAR-REFILL [15] 400 27.8 460 31.9 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] 60 4.2 ARRAY-RANK [5] ------------------------------------------------------------------------ 324 22.5 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 241 16.7 324 22.5 ARRAY-DIMENSION [3] 75 5.2 ARRAY-RANK [5] 8 0.6 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] ------------------------------------------------------------------------ 289 20.1 0 0.0 "<Cycle 3>" [3] 14 1.0 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] 12 0.8 ARRAY-RANK [5] 324 22.5 ARRAY-DIMENSION [3] 445 30.9 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] ------------------------------------------------------------------------ 60 4.2 TESTER [21] 60 4.2 60 4.2 "no debug information for frame" [6] ------------------------------------------------------------------------ 37 2.6 "hairy arg processor for top level local call READ-LINE" [4] 37 2.6 37 2.6 SB-KERNEL:BIT-BASH-COPY [7] ------------------------------------------------------------------------ 13 0.9 "hairy arg processor for top level local call READ-LINE" [4] 13 0.9 13 0.9 SB-KERNEL:SHRINK-VECTOR [8] ------------------------------------------------------------------------ 1092 75.8 "hairy arg processor for top level local call READ-LINE" [4] 1 0.1 1092 75.8 SB-INT:FAST-READ-CHAR-REFILL [15] 7 0.5 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] 1084 75.2 "<Cycle 3>" [3] ------------------------------------------------------------------------ 1362 94.5 TESTER [21] 220 15.3 1362 94.5 "hairy arg processor for top level local call READ-LINE" [4] 37 2.6 SB-KERNEL:BIT-BASH-COPY [7] 13 0.9 SB-KERNEL:SHRINK-VECTOR [8] 1092 75.8 SB-INT:FAST-READ-CHAR-REFILL [15] ------------------------------------------------------------------------ 1422 98.7 SB-IMPL::%TIME [22] 0 0.0 1422 98.7 TESTER [21] 60 4.2 "no debug information for frame" [6] 1362 94.5 "hairy arg processor for top level local call READ-LINE" [4] ------------------------------------------------------------------------ 1422 98.7 SB-INT:EVAL-IN-LEXENV [23] 0 0.0 1422 98.7 SB-IMPL::%TIME [22] 1422 98.7 TESTER [21] ------------------------------------------------------------------------ 1267 87.9 "hairy arg processor for SWANK::EVAL-REGION" [24] 0 0.0 1422 98.7 SB-INT:EVAL-IN-LEXENV [23] 1422 98.7 SB-IMPL::%TIME [22] ------------------------------------------------------------------------ 622 43.2 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] 0 0.0 1267 87.9 "hairy arg processor for SWANK::EVAL-REGION" [24] 1267 87.9 SB-INT:EVAL-IN-LEXENV [23] ------------------------------------------------------------------------ 331 23.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] 0 0.0 622 43.2 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] 622 43.2 "hairy arg processor for SWANK::EVAL-REGION" [24] ------------------------------------------------------------------------ 280 19.4 SWANK:LISTENER-EVAL [27] 0 0.0 331 23.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] 331 23.0 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] ------------------------------------------------------------------------ 0 0.0 280 19.4 SWANK:LISTENER-EVAL [27] 280 19.4 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] ------------------------------------------------------------------------ 2 0.1 SB-SPROF::SIGPROF-HANDLER [30] 0 0.0 2 0.1 "FLET #:CLEANUP-FUN-2014" [29] 2 0.1 "<Cycle 2>" [2] ------------------------------------------------------------------------ 1 0.1 "foreign function call_into_lisp" [16] 0 0.0 2 0.1 SB-SPROF::SIGPROF-HANDLER [30] 2 0.1 "FLET #:CLEANUP-FUN-2014" [29] ------------------------------------------------------------------------ 1 0.1 "foreign function funcall3" [31] 14 1.0 "foreign function funcall0" [17] 0 0.0 15 1.0 "foreign function call_into_lisp" [16] 1 0.1 SB-SPROF::SIGPROF-HANDLER [30] 14 1.0 "<Cycle 2>" [2] ------------------------------------------------------------------------ 1 0.1 "foreign function interrupt_handle_now" [28] 0 0.0 1 0.1 "foreign function funcall3" [31] 1 0.1 "foreign function call_into_lisp" [16] ------------------------------------------------------------------------ 8 0.6 "foreign function interrupt_handle_now" [28] 8 0.6 8 0.6 "foreign function sigprocmask" [10] ------------------------------------------------------------------------ 8 0.6 "foreign function interrupt_handle_pending" [19] 0 0.0 9 0.6 "foreign function interrupt_handle_now" [28] 1 0.1 "foreign function funcall3" [31] 8 0.6 "foreign function sigprocmask" [10] ------------------------------------------------------------------------ 7 0.5 "foreign function interrupt_maybe_gc_int" [18] 0 0.0 14 1.0 "foreign function funcall0" [17] 14 1.0 "foreign function call_into_lisp" [16] ------------------------------------------------------------------------ 7 0.5 "foreign function interrupt_handle_pending" [19] 0 0.0 7 0.5 "foreign function interrupt_maybe_gc_int" [18] 7 0.5 "foreign function funcall0" [17] ------------------------------------------------------------------------ 15 1.0 "foreign function sigtrap_handler" [20] 0 0.0 15 1.0 "foreign function interrupt_handle_pending" [19] 8 0.6 "foreign function interrupt_handle_now" [28] 7 0.5 "foreign function interrupt_maybe_gc_int" [18] ------------------------------------------------------------------------ 0 0.0 15 1.0 "foreign function sigtrap_handler" [20] 15 1.0 "foreign function interrupt_handle_pending" [19] ------------------------------------------------------------------------ Self Cumul Total Nr Count % Count % Count % Function ------------------------------------------------------------------------ 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET 2 289 20.1 1084 75.2 689 47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 3 241 16.7 324 22.5 930 64.5 ARRAY-DIMENSION 4 220 15.3 1362 94.5 1150 79.8 "hairy arg processor for top level local call READ-LINE" 5 147 10.2 147 10.2 1297 90.0 ARRAY-RANK 6 60 4.2 60 4.2 1357 94.2 "no debug information for frame" 7 37 2.6 37 2.6 1394 96.7 SB-KERNEL:BIT-BASH-COPY 8 13 0.9 13 0.9 1407 97.6 SB-KERNEL:SHRINK-VECTOR 9 12 0.8 12 0.8 1419 98.5 "foreign function __read" 10 8 0.6 8 0.6 1427 99.0 "foreign function sigprocmask" 11 7 0.5 8 0.6 1434 99.5 SB-KERNEL:SUB-GC 12 3 0.2 3 0.2 1437 99.7 "FLET #:CLEANUP-FUN-397" 13 2 0.1 14 1.0 1439 99.9 SB-IMPL::REFILL-FD-STREAM-BUFFER 14 1 0.1 1 0.1 1440 99.9 "FLET #:CLEANUP-FUN-219" 15 1 0.1 1092 75.8 1441 100.0 SB-INT:FAST-READ-CHAR-REFILL 16 0 0.0 15 1.0 1441 100.0 "foreign function call_into_lisp" 17 0 0.0 14 1.0 1441 100.0 "foreign function funcall0" 18 0 0.0 7 0.5 1441 100.0 "foreign function interrupt_maybe_gc_int" 19 0 0.0 15 1.0 1441 100.0 "foreign function interrupt_handle_pending" 20 0 0.0 15 1.0 1441 100.0 "foreign function sigtrap_handler" 21 0 0.0 1422 98.7 1441 100.0 TESTER 22 0 0.0 1422 98.7 1441 100.0 SB-IMPL::%TIME 23 0 0.0 1422 98.7 1441 100.0 SB-INT:EVAL-IN-LEXENV 24 0 0.0 1267 87.9 1441 100.0 "hairy arg processor for SWANK::EVAL-REGION" 25 0 0.0 622 43.2 1441 100.0 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" 26 0 0.0 331 23.0 1441 100.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) 27 0 0.0 280 19.4 1441 100.0 SWANK:LISTENER-EVAL 28 0 0.0 9 0.6 1441 100.0 "foreign function interrupt_handle_now" 29 0 0.0 2 0.1 1441 100.0 "FLET #:CLEANUP-FUN-2014" 30 0 0.0 2 0.1 1441 100.0 SB-SPROF::SIGPROF-HANDLER 31 0 0.0 1 0.1 1441 100.0 "foreign function funcall3" 32 0 0.0 3 0.2 1441 100.0 SB-THREAD:CONDITION-WAIT 33 0 0.0 3 0.2 1441 100.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) 34 0 0.0 3 0.2 1441 100.0 SWANK::READ-FROM-EMACS 35 0 0.0 3 0.2 1441 100.0 "#'(LAMBDA NIL (LET # #))" 36 0 0.0 3 0.2 1441 100.0 SWANK::CALL-WITH-REDIRECTED-IO 37 0 0.0 3 0.2 1441 100.0 SWANK::HANDLE-REQUEST ------------------------------------------------------------------------ 0 0.0 elsewhere |
From: Peter D. <pet...@ni...> - 2005-01-22 17:03:15
|
Hi, I'm not an expert in the internals of SBCL, so I can't comment on most of this, but I see that you are running under SLIME. I have a LispWorks application that runs twice as fast standalone as it does under SLIME. It might be instructive to take that variable out of the measurement. On Friday 21 January 2005 21:41, Pete Kazmier wrote: > [disclaimer: I'm still a lisp newbie, please be gentle.] > > I am interested in continuing my voyage into CL and as part of the > process I am in a position to slowly introduce it at my workplace. > The first project that I decided to undertake was replacing some log > file processing utilities that had been written in Python. > > I was especially excited to discover that SBCL and cl-ppcre were able > to process the logs faster than my python program without actually > trying to optimize either programs and while keeping them very simple > and easy to understand. The only problem is that READ-LINE in > dreadfully slow. It takes about 16s to simply read my log file in > SBCL, whereas a simple python program could do the same in about 1s. > So although Lisp can process the data faster, getting it the data is > another story altogether. > > After chatting with some folks on IRC, they suggested various ways of > improving the speed ranging from doing one large read, switching CL > implementations, and such. I did test the speed of the other > implementations [1] and although they were faster, compared to my > original python program, they are still very slow. > > I'd really prefer to use READ-LINE versus trying to hand optimize this > function as I believe the underlying implementation should be doing > the optimization in this case, and I won't have much luck convincing > others at work to use lisp if I have to go through hoops for this one > utility. That leads me to my question, are there some settings that I > am not aware of to tune SBCL and the IO subsystem? > > Here is my fn: > > (defun tester (file) > (declare (optimize (speed 3) (safety 0) (debug 0))) > (with-open-file (in file) > (do ((line (read-line in nil :eof) (read-line in nil :eof))) > ((eql line :eof) t) > ;; do something here > ))) > > I ran strace on both the python program and SBCL to see how many > read(2)s were issued and how much each read was gathering (8k chunks > for python and 4k chunks for SBCL). At that time I really had no idea > where the bottleneck was so I wanted to make sure SBCL was at least > using buffered streams and not calling read(2) for every character. > > This path of reasoning did not yield any insight into the large > discrepancy in speeds, so I googled for SBCL profiling and discovered > sb-sprof. I'm not quite sure how to interpret the full results [2], > but here is the top list of functions consuming most of the time. > Does this look normal? Could anyone help interpret the results? Is > there some magic switch I can flip to improve my performance? I am > using SBCL from CVS as of a day or so ago without unicode support > running on a debian 2.4 box. > > > Self Cumul Total > Nr Count % Count % Count % Function > ------------------------------------------------------------------------ > 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET > 2 289 20.1 1084 75.2 689 47.8 > SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 3 241 16.7 324 22.5 > 930 64.5 ARRAY-DIMENSION > 4 220 15.3 1362 94.5 1150 79.8 "hairy arg processor for top > level local call READ-LINE" 5 147 10.2 147 10.2 1297 90.0 > ARRAY-RANK > 6 60 4.2 60 4.2 1357 94.2 "no debug information for > frame" 7 37 2.6 37 2.6 1394 96.7 SB-KERNEL:BIT-BASH-COPY > > > In summary, I'm loving my voyage into the Lisp world, but perhaps I'm > trying to use it for the wrong application in this case (large amounts > of log file processing). Am I trying to make a square peg fit into a > circle? Or is there something that I'm missing (very likely)? > > Thanks for reading this far! > Pete > > > [1] Results of my crude test running on various CL implementations. > > SBCL: > CL-USER> (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) > Evaluation took: > 15.455 seconds of real time > 13.17 seconds of user run time > 0.4 seconds of system run time > 0 page faults and > 87,576,968 bytes consed. > > CMUCL: > * (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) > Evaluation took: > 9.96 seconds of real time > 6.98 seconds of user run time > 0.38 seconds of system run time > 6,955,855,634 CPU cycles > [Run times include 0.53 seconds GC run time] > 0 page faults and > 92,049,536 bytes consed. > > Clisp: > [5]> (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) > Real time: 7.922794 sec. > Run time: 7.51 sec. > Space: 134691576 Bytes > GC: 257, GC time: 0.52 sec. > > > [2] Full output of sb-sprof:report. > > Number of samples: 1441 > Sample interval: 0.01 seconds > Total sampling time: 14.41 seconds > Number of cycles: 3 > > Cycle > Count % Parts > ------------------------------------------------------------------------ > 0 0.0 <Cycle 1>... > 0 0.0 "#'(LAMBDA NIL (LET # #))" [35] > 0 0.0 SWANK::HANDLE-REQUEST [37] > 0 0.0 SWANK::CALL-WITH-REDIRECTED-IO [36] > ------------------------------------------------------------------------ > 7 0.5 <Cycle 2>... > 7 0.5 SB-KERNEL:SUB-GC [11] > ------------------------------------------------------------------------ > 289 20.1 <Cycle 3>... > 289 20.1 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII > [2] > ------------------------------------------------------------------------ > > > Callers > Cumul. Function > Count % Count % Callees > ------------------------------------------------------------------------ > 3 0.2 SB-THREAD:CONDITION-WAIT [32] > 3 0.2 3 0.2 "FLET #:CLEANUP-FUN-397" [12] > ------------------------------------------------------------------------ > 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE > NIL) [33] 0 0.0 3 0.2 SB-THREAD:CONDITION-WAIT [32] > 3 0.2 "FLET #:CLEANUP-FUN-397" [12] > ------------------------------------------------------------------------ > 3 0.2 SWANK::READ-FROM-EMACS [34] > 0 0.0 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) > [33] 3 0.2 SB-THREAD:CONDITION-WAIT [32] > ------------------------------------------------------------------------ > 2 0.1 SWANK::HANDLE-REQUEST [37] > 1 0.1 "#'(LAMBDA NIL (LET # #))" [35] > 0 0.0 3 0.2 SWANK::READ-FROM-EMACS [34] > 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE > NIL) [33] > ------------------------------------------------------------------------ 0 > 0.0 0 0.0 "<Cycle 1>" [1] > 1 0.1 SWANK::READ-FROM-EMACS [34] > 2 0.1 SWANK::READ-FROM-EMACS [34] > ------------------------------------------------------------------------ > 1 0.1 SB-KERNEL:SUB-GC [11] > 1 0.1 1 0.1 "FLET #:CLEANUP-FUN-219" [14] > ------------------------------------------------------------------------ > 7 0.5 0 0.0 "<Cycle 2>" [2] > 1 0.1 "FLET #:CLEANUP-FUN-219" [14] > ------------------------------------------------------------------------ > 12 0.8 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] > 12 0.8 12 0.8 "foreign function __read" [9] > ------------------------------------------------------------------------ > 14 1.0 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII > [2] 2 0.1 14 1.0 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] > 12 0.8 "foreign function __read" [9] > ------------------------------------------------------------------------ > 60 4.2 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] > 75 5.2 ARRAY-DIMENSION [3] > 12 0.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII > [2] 147 10.2 147 10.2 ARRAY-RANK [5] > ------------------------------------------------------------------------ > 8 0.6 ARRAY-DIMENSION [3] > 445 30.9 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII > [2] 7 0.5 SB-INT:FAST-READ-CHAR-REFILL [15] > 400 27.8 460 31.9 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] > 60 4.2 ARRAY-RANK [5] > ------------------------------------------------------------------------ > 324 22.5 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII > [2] 241 16.7 324 22.5 ARRAY-DIMENSION [3] > 75 5.2 ARRAY-RANK [5] > 8 0.6 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] > ------------------------------------------------------------------------ > 289 20.1 0 0.0 "<Cycle 3>" [3] > 14 1.0 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] > 12 0.8 ARRAY-RANK [5] > 324 22.5 ARRAY-DIMENSION [3] > 445 30.9 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] > ------------------------------------------------------------------------ > 60 4.2 TESTER [21] > 60 4.2 60 4.2 "no debug information for frame" [6] > ------------------------------------------------------------------------ > 37 2.6 "hairy arg processor for top level local > call READ-LINE" [4] 37 2.6 37 2.6 SB-KERNEL:BIT-BASH-COPY [7] > ------------------------------------------------------------------------ > 13 0.9 "hairy arg processor for top level local > call READ-LINE" [4] 13 0.9 13 0.9 SB-KERNEL:SHRINK-VECTOR [8] > ------------------------------------------------------------------------ > 1092 75.8 "hairy arg processor for top level local > call READ-LINE" [4] 1 0.1 1092 75.8 SB-INT:FAST-READ-CHAR-REFILL > [15] > 7 0.5 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] > 1084 75.2 "<Cycle 3>" [3] > ------------------------------------------------------------------------ > 1362 94.5 TESTER [21] > 220 15.3 1362 94.5 "hairy arg processor for top level local call > READ-LINE" [4] 37 2.6 SB-KERNEL:BIT-BASH-COPY [7] > 13 0.9 SB-KERNEL:SHRINK-VECTOR [8] > 1092 75.8 SB-INT:FAST-READ-CHAR-REFILL [15] > ------------------------------------------------------------------------ > 1422 98.7 SB-IMPL::%TIME [22] > 0 0.0 1422 98.7 TESTER [21] > 60 4.2 "no debug information for frame" [6] > 1362 94.5 "hairy arg processor for top level local > call READ-LINE" [4] > ------------------------------------------------------------------------ > 1422 98.7 SB-INT:EVAL-IN-LEXENV [23] > 0 0.0 1422 98.7 SB-IMPL::%TIME [22] > 1422 98.7 TESTER [21] > ------------------------------------------------------------------------ > 1267 87.9 "hairy arg processor for SWANK::EVAL-REGION" > [24] 0 0.0 1422 98.7 SB-INT:EVAL-IN-LEXENV [23] > 1422 98.7 SB-IMPL::%TIME [22] > ------------------------------------------------------------------------ > 622 43.2 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # > ...))" [25] 0 0.0 1267 87.9 "hairy arg processor for > SWANK::EVAL-REGION" [24] 1267 87.9 SB-INT:EVAL-IN-LEXENV > [23] > ------------------------------------------------------------------------ > 331 23.0 (SB-PCL::FAST-METHOD > SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] 0 0.0 622 43.2 > "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] 622 43.2 > "hairy arg processor for SWANK::EVAL-REGION" [24] > ------------------------------------------------------------------------ > 280 19.4 SWANK:LISTENER-EVAL [27] > 0 0.0 331 23.0 (SB-PCL::FAST-METHOD > SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] 331 23.0 > "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] > ------------------------------------------------------------------------ 0 > 0.0 280 19.4 SWANK:LISTENER-EVAL [27] > 280 19.4 (SB-PCL::FAST-METHOD > SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] > ------------------------------------------------------------------------ 2 > 0.1 SB-SPROF::SIGPROF-HANDLER [30] > 0 0.0 2 0.1 "FLET #:CLEANUP-FUN-2014" [29] > 2 0.1 "<Cycle 2>" [2] > ------------------------------------------------------------------------ > 1 0.1 "foreign function call_into_lisp" [16] > 0 0.0 2 0.1 SB-SPROF::SIGPROF-HANDLER [30] > 2 0.1 "FLET #:CLEANUP-FUN-2014" [29] > ------------------------------------------------------------------------ > 1 0.1 "foreign function funcall3" [31] > 14 1.0 "foreign function funcall0" [17] > 0 0.0 15 1.0 "foreign function call_into_lisp" [16] > 1 0.1 SB-SPROF::SIGPROF-HANDLER [30] > 14 1.0 "<Cycle 2>" [2] > ------------------------------------------------------------------------ > 1 0.1 "foreign function interrupt_handle_now" [28] > 0 0.0 1 0.1 "foreign function funcall3" [31] > 1 0.1 "foreign function call_into_lisp" [16] > ------------------------------------------------------------------------ > 8 0.6 "foreign function interrupt_handle_now" [28] > 8 0.6 8 0.6 "foreign function sigprocmask" [10] > ------------------------------------------------------------------------ > 8 0.6 "foreign function interrupt_handle_pending" > [19] 0 0.0 9 0.6 "foreign function interrupt_handle_now" [28] 1 > 0.1 "foreign function funcall3" [31] > 8 0.6 "foreign function sigprocmask" [10] > ------------------------------------------------------------------------ > 7 0.5 "foreign function interrupt_maybe_gc_int" > [18] 0 0.0 14 1.0 "foreign function funcall0" [17] > 14 1.0 "foreign function call_into_lisp" [16] > ------------------------------------------------------------------------ > 7 0.5 "foreign function interrupt_handle_pending" > [19] 0 0.0 7 0.5 "foreign function interrupt_maybe_gc_int" [18] > 7 0.5 "foreign function funcall0" [17] > ------------------------------------------------------------------------ > 15 1.0 "foreign function sigtrap_handler" [20] > 0 0.0 15 1.0 "foreign function interrupt_handle_pending" > [19] 8 0.6 "foreign function interrupt_handle_now" [28] > 7 0.5 "foreign function interrupt_maybe_gc_int" [18] > ------------------------------------------------------------------------ 0 > 0.0 15 1.0 "foreign function sigtrap_handler" [20] 15 1.0 > "foreign function interrupt_handle_pending" [19] > ------------------------------------------------------------------------ > > > Self Cumul Total > Nr Count % Count % Count % Function > ------------------------------------------------------------------------ > 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET > 2 289 20.1 1084 75.2 689 47.8 > SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 3 241 16.7 324 22.5 > 930 64.5 ARRAY-DIMENSION > 4 220 15.3 1362 94.5 1150 79.8 "hairy arg processor for top > level local call READ-LINE" 5 147 10.2 147 10.2 1297 90.0 > ARRAY-RANK > 6 60 4.2 60 4.2 1357 94.2 "no debug information for > frame" 7 37 2.6 37 2.6 1394 96.7 SB-KERNEL:BIT-BASH-COPY 8 > 13 0.9 13 0.9 1407 97.6 SB-KERNEL:SHRINK-VECTOR 9 12 0.8 > 12 0.8 1419 98.5 "foreign function __read" 10 8 0.6 8 > 0.6 1427 99.0 "foreign function sigprocmask" 11 7 0.5 8 > 0.6 1434 99.5 SB-KERNEL:SUB-GC > 12 3 0.2 3 0.2 1437 99.7 "FLET #:CLEANUP-FUN-397" > 13 2 0.1 14 1.0 1439 99.9 > SB-IMPL::REFILL-FD-STREAM-BUFFER 14 1 0.1 1 0.1 1440 99.9 > "FLET #:CLEANUP-FUN-219" 15 1 0.1 1092 75.8 1441 100.0 > SB-INT:FAST-READ-CHAR-REFILL 16 0 0.0 15 1.0 1441 100.0 > "foreign function call_into_lisp" 17 0 0.0 14 1.0 1441 100.0 > "foreign function funcall0" 18 0 0.0 7 0.5 1441 100.0 > "foreign function interrupt_maybe_gc_int" 19 0 0.0 15 1.0 > 1441 100.0 "foreign function interrupt_handle_pending" 20 0 0.0 > 15 1.0 1441 100.0 "foreign function sigtrap_handler" 21 0 0.0 > 1422 98.7 1441 100.0 TESTER > 22 0 0.0 1422 98.7 1441 100.0 SB-IMPL::%TIME > 23 0 0.0 1422 98.7 1441 100.0 SB-INT:EVAL-IN-LEXENV > 24 0 0.0 1267 87.9 1441 100.0 "hairy arg processor for > SWANK::EVAL-REGION" 25 0 0.0 622 43.2 1441 100.0 "#'(LAMBDA > NIL (MULTIPLE-VALUE-BIND # # ...))" 26 0 0.0 331 23.0 1441 > 100.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) 27 > 0 0.0 280 19.4 1441 100.0 SWANK:LISTENER-EVAL > 28 0 0.0 9 0.6 1441 100.0 "foreign function > interrupt_handle_now" 29 0 0.0 2 0.1 1441 100.0 "FLET > #:CLEANUP-FUN-2014" 30 0 0.0 2 0.1 1441 100.0 > SB-SPROF::SIGPROF-HANDLER 31 0 0.0 1 0.1 1441 100.0 > "foreign function funcall3" 32 0 0.0 3 0.2 1441 100.0 > SB-THREAD:CONDITION-WAIT 33 0 0.0 3 0.2 1441 100.0 > (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) 34 0 0.0 3 > 0.2 1441 100.0 SWANK::READ-FROM-EMACS 35 0 0.0 3 0.2 1441 > 100.0 "#'(LAMBDA NIL (LET # #))" 36 0 0.0 3 0.2 1441 100.0 > SWANK::CALL-WITH-REDIRECTED-IO 37 0 0.0 3 0.2 1441 100.0 > SWANK::HANDLE-REQUEST > ------------------------------------------------------------------------ > 0 0.0 elsewhere > > > > ------------------------------------------------------- > This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting > Tool for open source databases. Create drag-&-drop reports. Save time > by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc. > Download a FREE copy at http://www.intelliview.com/go/osdn_nl > _______________________________________________ > Sbcl-help mailing list > Sbc...@li... > https://lists.sourceforge.net/lists/listinfo/sbcl-help -- Best Regards, - Peter Peter Denno National Institute of Standards and Technology, Manufacturing System Integration Division, 100 Bureau Drive, Mail Stop 8264 Tel: +1 301-975-3595 Gaithersburg, MD, USA 20899-8264 FAX: +1 301-975-4694 |
From: Pete K. <pet...@ka...> - 2005-01-26 02:00:39
|
Peter Denno <pet...@ni...> writes: > > I'm not an expert in the internals of SBCL, so I can't comment on > most of this, but I see that you are running under SLIME. I have a > LispWorks application that runs twice as fast standalone as it does > under SLIME. Thanks for the suggestion, although in this case, it did not have a significant impact on the times. Since my original post, I have tested reading the same log file in both Allegro and LW (per Edi's suggestion in a thread on the cl-ppcre list) to see how their READ-LINE speeds compare to the free CL implementations [1]. Both were faster of course, but LW was by far the fastest and on par with the equivalent python program. Of course, life would be much better if SBCLs READ-LINE was much faster. Any developers have any input on my previous post? Thanks, Pete [1] Speeds from LW and Allegro LispWorks: CL-USER 17 > (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Timing the evaluation of (TESTER "/tmp/pgw-logs/trapd.log.ovnyc00p") user time = 0.880 system time = 0.180 Elapsed time = 0:00:01 Allocation = 36822608 bytes standard / 4070 bytes conses 0 Page faults Calls to %EVAL 34 T Allegro: CL-USER(7): (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) ; cpu time (non-gc) 2,730 msec user, 90 msec system ; cpu time (gc) 80 msec user, 0 msec system ; cpu time (total) 2,810 msec user, 90 msec system ; real time 3,493 msec ; space allocation: ; 67 cons cells, 67,825,256 other bytes, 0 static bytes T |
From: Marco A. <ma...@cs...> - 2005-01-26 16:53:11
|
As an aside, on the issue of READ-LINE I would really like to have a READ-BUFFERED-LINE line-buffer &optional stream eof-error-p eof-value recursive-p that behaved like READ-LINE, but that would fill LINE-BUFFER (this being an adjustable character array with a fill pointer - the fill pointer being reset to 0 on each call) Using READ-SEQUENCE for the intended use does not quite cut it IMHO. Of course the performance issue should be dealt with at the implementation level. Cheers Marco On Jan 25, 2005, at 9:00 PM, Pete Kazmier wrote: > Peter Denno <pet...@ni...> writes: >> >> I'm not an expert in the internals of SBCL, so I can't comment on >> most of this, but I see that you are running under SLIME. I have a >> LispWorks application that runs twice as fast standalone as it does >> under SLIME. > > Thanks for the suggestion, although in this case, it did not have a > significant impact on the times. Since my original post, I have > tested reading the same log file in both Allegro and LW (per Edi's > suggestion in a thread on the cl-ppcre list) to see how their > READ-LINE speeds compare to the free CL implementations [1]. Both > were faster of course, but LW was by far the fastest and on par with > the equivalent python program. Of course, life would be much better > if SBCLs READ-LINE was much faster. Any developers have any input on > my previous post? > > Thanks, > Pete > > [1] Speeds from LW and Allegro > > LispWorks: > CL-USER 17 > (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) > Timing the evaluation of (TESTER "/tmp/pgw-logs/trapd.log.ovnyc00p") > > user time = 0.880 > system time = 0.180 > Elapsed time = 0:00:01 > Allocation = 36822608 bytes standard / 4070 bytes conses > 0 Page faults > Calls to %EVAL 34 > T > > Allegro: > CL-USER(7): (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) > ; cpu time (non-gc) 2,730 msec user, 90 msec system > ; cpu time (gc) 80 msec user, 0 msec system > ; cpu time (total) 2,810 msec user, 90 msec system > ; real time 3,493 msec > ; space allocation: > ; 67 cons cells, 67,825,256 other bytes, 0 static bytes > T > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting > Tool for open source databases. Create drag-&-drop reports. Save time > by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc. > Download a FREE copy at http://www.intelliview.com/go/osdn_nl > _______________________________________________ > Sbcl-help mailing list > Sbc...@li... > https://lists.sourceforge.net/lists/listinfo/sbcl-help > -- Marco Antoniotti http://bioinformatics.nyu.edu NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488 715 Broadway 10th FL fax. +1 - 212 - 998 3484 New York, NY, 10003, U.S.A. |
From: Pete K. <pet...@ka...> - 2005-01-26 16:24:31
|
Continuing my saga, Bulent Murtezaoglu sent me a link to a macro he wrote to speed things up in CMUCL in terms of reading [1]. I tested the macro under both CMUCL and SBCL. Under CMUCL, I was able to achieve very good read times [2] (still not as fast as LW, but faster than Allegro). I tried the macro under SBCL, but performance was worse than before. In terms of the little office bet I had going, technically I lost as there were no open source CL implementations that had decent READ-LINE performance without resorting to hand optimizing the code to take advantage cl-ppcre's great speed. By cheating, using the above macro with CMUCL, I was able to achieve comparable results to my original python program which is great I suppose, but it still leaves me wondering why the default READ-LINE implementations are so slow (in particular SBCL as that in my preferred CL). Or, am I not flipping the "turbo" switch? Does that profiling sample (in my first post) look about right for this scenario? Thanks, Pete [1] Macro by Bulent Murtezaoglu http://groups-beta.google.com/group/comp.lang.lisp/msg/f22b458b22598666 [2] CMUCL Results using macro CL-USER> (time (fast-tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) ; Compiling LAMBDA NIL: ; Compiling Top-Level Form: ; Evaluation took: ; 2.63 seconds of real time ; 2.01 seconds of user run time ; 0.49 seconds of system run time ; 1,841,333,669 CPU cycles ; [Run times include 0.47 seconds GC run time] ; 0 page faults and ; 76,958,032 bytes consed. ; T |
From: Christophe R. <cs...@ca...> - 2005-01-26 16:58:40
|
Pete Kazmier <pet...@ka...> writes: > In terms of the little office bet I had going, technically I lost as > there were no open source CL implementations that had decent READ-LINE > performance without resorting to hand optimizing the code to take > advantage cl-ppcre's great speed. Because READ-LINE allocates a new string each time, you are almost doomed to lose against environments which are free to recycle storage (as in e.g. Perl's $_ and probably Python's equivalent.) Comparing apples to apples is recommended -- it's certainly not cheating to read into a buffer and do matches between start and end indices. > By cheating, using the above macro with CMUCL, I was able to achieve > comparable results to my original python program which is great I > suppose, but it still leaves me wondering why the default READ-LINE > implementations are so slow (in particular SBCL as that in my > preferred CL). Or, am I not flipping the "turbo" switch? Does that > profiling sample (in my first post) look about right for this > scenario? SBCL's READ-LINE is so slow because file IO is in flux; Unicode and external-format support are both recently-implemented, and we're still shaking out bugs -- performance problems only figure on the radar in as much as they affect real applications; I'm afraid an office bet isn't enough to interest me for now. In any case, as I say, it's never going to be as fast as something which is allowed to share a buffer between calls. That said, I would expect the fastest READ-LINE available in sbcl to be in an image built without :sb-unicode, running in an environment where nl_langinfo(CODESET) returns "ISO-8859-1" -- both of these factors are important. I don't know what your setup was, but maybe that helps somewhat. Cheers, Christophe |