From: Pete K. <pet...@ka...> - 2005-01-22 02:53:27
|
[disclaimer: I'm still a lisp newbie, please be gentle.] I am interested in continuing my voyage into CL and as part of the process I am in a position to slowly introduce it at my workplace. The first project that I decided to undertake was replacing some log file processing utilities that had been written in Python. I was especially excited to discover that SBCL and cl-ppcre were able to process the logs faster than my python program without actually trying to optimize either programs and while keeping them very simple and easy to understand. The only problem is that READ-LINE in dreadfully slow. It takes about 16s to simply read my log file in SBCL, whereas a simple python program could do the same in about 1s. So although Lisp can process the data faster, getting it the data is another story altogether. After chatting with some folks on IRC, they suggested various ways of improving the speed ranging from doing one large read, switching CL implementations, and such. I did test the speed of the other implementations [1] and although they were faster, compared to my original python program, they are still very slow. I'd really prefer to use READ-LINE versus trying to hand optimize this function as I believe the underlying implementation should be doing the optimization in this case, and I won't have much luck convincing others at work to use lisp if I have to go through hoops for this one utility. That leads me to my question, are there some settings that I am not aware of to tune SBCL and the IO subsystem? Here is my fn: (defun tester (file) (declare (optimize (speed 3) (safety 0) (debug 0))) (with-open-file (in file) (do ((line (read-line in nil :eof) (read-line in nil :eof))) ((eql line :eof) t) ;; do something here ))) I ran strace on both the python program and SBCL to see how many read(2)s were issued and how much each read was gathering (8k chunks for python and 4k chunks for SBCL). At that time I really had no idea where the bottleneck was so I wanted to make sure SBCL was at least using buffered streams and not calling read(2) for every character. This path of reasoning did not yield any insight into the large discrepancy in speeds, so I googled for SBCL profiling and discovered sb-sprof. I'm not quite sure how to interpret the full results [2], but here is the top list of functions consuming most of the time. Does this look normal? Could anyone help interpret the results? Is there some magic switch I can flip to improve my performance? I am using SBCL from CVS as of a day or so ago without unicode support running on a debian 2.4 box. Self Cumul Total Nr Count % Count % Count % Function ------------------------------------------------------------------------ 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET 2 289 20.1 1084 75.2 689 47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 3 241 16.7 324 22.5 930 64.5 ARRAY-DIMENSION 4 220 15.3 1362 94.5 1150 79.8 "hairy arg processor for top level local call READ-LINE" 5 147 10.2 147 10.2 1297 90.0 ARRAY-RANK 6 60 4.2 60 4.2 1357 94.2 "no debug information for frame" 7 37 2.6 37 2.6 1394 96.7 SB-KERNEL:BIT-BASH-COPY In summary, I'm loving my voyage into the Lisp world, but perhaps I'm trying to use it for the wrong application in this case (large amounts of log file processing). Am I trying to make a square peg fit into a circle? Or is there something that I'm missing (very likely)? Thanks for reading this far! Pete [1] Results of my crude test running on various CL implementations. SBCL: CL-USER> (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Evaluation took: 15.455 seconds of real time 13.17 seconds of user run time 0.4 seconds of system run time 0 page faults and 87,576,968 bytes consed. CMUCL: * (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Evaluation took: 9.96 seconds of real time 6.98 seconds of user run time 0.38 seconds of system run time 6,955,855,634 CPU cycles [Run times include 0.53 seconds GC run time] 0 page faults and 92,049,536 bytes consed. Clisp: [5]> (time (tester "/tmp/pgw-logs/trapd.log.ovnyc00p")) Real time: 7.922794 sec. Run time: 7.51 sec. Space: 134691576 Bytes GC: 257, GC time: 0.52 sec. [2] Full output of sb-sprof:report. Number of samples: 1441 Sample interval: 0.01 seconds Total sampling time: 14.41 seconds Number of cycles: 3 Cycle Count % Parts ------------------------------------------------------------------------ 0 0.0 <Cycle 1>... 0 0.0 "#'(LAMBDA NIL (LET # #))" [35] 0 0.0 SWANK::HANDLE-REQUEST [37] 0 0.0 SWANK::CALL-WITH-REDIRECTED-IO [36] ------------------------------------------------------------------------ 7 0.5 <Cycle 2>... 7 0.5 SB-KERNEL:SUB-GC [11] ------------------------------------------------------------------------ 289 20.1 <Cycle 3>... 289 20.1 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] ------------------------------------------------------------------------ Callers Cumul. Function Count % Count % Callees ------------------------------------------------------------------------ 3 0.2 SB-THREAD:CONDITION-WAIT [32] 3 0.2 3 0.2 "FLET #:CLEANUP-FUN-397" [12] ------------------------------------------------------------------------ 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) [33] 0 0.0 3 0.2 SB-THREAD:CONDITION-WAIT [32] 3 0.2 "FLET #:CLEANUP-FUN-397" [12] ------------------------------------------------------------------------ 3 0.2 SWANK::READ-FROM-EMACS [34] 0 0.0 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) [33] 3 0.2 SB-THREAD:CONDITION-WAIT [32] ------------------------------------------------------------------------ 2 0.1 SWANK::HANDLE-REQUEST [37] 1 0.1 "#'(LAMBDA NIL (LET # #))" [35] 0 0.0 3 0.2 SWANK::READ-FROM-EMACS [34] 3 0.2 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) [33] ------------------------------------------------------------------------ 0 0.0 0 0.0 "<Cycle 1>" [1] 1 0.1 SWANK::READ-FROM-EMACS [34] 2 0.1 SWANK::READ-FROM-EMACS [34] ------------------------------------------------------------------------ 1 0.1 SB-KERNEL:SUB-GC [11] 1 0.1 1 0.1 "FLET #:CLEANUP-FUN-219" [14] ------------------------------------------------------------------------ 7 0.5 0 0.0 "<Cycle 2>" [2] 1 0.1 "FLET #:CLEANUP-FUN-219" [14] ------------------------------------------------------------------------ 12 0.8 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] 12 0.8 12 0.8 "foreign function __read" [9] ------------------------------------------------------------------------ 14 1.0 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 2 0.1 14 1.0 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] 12 0.8 "foreign function __read" [9] ------------------------------------------------------------------------ 60 4.2 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] 75 5.2 ARRAY-DIMENSION [3] 12 0.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 147 10.2 147 10.2 ARRAY-RANK [5] ------------------------------------------------------------------------ 8 0.6 ARRAY-DIMENSION [3] 445 30.9 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 7 0.5 SB-INT:FAST-READ-CHAR-REFILL [15] 400 27.8 460 31.9 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] 60 4.2 ARRAY-RANK [5] ------------------------------------------------------------------------ 324 22.5 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII [2] 241 16.7 324 22.5 ARRAY-DIMENSION [3] 75 5.2 ARRAY-RANK [5] 8 0.6 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] ------------------------------------------------------------------------ 289 20.1 0 0.0 "<Cycle 3>" [3] 14 1.0 SB-IMPL::REFILL-FD-STREAM-BUFFER [13] 12 0.8 ARRAY-RANK [5] 324 22.5 ARRAY-DIMENSION [3] 445 30.9 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] ------------------------------------------------------------------------ 60 4.2 TESTER [21] 60 4.2 60 4.2 "no debug information for frame" [6] ------------------------------------------------------------------------ 37 2.6 "hairy arg processor for top level local call READ-LINE" [4] 37 2.6 37 2.6 SB-KERNEL:BIT-BASH-COPY [7] ------------------------------------------------------------------------ 13 0.9 "hairy arg processor for top level local call READ-LINE" [4] 13 0.9 13 0.9 SB-KERNEL:SHRINK-VECTOR [8] ------------------------------------------------------------------------ 1092 75.8 "hairy arg processor for top level local call READ-LINE" [4] 1 0.1 1092 75.8 SB-INT:FAST-READ-CHAR-REFILL [15] 7 0.5 SB-KERNEL:HAIRY-DATA-VECTOR-SET [1] 1084 75.2 "<Cycle 3>" [3] ------------------------------------------------------------------------ 1362 94.5 TESTER [21] 220 15.3 1362 94.5 "hairy arg processor for top level local call READ-LINE" [4] 37 2.6 SB-KERNEL:BIT-BASH-COPY [7] 13 0.9 SB-KERNEL:SHRINK-VECTOR [8] 1092 75.8 SB-INT:FAST-READ-CHAR-REFILL [15] ------------------------------------------------------------------------ 1422 98.7 SB-IMPL::%TIME [22] 0 0.0 1422 98.7 TESTER [21] 60 4.2 "no debug information for frame" [6] 1362 94.5 "hairy arg processor for top level local call READ-LINE" [4] ------------------------------------------------------------------------ 1422 98.7 SB-INT:EVAL-IN-LEXENV [23] 0 0.0 1422 98.7 SB-IMPL::%TIME [22] 1422 98.7 TESTER [21] ------------------------------------------------------------------------ 1267 87.9 "hairy arg processor for SWANK::EVAL-REGION" [24] 0 0.0 1422 98.7 SB-INT:EVAL-IN-LEXENV [23] 1422 98.7 SB-IMPL::%TIME [22] ------------------------------------------------------------------------ 622 43.2 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] 0 0.0 1267 87.9 "hairy arg processor for SWANK::EVAL-REGION" [24] 1267 87.9 SB-INT:EVAL-IN-LEXENV [23] ------------------------------------------------------------------------ 331 23.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] 0 0.0 622 43.2 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] 622 43.2 "hairy arg processor for SWANK::EVAL-REGION" [24] ------------------------------------------------------------------------ 280 19.4 SWANK:LISTENER-EVAL [27] 0 0.0 331 23.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] 331 23.0 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" [25] ------------------------------------------------------------------------ 0 0.0 280 19.4 SWANK:LISTENER-EVAL [27] 280 19.4 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) [26] ------------------------------------------------------------------------ 2 0.1 SB-SPROF::SIGPROF-HANDLER [30] 0 0.0 2 0.1 "FLET #:CLEANUP-FUN-2014" [29] 2 0.1 "<Cycle 2>" [2] ------------------------------------------------------------------------ 1 0.1 "foreign function call_into_lisp" [16] 0 0.0 2 0.1 SB-SPROF::SIGPROF-HANDLER [30] 2 0.1 "FLET #:CLEANUP-FUN-2014" [29] ------------------------------------------------------------------------ 1 0.1 "foreign function funcall3" [31] 14 1.0 "foreign function funcall0" [17] 0 0.0 15 1.0 "foreign function call_into_lisp" [16] 1 0.1 SB-SPROF::SIGPROF-HANDLER [30] 14 1.0 "<Cycle 2>" [2] ------------------------------------------------------------------------ 1 0.1 "foreign function interrupt_handle_now" [28] 0 0.0 1 0.1 "foreign function funcall3" [31] 1 0.1 "foreign function call_into_lisp" [16] ------------------------------------------------------------------------ 8 0.6 "foreign function interrupt_handle_now" [28] 8 0.6 8 0.6 "foreign function sigprocmask" [10] ------------------------------------------------------------------------ 8 0.6 "foreign function interrupt_handle_pending" [19] 0 0.0 9 0.6 "foreign function interrupt_handle_now" [28] 1 0.1 "foreign function funcall3" [31] 8 0.6 "foreign function sigprocmask" [10] ------------------------------------------------------------------------ 7 0.5 "foreign function interrupt_maybe_gc_int" [18] 0 0.0 14 1.0 "foreign function funcall0" [17] 14 1.0 "foreign function call_into_lisp" [16] ------------------------------------------------------------------------ 7 0.5 "foreign function interrupt_handle_pending" [19] 0 0.0 7 0.5 "foreign function interrupt_maybe_gc_int" [18] 7 0.5 "foreign function funcall0" [17] ------------------------------------------------------------------------ 15 1.0 "foreign function sigtrap_handler" [20] 0 0.0 15 1.0 "foreign function interrupt_handle_pending" [19] 8 0.6 "foreign function interrupt_handle_now" [28] 7 0.5 "foreign function interrupt_maybe_gc_int" [18] ------------------------------------------------------------------------ 0 0.0 15 1.0 "foreign function sigtrap_handler" [20] 15 1.0 "foreign function interrupt_handle_pending" [19] ------------------------------------------------------------------------ Self Cumul Total Nr Count % Count % Count % Function ------------------------------------------------------------------------ 1 400 27.8 460 31.9 400 27.8 SB-KERNEL:HAIRY-DATA-VECTOR-SET 2 289 20.1 1084 75.2 689 47.8 SB-IMPL::FD-STREAM-READ-N-CHARACTERS/ASCII 3 241 16.7 324 22.5 930 64.5 ARRAY-DIMENSION 4 220 15.3 1362 94.5 1150 79.8 "hairy arg processor for top level local call READ-LINE" 5 147 10.2 147 10.2 1297 90.0 ARRAY-RANK 6 60 4.2 60 4.2 1357 94.2 "no debug information for frame" 7 37 2.6 37 2.6 1394 96.7 SB-KERNEL:BIT-BASH-COPY 8 13 0.9 13 0.9 1407 97.6 SB-KERNEL:SHRINK-VECTOR 9 12 0.8 12 0.8 1419 98.5 "foreign function __read" 10 8 0.6 8 0.6 1427 99.0 "foreign function sigprocmask" 11 7 0.5 8 0.6 1434 99.5 SB-KERNEL:SUB-GC 12 3 0.2 3 0.2 1437 99.7 "FLET #:CLEANUP-FUN-397" 13 2 0.1 14 1.0 1439 99.9 SB-IMPL::REFILL-FD-STREAM-BUFFER 14 1 0.1 1 0.1 1440 99.9 "FLET #:CLEANUP-FUN-219" 15 1 0.1 1092 75.8 1441 100.0 SB-INT:FAST-READ-CHAR-REFILL 16 0 0.0 15 1.0 1441 100.0 "foreign function call_into_lisp" 17 0 0.0 14 1.0 1441 100.0 "foreign function funcall0" 18 0 0.0 7 0.5 1441 100.0 "foreign function interrupt_maybe_gc_int" 19 0 0.0 15 1.0 1441 100.0 "foreign function interrupt_handle_pending" 20 0 0.0 15 1.0 1441 100.0 "foreign function sigtrap_handler" 21 0 0.0 1422 98.7 1441 100.0 TESTER 22 0 0.0 1422 98.7 1441 100.0 SB-IMPL::%TIME 23 0 0.0 1422 98.7 1441 100.0 SB-INT:EVAL-IN-LEXENV 24 0 0.0 1267 87.9 1441 100.0 "hairy arg processor for SWANK::EVAL-REGION" 25 0 0.0 622 43.2 1441 100.0 "#'(LAMBDA NIL (MULTIPLE-VALUE-BIND # # ...))" 26 0 0.0 331 23.0 1441 100.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:CALL-WITH-SYNTAX-HOOKS (T)) 27 0 0.0 280 19.4 1441 100.0 SWANK:LISTENER-EVAL 28 0 0.0 9 0.6 1441 100.0 "foreign function interrupt_handle_now" 29 0 0.0 2 0.1 1441 100.0 "FLET #:CLEANUP-FUN-2014" 30 0 0.0 2 0.1 1441 100.0 SB-SPROF::SIGPROF-HANDLER 31 0 0.0 1 0.1 1441 100.0 "foreign function funcall3" 32 0 0.0 3 0.2 1441 100.0 SB-THREAD:CONDITION-WAIT 33 0 0.0 3 0.2 1441 100.0 (SB-PCL::FAST-METHOD SWANK-BACKEND:RECEIVE NIL) 34 0 0.0 3 0.2 1441 100.0 SWANK::READ-FROM-EMACS 35 0 0.0 3 0.2 1441 100.0 "#'(LAMBDA NIL (LET # #))" 36 0 0.0 3 0.2 1441 100.0 SWANK::CALL-WITH-REDIRECTED-IO 37 0 0.0 3 0.2 1441 100.0 SWANK::HANDLE-REQUEST ------------------------------------------------------------------------ 0 0.0 elsewhere |