From: David J. Neu <davidjneu@gm...>  20071130 23:03:41

Hi Attila, Thanks for all of your work! We're very interested in improving the performance of the program that Gary posted, and have tried some comparisions with and without your changes, and using Python. I used SBCL 1.0.12.7 from the git repo, and the fakedata.tgz file that Gary provided. Here's what I found: 1. Using the code as Gary provided: Evaluation took: 22.975 seconds of real time 20.319805 seconds of user run time 2.187422 seconds of system run time [Run times include 7.712 seconds GC run time.] 0 calls to %EVAL 0 page faults and 4,192,713,792 bytes consed. 2. Applying your sbcl.diff, using your splitsequence, and using your version of parsetext, I got: Evaluation took: 12.328 seconds of real time 11.457298 seconds of user run time 0.800819 seconds of system run time [Run times include 2.122 seconds GC run time.] 0 calls to %EVAL 0 page faults and 911,974,696 bytes consed 3. Python: 4.188586 seconds I'm wondering if you or anyone else had any thoughts on the following questions: a. While this is a /dramatic/ improvement:  It's still much slower than Python, and you mentioned that your results were comparable to Python. Any thoughts why? Did you get the same results on Gary's fakedata.tgz?  The reduction in consing in the example, was dramatic, but didn't compare to your results. Again, any thoughts why? Did you get the same results on Gary's fakedata.tgz? b. If I run (sbext:gc :full t) after each run, I can repeatedly call PARSETEXT with no issues, if I don't make the gc call, I get dropped into the debugger, is this to be expected? c. Juho had mentioned that tuning the garbage collector was discussed on this list, would this address point b., and could someone point me to the appropriate post? I found this post http://sourceforge.net/mailarchive/message.php?msg_id=l37iskysld.fsf%40kyle.netcamp.se but am not sure how to directly apply the discussion. d. You wrote: > >*** and an interesting last piece: using (ppcre:split "~" line > >:sharedp t), adding a (declare (type simplebasestring > >targetstring)) to split and avoiding apply #'subseq, it's hardly > >slower: To clarify, did you put (declare (type simplebasestring targetstring)) in api.lisp? Could you explain how you: "avoiding apply #'subseq"? Again, many thanks for your work on this, and any answers you can provide! Cheers, David On Sun, Nov 25, 2007 at 03:33:17PM 0500, Gary King wrote: > Hi Attila, > > Thanks for all this work. Interesting notes. > > On Nov 22, 2007, at 9:19 AM, Attila Lendvai wrote: > > >hi, > > > >i've optimized it a bit which i think is in par with the python speed > >if not faster now. two things are attached to the mail: one is the > >optimized splitsequence and the other is the diff to sbcl HEAD. this > >latter diff contains the previously posted basechar changes to > >readline and an optimization to stringtrim. (the big change in > >stream.lisp is only the makeresultstring macrolet and the (declare > >(type index len index)) type declaration). > > > >i was working with a smaller version of the file to keep it in the > >disk cache and avoid swapping. it went down from 9.927 to 2.597. > > > >the final form of the defun: > > > >(defun parsetext (&optional (filename "/home/ati/fakedata.txt")) > > (declare (optimize speed (debug 0)) > > (inline splitsequence:splitsequence > > stringtrim)) > > (withopenfile (in filename > > :elementtype 'basechar > > :externalformat :ascii > > :direction :input > > :ifdoesnotexist :error) > > (let ((ht (makehashtable :test 'equal))) > > (loop > > for line oftype simplebasestring = (readline in nil) > > while line do > > (let ((fields (splitsequence:splitsequence #\~ line))) > > (when (= (length fields) 3) > > (let ((id (stringtrim " " (the simplebasestring > >(first fields)))) > > (attribute (stringtrim " " (the > >simplebasestring (second fields)))) > > (value (stringtrim " " (the simplebasestring > >(third fields))))) > > (when (not (gethash id ht)) > > (setf (gethash id ht) (makehashtable :test > >'equal))) > > (let ((fieldsht (gethash id ht))) > > (setf (gethash attribute fieldsht) value)))))) > > ;;(print (hashtablecount ht)) > > (values)))) > > > > > >i've recorded the interesting steps: > > > >CLUSER> (progn > > (sbext:gc :full t) > > (time (parsetext))) > > > >*** unoptimized: > > > >Evaluation took: > > 9.927 seconds of real time > > 9.056566 seconds of user run time > > 0.856054 seconds of system run time > > [Run times include 2.352 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 1,633,159,856 bytes consed. > > > > > > > > > >*** after applying the patch for readline to return basestring's: > > > >difference is below measuring error (but it's important for the > >splitsequence inlining) > > > > > > > > > >*** after adding (declare (optimize speed (debug 0))) > > > >difference is below measuring error > > > > > > > > > >*** after optimizing and inlining splitsequence (it even had apply > >#'position calls!) > > > >Evaluation took: > > 4.916 seconds of real time > > 4.504282 seconds of user run time > > 0.408026 seconds of system run time > > [Run times include 1.044 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 796,540,592 bytes consed. > > > > > > > > > >*** after adding optimization to stringtrim that avoids calling > >subseq if there's nothing to be trimmed (note: this is explicitly > >allowed by the spec): > > > >Evaluation took: > > 3.372 seconds of real time > > 2.940184 seconds of user run time > > 0.428027 seconds of system run time > > [Run times include 1.048 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 728,696,000 bytes consed. > > > > > > > > > >*** after inlining stringtrim: > > > >time difference is below measuring error, but consing went down a > >little. > > > >Evaluation took: > > 3.28 seconds of real time > > 2.864179 seconds of user run time > > 0.396024 seconds of system run time > > [Run times include 1.024 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 718,985,664 bytes consed. > > > > > > > > > >*** annotating (the simplebasestring ...) for the args of the > >stringtrim calls: > > > >Evaluation took: > > 3.107 seconds of real time > > 2.740172 seconds of user run time > > 0.360022 seconds of system run time > > [Run times include 1.048 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 718,925,760 bytes consed. > > > > > > > > > > > >*** at this point the profiling looks like this (sorry, it's > >unreadable without fixed font, but plain text mails are preferred): > > > >Rank Name Self% > >Cumul% Total% > >1 READLINE 39.34 > >53.08 39.34 > > Callers > > PARSETEXT > >53.08 > > Calls > > SBINT:FASTREADCHAR > >REFILL 9.00 > > SBKERNEL:%SHRINK > >VECTOR 3.32 > > SBKERNEL:UB32BASH > >COPY 1.42 > >2 (FLET #:CLEANUPFUN[CALLWITHOUTINTERRUPTS]24) 24.64 > >24.64 63.98 > >3 SBIMPL::FDSTREAMREADNCHARACTERS/ASCII > >7.58 9.00 71.56 > > Callers > > SBINT:FASTREADCHAR > >REFILL 9.00 > > Calls > > SBIMPL::REFILLINPUT > >BUFFER 1.42 > >4 PARSETEXT 6.16 > >93.84 77.73 > > Callers > > NIL > >93.84 > > Calls > > READLINE > >53.08 > > SB > >IMPL::GETHASH3 5.21 > > SBVM::GENERIC > >+ 4.74 > > MAKEHASH > >TABLE 3.32 > > SBKERNEL:UB8BASH > >COPY 2.84 > > SBKERNEL:% > >PUTHASH 0.95 > >5 SBVM::GENERIC+ > >4.74 4.74 82.46 > >6 SBKERNEL:%SHRINKVECTOR > >3.32 3.32 85.78 > >7 SBKERNEL:UB8BASHCOPY > >2.84 2.84 88.63 > >8 SBKERNEL:%SPSTRINGCOMPARE > >2.37 2.37 91.00 > >9 MAKEHASHTABLE > >2.37 3.32 93.36 > >10 SBKERNEL:STRING=* > >1.42 3.79 94.79 > >11 SBKERNEL:UB32BASHCOPY > >1.42 1.42 96.21 > >12 (FLET #:BODYFUN[GETHASH3]1076) > >0.95 5.21 97.16 > >13 (FLET SBIMPL::TRICK) > >0.47 0.47 97.63 > >14 SBIMPL::REFILLINPUTBUFFER > >0.47 1.42 98.10 > >15 SBIMPL::CEILPOWEROFTWO > >0.47 0.47 98.58 > >16 (FLET #:CLEANUPFUN[%PUTHASH]1409) > >0.47 0.47 99.05 > >17 SBKERNEL:%PUTHASH > >0.47 0.95 99.53 > >18 SBIMPL::%MAKEHASHTABLE > >0.47 0.47 100.00 > > > > > > > > > > > >*** after rising +ansistreaminbufferlength+ to 4096 and > >recompiling sbcl > > > >note that consing fall down drastically but due to the problem with > >REPLACE that Christoph mentioned it got slower. but hopefully Juho has > >something to say here. > > > >Evaluation took: > > 3.59 seconds of real time > > 3.272205 seconds of user run time > > 0.316019 seconds of system run time > > [Run times include 0.528 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 260,260,240 bytes consed. > > > >Rank Name Self% > >Cumul% Total% > >1 REPLACE 16.42 > >49.25 16.42 > > Callers > > READLINE > >49.25 > > > >REPLACE 0.37 > > Calls > > SBIMPL::OPTIMIZEDDATAVECTORSET > >10.07 > > SBKERNEL:HAIRYDATAVECTOR > >SET 9.70 > > SBKERNEL:HAIRYDATAVECTOR > >REF 7.84 > > SBIMPL::OPTIMIZEDDATAVECTOR > >REF 2.61 > > SBIMPL::OPTIMIZEDDATAVECTOR > >REF 2.24 > > > >LENGTH 0.37 > > > >REPLACE 0.37 > >2 SBIMPL::FDSTREAMREADNCHARACTERS/ASCII 11.57 > >16.42 27.99 > > Callers > > SBINT:FASTREADCHARREFILL > >16.42 > > Calls > > SBIMPL::REFILLINPUT > >BUFFER 4.85 > >3 (FLET #:CLEANUPFUN[CALLWITHOUTINTERRUPTS]24) 10.82 > >10.82 38.81 > >4 READLINE 10.07 > >76.12 48.88 > > Callers > > PARSETEXT > >76.12 > > Calls > > REPLACE > >49.25 > > SBINT:FASTREADCHARREFILL > >16.42 > > SBKERNEL:%SHRINK > >VECTOR 0.37 > >5 SBIMPL::OPTIMIZEDDATAVECTORSET 10.07 > >10.07 58.96 > > Callers > > REPLACE > >10.07 > >6 SBKERNEL:HAIRYDATAVECTORSET > >9.70 9.70 68.66 > >7 PARSETEXT 7.84 > >95.90 76.49 > >8 SBKERNEL:HAIRYDATAVECTORREF > >7.84 7.84 84.33 > >9 SBIMPL::OPTIMIZEDDATAVECTORREF > >2.61 2.61 86.94 > >10 SBIMPL::OPTIMIZEDDATAVECTORREF > >2.24 2.24 89.18 > >11 SBKERNEL:UB8BASHCOPY > >1.87 1.87 91.04 > >12 SBKERNEL:STRING=* > >1.49 2.24 92.54 > >13 SBKERNEL:%PUTHASH > >0.75 1.12 93.28 > >14 SBKERNEL:%SPSTRINGCOMPARE > >0.75 0.75 94.03 > > > > > > > > > >*** after getting rid of the REPLACE problem with an ugly hack in > >ansistreamreadline that helps propagating the type to REPLACE: > > > >Evaluation took: > > 2.597 seconds of real time > > 2.356147 seconds of user run time > > 0.240015 seconds of system run time > > [Run times include 0.532 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 260,262,768 bytes consed. > > > >Rank Name Self% > >Cumul% Total% > >1 PARSETEXT 16.40 > >96.83 16.40 > >2 SBIMPL::FDSTREAMREADNCHARACTERS/ASCII 14.81 > >18.52 31.22 > > Callers > > SBINT:FASTREADCHARREFILL > >18.52 > > Calls > > SBIMPL::REFILLINPUT > >BUFFER 3.70 > >3 READLINE 13.76 > >61.90 44.97 > > Callers > > PARSETEXT > >61.90 > > Calls > > REPLACE > >29.63 > > SBINT:FASTREADCHARREFILL > >18.52 > >4 (FLET #:CLEANUPFUN[CALLWITHOUTINTERRUPTS]24) 13.76 > >13.76 58.73 > > Callers > > SBUNIX::CALLWITHOUTINTERRUPTS > >13.76 > >5 REPLACE 8.47 > >29.63 67.20 > > Callers > > READLINE > >29.63 > > > >REPLACE 0.53 > > Calls > > SBKERNEL:HAIRYDATAVECTOR > >REF 7.41 > > SBKERNEL:HAIRYDATAVECTOR > >SET 5.82 > > SBIMPL::OPTIMIZEDDATAVECTOR > >SET 3.70 > > SBIMPL::OPTIMIZEDDATAVECTOR > >REF 3.70 > > > >REPLACE 0.53 > > > >LENGTH 0.53 > >6 SBKERNEL:HAIRYDATAVECTORREF > >7.41 7.41 74.60 > >7 SBKERNEL:HAIRYDATAVECTORSET > >5.82 5.82 80.42 > >8 SBKERNEL:UB8BASHCOPY > >4.23 4.23 84.66 > >9 SBIMPL::OPTIMIZEDDATAVECTORREF > >3.70 3.70 88.36 > >10 SBIMPL::OPTIMIZEDDATAVECTORSET > >3.70 3.70 92.06 > >11 (FLET #:BODYFUN[%PUTHASH]1355) > >1.59 2.12 93.65 > >12 SBKERNEL:%SPSTRINGCOMPARE > >1.06 1.06 94.71 > >13 (FLET #:BODYFUN[GETHASH3]1076) > >1.06 2.65 95.77 > >14 LENGTH > >0.53 0.53 96.30 > >15 SBKERNEL::%MAKEINSTANCEWITHLAYOUT > >0.53 0.53 96.83 > >16 SBIMPL::REFILLINPUTBUFFER > >0.53 3.70 97.35 > >17 SBKERNEL:%UNARYTRUNCATE > >0.53 0.53 97.88 > >18 SBTHREAD::MAKESPINLOCK > >0.53 0.53 98.41 > >19 (FLET SBIMPL::TRICK) > >0.53 0.53 98.94 > >20 SBKERNEL:%SXHASHSIMPLESTRING > >0.53 1.06 99.47 > > > > > > > > > >*** and an interesting last piece: using (ppcre:split "~" line > >:sharedp t), adding a (declare (type simplebasestring > >targetstring)) to split and avoiding apply #'subseq, it's hardly > >slower: > > > >Evaluation took: > > 3.028 seconds of real time > > 2.772173 seconds of user run time > > 0.256016 seconds of system run time > > [Run times include 0.556 seconds GC run time.] > > 0 calls to %EVAL > > 0 page faults and > > 269,862,768 bytes consed. > > > > > > attila<split > >sequence.lisp><sbcl.diff> > > > >This SF.net email is sponsored by: Microsoft > >Defy all challenges. Microsoft(R) Visual Studio 2005. > >http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > >_______________________________________________ > >Sbcldevel mailing list > >Sbcldevel@... > >https://lists.sourceforge.net/lists/listinfo/sbcldevel > >  > Gary Warren King, metabang.com > Cell: (413) 559 8738 > Fax: (206) 3384052 > gwkkwg on Skype * garethsan on AIM > > > > >  > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2005. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Sbcldevel mailing list > Sbcldevel@... > https://lists.sourceforge.net/lists/listinfo/sbcldevel 