From: Karol L. <kar...@kn...> - 2007-05-07 08:28:09
On Thursday 03 May 2007 13:21, Noel O'Boyle wrote:
> > > How we test each line has a large effect on efficiency. I point out
> > > again that using line[x:y]=="jklj" is much faster than using "word in
> > > line", or line.find(), and so these should be some of the first
> > > targets for improving efficiency.
> >
> > langner@slim:~/tmp/python/cclib/trunk/data/GAMESS/basicGAMESS-US$ python
> > Python 2.5 (r25:51908, Apr 30 2007, 15:03:13)
> > [GCC 3.4.6 (Debian 3.4.6-5)] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> >
> > >>> import cclib
> > >>> a = cclib.parser.ccopen("C_bigbasis.out")
> > >>> import profile
> > >>> profile.run("a.parse()", "parse.prof")
> > >>> import pstats
> > >>> s = pstats.Stats("parse.prof")
> > >>> s.sort_stats("time")
> > >>> s.print_stats(.12)
> >
> > Thu May  3 14:43:04 2007    parse.prof
> >
> >          199815 function calls in 9.069 CPU seconds
> >
> >    Ordered by: internal time
> >    List reduced from 96 to 12 due to restriction <0.12>
> >
> >    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >      8581    4.548    0.001    8.625    0.001 /home/langner/apps/python/lib/python2.5/site-packages/cclib/parser/gamessparser.py:90(extract)
> >    137355    3.080    0.000    3.080    0.000 :0(find)
> >     20310    0.480    0.000    0.480    0.000 :0(len)
> >         1    0.316    0.316    9.069    9.069 /home/langner/apps/python/lib/python2.5/site-packages/cclib/parser/logfileparser.py:165(parse)
> >      8600    0.184    0.000    0.184    0.000 :0(rstrip)
> >      2143    0.140    0.000    0.140    0.000 :0(split)
> >      2055    0.124    0.000    0.124    0.000 :0(range)
> >      9145    0.076    0.000    0.076    0.000 :0(strip)
> >      8868    0.060    0.000    0.060    0.000 /home/langner/apps/python/lib/python2.5/site-packages/cclib/parser/logfileparser.py:375(updateprogress)
> >       370    0.016    0.000    0.016    0.000 :0(append)
> >       218    0.004    0.000    0.004    0.000 :0(replace)
> >        31    0.004    0.000    0.032    0.001 /home/langner/apps/python/lib/python2.5/site-packages/cclib/parser/logfileparser.py:153(__setattr__)
>
> I've never used the profiler. Can you interpret this for me in simple
> language?

The profiler measures the time used for function calls when executing a
command. In the columns you have:

ncalls - number of times a function was called
tottime - time spent in the given function (excluding time in sub-functions)
percall - tottime/ncalls
cumtime - time in the function including sub-functions (from invocation to exit)
percall (2nd) - cumtime/ncalls

Now that I think about all this, though, statements such as "word in line"
and "line[i:j] == word" are not measured separately here, since they are not
function calls (their time is accumulated into the tottime of extract). A
simple little test shows that find() is in fact the worst, but "word in line"
is at least comparable to "line[i:j] == word":

>>> import timeit
>>> t1 = timeit.Timer("'a' in 'abcdefg'")
>>> t2 = timeit.Timer("'abcdefg'[:1] == 'a'")
>>> t3 = timeit.Timer("'abcdefg'.find('a')")
>>> min(t1.repeat(repeat=100, number=1000000))
0.18727612495422363
>>> min(t2.repeat(repeat=100, number=1000000))
0.3044281005859375
>>> min(t3.repeat(repeat=100, number=1000000))
0.7338860034942627

- Karol

-- 
written by Karol Langner
Mon May  7 11:47:55 CEST 2007
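
P.S. The little timeit test uses a seven-character string, so it may not say
much about real log lines. Here is a rough sketch that times the same three
tests, plus str.startswith(), on a longer line. The sample line, the keyword
and the helper name time_line_tests are all made up for illustration and are
not taken from cclib, and the globals= argument to timeit.Timer assumes
Python 3.5 or later. Absolute numbers will differ between machines and Python
versions, so only the relative ordering is of any interest.

```python
import timeit

# Hypothetical sample line and keyword (not taken from cclib or a real log file).
LINE = " FINAL ENERGY IS -37.2377 AFTER 10 ITERATIONS"
WORD = "FINAL"
START = 1
END = START + len(WORD)

# Four ways of asking "is WORD in LINE?".
# Note: the slice and startswith tests check one fixed position only,
# whereas "in" and find() search the whole line.
STATEMENTS = {
    "slice": "LINE[START:END] == WORD",
    "in": "WORD in LINE",
    "find": "LINE.find(WORD) >= 0",
    "startswith": "LINE.startswith(WORD, START)",
}


def time_line_tests(number=100000, repeat=5):
    """Return the best (minimum) time in seconds for each statement."""
    namespace = {"LINE": LINE, "WORD": WORD, "START": START, "END": END}
    results = {}
    for name, stmt in STATEMENTS.items():
        timer = timeit.Timer(stmt, globals=namespace)
        # Take the minimum over the repeats, as in the test above.
        results[name] = min(timer.repeat(repeat=repeat, number=number))
    return results


if __name__ == "__main__":
    # Print the statements from fastest to slowest.
    for name, seconds in sorted(time_line_tests().items(), key=lambda kv: kv[1]):
        print("%-10s %.4f s" % (name, seconds))
```

Keep in mind that the slice test only works when the column of the keyword
is known in advance, so it is not a drop-in replacement for "word in line".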