Re: [pygccxml-development] Another way to speed up the generation process...
From: Matthias B. <ba...@ir...> - 2006-10-24 09:42:37
Hi,

I spent a little more time reducing the execution time of the code generation process, and I'm now below one minute on both Linux and OSX. :)

Linux: 59s (previously 4:40 minutes)
OSX:   44s (previously 7:20 minutes)

Here are the timings of the individual steps on OSX:

Parsing:                 3s
Decoration:             13s
Building code creators: 14s
Writing files:          14s (this is a run where all caches could be fully utilized and where no file had to be written)

What I did were two things: pruning the declaration tree and caching query operations.

I no longer use the pruning function from my earlier mail, because that pruning was only done after the header files had already been parsed and stored in the cache. So while the function could speed up the decoration stage, it could not speed up the parsing stage (which took more than a minute on OSX using the regular cache). Now I prune the declarations at an earlier stage, namely right after the XML file has been created by gccxml and before pygccxml reads it. I wrote a little utility that operates directly on the XML file and outputs a pruned XML file, which I then use as input for Py++ (I think I'll put it into contrib eventually). This should now even be safer than before, because I also keep all dependencies, even when they are not inside any of the allowed headers.

With the pruned tree alone I am already at the above parsing time (when the cache is used, which required a small modification to pygccxml), and decoration was also noticeably faster (~1:40 min) because there were fewer declarations to consider (as already mentioned in my earlier mail).

The next step in speeding things up was to cache the query operations. I did an experimental implementation in pypp_api that works in conjunction with the regular cache. Using this query cache, I get the above results, and the decoration stage is no longer the bottleneck in my case. I'm afraid any further optimizations would require quite a bit of restructuring inside Py++.
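The XML-pruning step can be sketched roughly as follows. This is not the actual utility; it is a minimal illustration, assuming the gccxml output format (elements with `id` attributes, `File` elements naming headers, declarations referencing their header via a `file` attribute, and dependencies expressed through id-valued attributes such as `type`, `returns`, `context`, and `bases`). The real format has more reference attributes and encodes access specifiers inside `bases`, which this sketch simply skips. The sample document at the bottom is synthetic.

```python
# Hypothetical sketch: prune a gccxml XML dump before pygccxml reads it,
# keeping declarations in allowed headers plus all their dependencies.
import xml.etree.ElementTree as ET

# Attributes whose values are id references that create dependencies.
# (Assumption: the real gccxml format has additional reference attributes.)
REF_ATTRS = ("type", "returns", "context", "bases")

def prune_gccxml(xml_text, allowed_headers):
    root = ET.fromstring(xml_text)
    by_id = {el.get("id"): el for el in root if el.get("id")}
    # File elements whose declarations we want to keep.
    allowed_files = {el.get("id") for el in root
                     if el.tag == "File" and el.get("name") in allowed_headers}
    # Seed: every declaration located in an allowed header.
    keep = {i for i, el in by_id.items() if el.get("file") in allowed_files}
    # Transitively pull in referenced declarations, so dependencies survive
    # even when they are not inside any of the allowed headers.
    pending = list(keep)
    while pending:
        el = by_id[pending.pop()]
        for attr in REF_ATTRS:
            for ref in (el.get(attr) or "").split():
                if ref in by_id and ref not in keep:
                    keep.add(ref)
                    pending.append(ref)
    # Drop everything else; File elements stay so file ids remain resolvable.
    for el in list(root):
        if el.tag != "File" and el.get("id") not in keep:
            root.remove(el)
    return root

# Synthetic example: only a.h is allowed, but B is kept as a base of A.
sample = """<GCC_XML>
  <File id="f1" name="a.h"/>
  <File id="f2" name="b.h"/>
  <Class id="c1" name="A" file="f1" bases="c2"/>
  <Class id="c2" name="B" file="f2"/>
  <Class id="c3" name="C" file="f2"/>
</GCC_XML>"""
pruned = prune_gccxml(sample, {"a.h"})
```

In the sample, class A is kept because it lives in an allowed header, B is kept only because A depends on it, and C is dropped.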
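The query-cache idea can be sketched like this. This is not the actual pypp_api implementation, just a stand-alone memoization of the same principle: the decoration stage asks the same query against the same declaration list many times, so the filtered result can be stored under a key describing the query. The class name and key scheme here are illustrative assumptions.

```python
# Hypothetical sketch of caching declaration queries (not the pypp_api code).
class QueryCache:
    """Memoizes query results keyed by a hashable query description."""

    def __init__(self):
        self._results = {}
        self.hits = 0
        self.misses = 0

    def query(self, decls, predicate, key):
        # 'key' must uniquely describe the query, e.g. (scope_id, name, kind);
        # the predicate is only evaluated on a cache miss.
        if key in self._results:
            self.hits += 1
        else:
            self.misses += 1
            self._results[key] = [d for d in decls if predicate(d)]
        return self._results[key]

# Toy usage with plain strings standing in for declarations.
cache = QueryCache()
decls = ["classA", "classB", "funcF"]
classes1 = cache.query(decls, lambda d: d.startswith("class"), ("root", "class"))
classes2 = cache.query(decls, lambda d: d.startswith("class"), ("root", "class"))
```

In a real implementation the cache would of course have to be invalidated whenever declarations are added or modified during decoration.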
Running the profiler on the last two stages (building code creators and writing files) reports the following "hot spots" (the list is sorted by total time):

26608350 function calls (26440374 primitive calls) in 341.650 CPU seconds

   Ordered by: internal time
   List reduced from 856 to 10 due to restriction <10>

        ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       9370513   60.550    0.000   60.550    0.000 :0(isinstance)
       4574860   53.200    0.000   82.480    0.000 decl_wrappers/algorithm.py:87(<lambda>)
   61199/49370   36.710    0.001  194.330    0.004 :0(filter)
        948954   20.230    0.000   35.240    0.000 type_traits.py:42(remove_alias)
        320939    7.480    0.000   23.030    0.000 matchers.py:205(__call__)
        211696    6.120    0.000   12.180    0.000 matchers.py:224(check_name)
 398580/387050    5.300    0.000    8.350    0.000 code_creators/algorithm.py:42(proceed_single)
        369799    5.190    0.000   13.540    0.000 code_creators/algorithm.py:40(make_flatten_generator)
        105595    4.990    0.000    8.060    0.000 class_declaration.py:105(_get_name_impl)
        102200    4.790    0.000   36.450    0.000 container_traits.py:40(get_container_or_none)

- Matthias
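For reference, a report in this shape comes from Python's stdlib profiler and pstats, sorted by internal time and restricted to the top entries. A minimal sketch (the work() function here is just a toy payload, not anything from Py++):

```python
# Minimal sketch: produce a pstats report sorted by internal time,
# restricted to the top 10 entries, like the listing above.
import cProfile
import io
import pstats

def work():
    # Toy payload standing in for the code-creation stages.
    return sum(i * i for i in range(10000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats("time").print_stats(10)   # "time" == internal time (tottime)
report = buf.getvalue()
```

The columns have the same meaning as above: ncalls, tottime (internal time), cumtime (including callees), and the per-call averages.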