Re: [pygccxml-development] Another performance tweak
From: Allen B. <al...@vr...> - 2006-08-27 19:11:12
Roman Yakovenko wrote:
> On 8/27/06, Allen Bierbaum <al...@vr...> wrote:
>> I have included comments below. Before that though...
>>
>> I have found a few more optimizations:
>>
>> - Making create_identifier return its argument unchanged (an identity
>>   function) adds about 15% to performance.
>> - Adding a cache to module_builder that caches the global namespace tree
>>   after parsing, joining, and running the query optimizer. This boosts
>>   performance by another 40% (both by saving more startup time and by
>>   making the query optimization initialization less costly, so it can
>>   actually be used).
>
> Can you explain the relationship between it and the declarations cache
> in pygccxml?

The module builder cache supersedes it. The module builder cache stores the
decl tree after all loading and parsing in pygccxml plus the additional
processing in pyplusplus (optimization initialization). The decl cache can
still be used if you don't use the module builder cache, but when the module
builder cache is in use the decl cache is not really needed.

>> So now I have my run time down to around 59 seconds. Although I would
>> really like to make it go even faster, I think taking the running time
>> down from the original 12 minutes to 1 minute is a good enough
>> improvement to let me use py++ much more easily in normal development.
>
> :-)
>
>> Roman: Do you want a patch for module_builder caching, or do you just
>> want me to commit it? (All the changes are in the __init__ method and
>> are very clear, so you can see what is in there and change it if you
>> prefer.)
>
> Please commit them, but still explain the relationship :-)

Okay. I should have time to commit it later today.

>> Now on to the comments....
>>
>> Roman Yakovenko wrote:
>> > On 8/26/06, Allen Bierbaum <al...@vr...> wrote:
>> >> - Caching the results of type_traits.remove_alias improved
>> >>   performance by 20%.
>> >
>> > Yes, this algorithm is used everywhere. Please don't commit this
>> > change; I'd like to think about it. I see the following problem here:
>> > there are a lot of types, so the cache memory size could go out of
>> > control. Can you find out how much memory Py++ takes with/without
>> > the cache?
>>
>> I can try to take a look at this. I don't have an easy way to do it
>> right now, but I will try later.
>>
>> > I think I don't want to introduce a cache for the type_traits
>> > results. I don't feel comfortable with it. One of the reasons is
>> > that it is not that easy to disable.
>>
>> Maybe I am missing something, but why can't we control it by
>> introducing a flag in type_traits like "caching_enabled" and then just
>> testing that whenever the method is called?
>>
>> It would be a shame not to optimize this method when it gives a 20%
>> performance boost.
>
> You are right, I don't have a good reason. I will commit the patch. By
> the way, it was a tricky one. One of my tests failed, and it was very
> difficult (a few hours with the debugger) to find the problem. That is
> exactly the reason why I hate caches, but what can I do if you like
> them :-) ?
>
>> Are you suggesting I add this as a new method and change the code that
>> calls it?
>
> No. I just committed the results.
>
>> In my own local copy I have replaced "create_identifier" with an
>> identity function and am getting very good results.
>
> Why not put it in goodies? For the time being, please put it in goodies.
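For illustration, a goodies-style override of create_identifier might look
roughly like the sketch below. The module path (pyplusplus.code_creators.algorithm)
and the (creator, full_name) signature are assumptions made for this example,
not confirmed details of the Py++ API.

```python
# Hypothetical sketch: rebind create_identifier to an identity function.
# The module path and signature below are assumptions for illustration only.
from pyplusplus.code_creators import algorithm

def _identity_create_identifier(creator, full_name):
    # Skip the namespace-shortening work and return the name as-is.
    return full_name

# Rebind the module-level function so later callers pick up the fast path.
algorithm.create_identifier = _identity_create_identifier
```

As the next paragraph explains, a rebinding like this only works if every
caller resolves the function through the same module object.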
I had trouble putting it into goodies as an override because the method comes
in through multiple module aliases. In other words, I haven't found a way to
override all of the uses of the method. I have run into a problem like this on
another project and it is really tough to track down. It probably means there
are multiple places where the module containing create_identifier is being
imported, and the imports use slightly different names for the module. So
sys.modules ends up with multiple entries pointing to different imports of the
same code. I think future versions of Python are going to fix this problem by
requiring the full package.module.name import no matter what file you are in,
but for now this is a hard issue to track down.

>> Right now I am not quite ready to update to your changes. I have some
>> concerns about the complexity and performance of the way you implemented
>> this.
>
> Do you have the numbers? I did not read the benchmark results. I will
> do this later.

I don't have numbers. I didn't want to update and deal with the conflicts only
to find I didn't want the update. With the changes you have made now, I think
it will probably be safe for me to update. I will let you know if something
goes bad.

> You are right, I introduced one more "get attribute". I have one good
> reason for this: maintainability. I'd like to keep all "cached" values in
> a single place. Thus, I don't have to scan the sources for "what am I
> caching", and moreover it is very easy to document. I am sure you will
> not see this "get attribute" in the benchmark.
>
>> - Why is there an "algorithms_cache_t" object as the base class of
>>   "declaration_algs_cache_t"? The split with a base class doesn't seem
>>   to serve much of a purpose.
>
> Enable/disable functionality. But I think it is useless now. The better
> idea is to introduce a type_algs_cache_t class with a class variable that
> controls all instances of type_algs_cache_t.
>
>> - What is the performance implication of using "properties"? I know you
>>   like to use them in your code, but how do they affect performance?
>
> I tried with and without them and I did not see a difference. Maybe you
> will. If you do see a difference, we can keep the set_* methods and
> remove the properties.
>
>> - Handling of the enabled flag. I think the handling of the enabled flag
>>   should be done in the "set" method instead of the "get" method. As it
>>   stands with your change, our optimized path requires two if tests (one
>>   in the local method to test for None, and one in the get method to test
>>   enabled). If we moved the enabled test to the set method, we would only
>>   pay the cost of that check when optimizations are disabled.
>
> You are right. Fixed.
>
>> > I left an interesting problem for you: it is possible to optimize the
>> > declaration_path algorithm even more. In most cases its complexity
>> > could be O(1). This could be done by saving intermediate results.
>> > Another way to say it: re-use the parent declaration path cache.
>
> What about this?

I haven't looked at this at all. I am not sure I understand what is going on
here well enough to do it right, and it isn't showing up high in any of my
traces anymore, so it really isn't a high priority for me.

-Allen
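To make the declaration_path idea above concrete, a rough sketch of the O(1)
approach follows: build a declaration's path by extending the parent's
already-cached path instead of walking to the root every time. The attribute
names used here (parent, name, cache.declaration_path) are illustrative
assumptions, not the actual pygccxml implementation.

```python
# Sketch: compute a declaration's qualified path by reusing the parent's
# cached path. Assumes each declaration carries a cache object whose
# declaration_path attribute starts out as None.
def declaration_path(decl):
    if decl is None:
        return []
    cached = decl.cache.declaration_path
    if cached is not None:
        return cached
    # Only uncached ancestors are visited, so once the tree is warm most
    # calls touch a single node and return in constant time.
    path = declaration_path(decl.parent) + [decl.name]
    decl.cache.declaration_path = path
    return path
```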