Re: [pygccxml-development] Another performance tweak
From: Allen B. <al...@vr...> - 2006-08-27 19:11:12
Roman Yakovenko wrote:
> On 8/27/06, Allen Bierbaum <al...@vr...> wrote:
>> I have included comments below. Before that though...
>>
>> I have found a few more optimizations:
>>
>> - Making create_identifier return its argument unchanged (an identity
>>   function) adds about 15% to performance.
>> - Adding a cache to module_builder that caches the global namespace tree
>>   after parsing, joining, and running the query optimizer. This boosts
>>   performance by another 40% (both by saving more startup time and by
>>   making the query optimization initialization less costly, so it can
>>   actually be used).
>
> Can you explain the relationship between it and the declarations cache
> in pygccxml?

The module builder cache supersedes it. The module builder cache stores the
decl tree after all loading and parsing in pygccxml plus the additional
processing in pyplusplus (optimization initialization). The decl cache can
still be used if you don't use the module builder cache, but when the module
builder cache is in use the decl cache is not really needed.

>> So now I have my run time down to around 59 seconds. Although I would
>> really like to make it go even faster, I think taking the running time
>> down from the original 12 minutes to 1 minute is a good enough
>> improvement to let me use py++ much more easily in normal development.
>
> :-)
>
>> Roman: Do you want a patch for module_builder caching, or do you just
>> want me to commit it? (All the changes are in the __init__ method and
>> are very clear, so you can see what is in there and change it if you
>> prefer.)
>
> Please commit them, but still explain the relationship :-)

Okay. I should have time to commit it later today.

>> Now on to the comments....
>>
>> Roman Yakovenko wrote:
>> > On 8/26/06, Allen Bierbaum <al...@vr...> wrote:
>> >> - Caching the results of type_traits.remove_alias improved
>> >>   performance by 20%.
>> >
>> > Yes, this algorithm is used everywhere. Please don't commit this
>> > change; I'd like to think about it. I see the following problem here:
>> > there are a lot of types, so the cache memory size could go out of
>> > control. Can you find out how much memory Py++ takes with/without
>> > the cache?
>>
>> I can try to take a look at this. I don't have an easy way to do it
>> right now, but I will try later.
>>
>> > I think I don't want to introduce a cache for the type_traits
>> > results. I don't feel comfortable with it. One of the reasons is
>> > that it is not that easy to disable.
>>
>> Maybe I am missing something, but why can't we control it by
>> introducing a flag in type_traits like "caching_enabled" and then just
>> testing that whenever the method is called?
>>
>> It would be a shame not to optimize this method when it gives a 20%
>> performance boost.
>
> You are right, I don't have a good reason. I will commit the patch. By
> the way, it was a tricky one. One of my tests failed, and it was very
> difficult (a few hours with the debugger) to find the problem. That is
> exactly the reason why I hate caches, but what can I do if you like
> them :-) ?
>
>> Are you suggesting I add this as a new method and change the code that
>> calls it?
>
> No. I just committed the results.
>
>> In my own local copy I have replaced "create_identifier" with an
>> identity function and am getting very good results.
>
> Why not put it in goodies? For the time being, please put it in goodies.
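For illustration, a goodies-style override of create_identifier might look
roughly like the sketch below. The module path (pyplusplus.code_creators.algorithm)
and the (creator, full_name) signature are assumptions made for this example,
not confirmed details of the Py++ API.

```python
# Hypothetical sketch: rebind create_identifier to an identity function.
# The module path and signature below are assumptions for illustration only.
from pyplusplus.code_creators import algorithm

def _identity_create_identifier(creator, full_name):
    # Skip the namespace-shortening work and return the name as-is.
    return full_name

# Rebind the module-level function so later callers pick up the fast path.
algorithm.create_identifier = _identity_create_identifier
```

As the next paragraph explains, a rebinding like this only works if every
caller resolves the function through the same module object.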
I had trouble putting it into goodies as an override because the method comes
in through multiple module aliases. In other words, I haven't found a way to
override all of the uses of the method. I have run into a problem like this on
another project and it is really tough to track down. It probably means there
are multiple places where the module containing create_identifier is being
imported, and the imports use slightly different names for the module. So
sys.modules ends up with multiple entries pointing to different imports of the
same code. I think future versions of Python are going to fix this problem by
requiring the full package.module.name import no matter what file you are in,
but for now this is a hard issue to track down.

>> Right now I am not quite ready to update to your changes. I have some
>> concerns about the complexity and performance of the way you implemented
>> this.
>
> Do you have the numbers? I did not read the benchmark results. I will
> do this later.

I don't have numbers. I didn't want to update and deal with the conflicts only
to find I didn't want the update. With the changes you have made now, I think
it will probably be safe for me to update. I will let you know if something
goes bad.

> You are right, I introduced one more "get attribute". I have one good
> reason for this: maintainability. I'd like to keep all "cached" values in
> a single place. Thus, I don't have to scan the sources for "what am I
> caching", and moreover it is very easy to document. I am sure you will
> not see this "get attribute" in the benchmark.
>
>> - Why is there an "algorithms_cache_t" object as the base class of
>>   "declaration_algs_cache_t"? The split with a base class doesn't seem
>>   to serve much of a purpose.
>
> Enable/disable functionality. But I think it is useless now. The better
> idea is to introduce a type_algs_cache_t class with a class variable that
> controls all instances of type_algs_cache_t.
>
>> - What is the performance implication of using "properties"? I know you
>>   like to use them in your code, but how do they affect performance?
>
> I tried with and without them and I did not see a difference. Maybe you
> will. If you do see a difference, we can keep the set_* methods and
> remove the properties.
>
>> - Handling of the enabled flag. I think the handling of the enabled flag
>>   should be done in the "set" method instead of the "get" method. As it
>>   stands with your change, our optimized path requires two if tests (one
>>   in the local method to test for None, and one in the get method to test
>>   enabled). If we moved the enabled test to the set method, we would only
>>   pay the cost of that check when optimizations are disabled.
>
> You are right. Fixed.
>
>> > I left an interesting problem for you: it is possible to optimize the
>> > declaration_path algorithm even more. In most cases its complexity
>> > could be O(1). This could be done by saving intermediate results.
>> > Another way to say it: re-use the parent declaration path cache.
>
> What about this?

I haven't looked at this at all. I am not sure I understand what is going on
here well enough to do it right, and it isn't showing up high in any of my
traces anymore, so it really isn't a high priority for me.

-Allen
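To make the declaration_path idea above concrete, a rough sketch of the O(1)
approach follows: build a declaration's path by extending the parent's
already-cached path instead of walking to the root every time. The attribute
names used here (parent, name, cache.declaration_path) are illustrative
assumptions, not the actual pygccxml implementation.

```python
# Sketch: compute a declaration's qualified path by reusing the parent's
# cached path. Assumes each declaration carries a cache object whose
# declaration_path attribute starts out as None.
def declaration_path(decl):
    if decl is None:
        return []
    cached = decl.cache.declaration_path
    if cached is not None:
        return cached
    # Only uncached ancestors are visited, so once the tree is warm most
    # calls touch a single node and return in constant time.
    path = declaration_path(decl.parent) + [decl.name]
    decl.cache.declaration_path = path
    return path
```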