Thread: [pygccxml-development] Performance problems and fixes
From: Allen B. <al...@vr...> - 2006-08-25 03:14:05
Today I ran my generation script through the python hotspot profiler and found some interesting results.

The majority of the time was being spent in algorithm.declaration_path. I fixed this up and doubled the speed of my script. :)

After that I decided it may be worth investigating some other spots. What I have found is that the following areas are consuming a lot of time. (Not sure if it can be fixed or not, but this is where the time is spent.)

(all of these are in pygccxml)
- Getting names from declarations
- Testing declarations and calldefs for equality
- Getting the decl_string (and creating it) for cpptypes.

declarations.py _get_name line 151
class_declarations.py _get_name_impl line 105
declaration.py _eq_ line 119
calldef.py _eq_ line 121 and 53
cpptypes.py _get_decl_string line 37
algorithm.py full_name line 30

Once I get into the office tomorrow I will upload my hotspot trace so other people can see what I ran into and hopefully find some other places to improve the performance of py++. :)

-Allen
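For readers who want to take a similar measurement today: the profiler mentioned above is presumably Python 2's hotshot module, which is not available in Python 3. The sketch below swaps in the standard cProfile and pstats modules instead. The function generate_bindings is a hypothetical placeholder for a real generation script and is not part of pygccxml or Py++.

```python
import cProfile
import io
import pstats


def generate_bindings():
    """Hypothetical stand-in for a real Py++ generation script."""
    names = ["::ns::class_%d" % i for i in range(20000)]
    return sorted(names)


def profile_report(func, limit=10):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_report(generate_bindings)
    print(report)
```

Sorting by cumulative time tends to surface helpers such as a path or name lookup that are cheap per call but are invoked very often.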
From: Roman Y. <rom...@gm...> - 2006-08-25 17:30:44
On 8/25/06, Allen Bierbaum <al...@vr...> wrote:
> Today I ran my generation script through the python hotspot profiler and
> found some interesting results.
>
> The majority of the time was being spent in algorithm.declaration_path.
> I fixed this up and doubled the speed of my script. :)

This is good news!

> After that I decided it may be worth investigating some other spots.

Before you do this, there is some work left on the previous patch :-(

> What I have found is that the following areas are consuming a lot of
> time. (not sure if it can be fixed or not, but this is where the time
> is spent)
>
> (all of these are in pygccxml)
> - Getting names from declarations
> - Testing declarations and calldefs for equality
> - Getting the decl_string (and creating it) for cpptypes.
>
> declarations.py _get_name line 151
> class_declarations.py _get_name_impl line 105
> declaration.py _eq_ line 119
> calldef.py _eq_ line 121 and 53
> cpptypes.py _get_decl_string line 37
> algorithm.py full_name line 30
>
> Once I get into the office tomorrow I will upload my hotspot trace so
> other people can see what I ran into and hopefully find some other
> places to improve the performance of py++. :)

The issue that is left: consider the following use case (a real one).

I am not able to run gccxml on the boost.date_time library on Windows, so I run gccxml on Linux and then use the generated XML file on Windows. The only issue I have is this: there is a template class that is instantiated with different values, so on Windows I give the class a new name. Another issue is a free function template that is parameterized on its return type:

template< class X>
X do_smth(int, double);

int do_smth( int );

By default the generated code will not compile. The user will have to rename the instantiated function to "do_smth<xyz>".

You can take a look at the pyboost_dev/dev/date_time/generate_code.py script.

The point is that if the user renames a declaration, the cached value you introduced should be reset to None. In the case of a namespace or a class, the cached values of the declarations inside it should be reset too. If you don't do this, Py++ will generate bad code and it will take a lot of time to track down the error.

The work that still should be done:
1. Add a new variable to the declaration_t class.
2. Reset it and the internal values on a name change.
3. Write a unit test (pygccxml).

Please do this before you introduce any other changes. Thank you.

-- 
Roman Yakovenko
C++ Python language binding
http://www.language-binding.net/
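The reset-on-rename requirement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not pygccxml's actual code: the class Declaration, the attribute _cached_path and the method _reset_cache are hypothetical names chosen for the example.

```python
class Declaration:
    """Hypothetical, simplified stand-in for pygccxml's declaration_t."""

    def __init__(self, name, parent=None):
        self._name = name
        self.parent = parent
        self.children = []
        self._cached_path = None  # hypothetical cache slot
        if parent is not None:
            parent.children.append(self)

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, new_name):
        self._name = new_name
        self._reset_cache()

    def _reset_cache(self):
        # A rename changes the path of this declaration and of everything
        # nested inside it, so the reset has to be recursive.
        self._cached_path = None
        for child in self.children:
            child._reset_cache()

    def declaration_path(self):
        # Compute the path once and reuse it until a rename clears it.
        if self._cached_path is None:
            prefix = self.parent.declaration_path() if self.parent else []
            self._cached_path = prefix + [self._name]
        return self._cached_path


ns = Declaration("date_time")
cls = Declaration("period<int>", parent=ns)
assert cls.declaration_path() == ["date_time", "period<int>"]

ns.name = "dt"  # renaming the namespace must also invalidate cls
assert cls.declaration_path() == ["dt", "period<int>"]
```

The two assertions at the end double as the kind of unit test asked for in step 3: without the recursive reset, the second one would still see the stale "date_time" prefix.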
From: Allen B. <al...@vr...> - 2006-08-25 18:26:26
Roman Yakovenko wrote:
> The issue that is left: consider the following use case (a real one).
>
> I am not able to run gccxml on the boost.date_time library on Windows,
> so I run gccxml on Linux and then use the generated XML file on Windows.
> The only issue I have is this: there is a template class that is
> instantiated with different values, so on Windows I give the class a
> new name. Another issue is a free function template that is
> parameterized on its return type:
>
> template< class X>
> X do_smth(int, double);
>
> int do_smth( int );
>
> By default the generated code will not compile. The user will have to
> rename the instantiated function to "do_smth<xyz>".
>
> You can take a look at the pyboost_dev/dev/date_time/generate_code.py
> script.
>
> The point is that if the user renames a declaration, the cached value
> you introduced should be reset to None. In the case of a namespace or a
> class, the cached values of the declarations inside it should be reset
> too. If you don't do this, Py++ will generate bad code and it will take
> a lot of time to track down the error.
>
> The work that still should be done:
> 1. Add a new variable to the declaration_t class.
> 2. Reset it and the internal values on a name change.
> 3. Write a unit test (pygccxml).
>
> Please do this before you introduce any other changes. Thank you.

It sounds like you understand these issues much better than I do. I don't even think I can replicate the problem you describe, since I don't understand it. If you can check in a test case that causes the problem, I could look at it, but otherwise I think I am going to need you to make these changes.

-Allen
From: Roman Y. <rom...@gm...> - 2006-08-25 18:13:40
On 8/25/06, Allen Bierbaum <al...@vr...> wrote:
> (all of these are in pygccxml)
> - Getting names from declarations

Please remove comments you added to the declaration_t class. We don't really need them. Thank you.

-- 
Roman Yakovenko
C++ Python language binding
http://www.language-binding.net/
From: Allen B. <al...@vr...> - 2006-08-25 18:30:00
Roman Yakovenko wrote:
> On 8/25/06, Allen Bierbaum <al...@vr...> wrote:
>
>> (all of these are in pygccxml)
>> - Getting names from declarations
>
> Please remove comments you added to the declaration_t class. We don't
> really need them.

I was hoping to get some feedback on that commented-out code in declaration.py. See the diff:

http://svn.sourceforge.net/viewvc/pygccxml/pygccxml_dev/pygccxml/declarations/declaration.py?r1=466&r2=465&pathrev=466

Can you explain what makes the commented-out code fail, and is there anything that can be done to fix it up?

Also, why is there a _get_name_impl method at all? Why isn't it just named _get_name? It looked to me last night like a polymorphic _get_name method should work fine.

-Allen
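One possible reason for the extra method, offered as an assumption and not as the authors' stated rationale: if name is exposed through property(_get_name), the property keeps a reference to the base-class function, so overriding _get_name in a subclass does not change what the property calls. Routing the getter through a second method restores polymorphism. The class names below are hypothetical and are not taken from pygccxml.

```python
class DirectBase:
    def _get_name(self):
        return "base"

    # property() captures this exact function object at class creation.
    name = property(_get_name)


class DirectChild(DirectBase):
    def _get_name(self):  # never reached through the inherited property
        return "child"


class HookBase:
    def _get_name_impl(self):
        return "base"

    def _get_name(self):
        # Looked up on self at call time, so subclasses can override it.
        return self._get_name_impl()

    name = property(_get_name)


class HookChild(HookBase):
    def _get_name_impl(self):
        return "child"


print(DirectChild().name)  # base
print(HookChild().name)    # child
```

If that is the situation in declaration.py, a plain polymorphic _get_name would only work if every subclass also redefined the name property.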