Re: [pygccxml-development] Re: recent changes to pygccxml\pyplusplus
From: Matthias B. <ba...@ir...> - 2006-03-05 18:41:53
Roman Yakovenko wrote:
>> first needs to be resolved before I can finish the class (so is
>> source_reader_t.__parse_gccxml_created_file() supposed to return the
>> files as a dictionary or as a list?)
>
> May be I missed something, but what cache do you implement?
> Any way the answer: list, I already fixed this bug.

I've implemented a new cache class because I had some issues with the
file_cache_t class:

- The cache file is about 39MB and on a machine with 512MB main memory
  I couldn't use that cache anymore. The machine started memory
  thrashing while parsing the headers (when the cache was already
  there) and CPU usage dropped to around 1%.

- When I was creating wrappers for only a few selected classes, using
  the cache took much more time than without cache (because the cache
  always loads all cached declarations, no matter if they are required
  or not).

That's why I had a look at the caching mechanism and implemented my own
class that fixed the above issues. Instead of one single cache file the
cache now uses a directory and stores individual files (one file per
header).

Here is a comparison of the time it takes to do the parse() step for
222 headers from the Maya SDK (the table is best viewed with a fixed
pitch font):

                      | Parsing time | Cache size | Parameters
                      | (min:sec)    | (MB)       |
----------------------+--------------+------------+--------------
Without cache         | 2:53         | -          |
                      |              |            |
File cache (initial)  | 4:12         | 39.1       |
File cache (cached)   | 1:58         | 39.1       |
                      |              |            |
Dir cache (initial)   | 3:40         | 38.4       | -compression
Dir cache (cached)    | 0:34         | 38.4       | -compression
Dir cache (initial)   | 4:03         | 11.8       | +compression
Dir cache (cached)    | 2:18         | 11.8       | +compression
----------------------+--------------+------------+--------------

The "initial" rows refer to the cases when the cache didn't exist yet
and had to be built. But of course, this only has to be done once, so
the "cached" rows are the more important ones.
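
As a rough illustration of the one-file-per-header idea, here is a
minimal sketch in present-day Python. It is not the actual
directory_cache.py: the class name toy_dir_cache_t, its simplified
update()/cached_value() signatures and the header names are made up for
this example, and checking whether a cached entry is still up to date is
left out entirely.

```python
import hashlib
import os
import pickle
import tempfile
import zlib


class toy_dir_cache_t:
    """Hypothetical per-header cache; NOT the real directory_cache_t."""

    def __init__(self, directory, compression=False):
        self.directory = directory
        self.compression = compression
        os.makedirs(directory, exist_ok=True)

    def _entry_path(self, header):
        # One cache file per header, named after a hash of the header path.
        full_name = os.path.abspath(header).encode("utf-8")
        key = hashlib.sha1(full_name).hexdigest()
        return os.path.join(self.directory, key + ".cache")

    def update(self, header, declarations):
        # Write (or overwrite) the single entry that belongs to this header.
        data = pickle.dumps(declarations, pickle.HIGHEST_PROTOCOL)
        if self.compression:
            data = zlib.compress(data)
        with open(self._entry_path(header), "wb") as f:
            f.write(data)

    def cached_value(self, header):
        # Only the entry for the requested header is read from disk;
        # entries for other headers are never opened.
        path = self._entry_path(header)
        if not os.path.exists(path):
            return None
        with open(path, "rb") as f:
            data = f.read()
        if self.compression:
            data = zlib.decompress(data)
        return pickle.loads(data)


def demo():
    # Placeholder header names and declaration lists, for illustration only.
    with tempfile.TemporaryDirectory() as tmp:
        cache = toy_dir_cache_t(tmp, compression=True)
        cache.update("MVector.h", ["MVector"])
        cache.update("MPoint.h", ["MPoint"])
        hit = cache.cached_value("MPoint.h")
        miss = cache.cached_value("MColor.h")
        n_files = len(os.listdir(tmp))
    return hit, miss, n_files
```

Since a lookup opens only the entry of the requested header, the cost of
using the cache depends on the headers actually parsed and not on how
many headers have been cached in total.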

The directory cache has an option to compress the cache files, which
was used in the last two rows (so in my case, compression isn't really
useful). Memory usage of the directory cache is much lower, so I could
also use that cache on the machine with "only" 512MB main memory.
There's also no disadvantage anymore when only a few headers are parsed
while the cache actually contains a lot more headers. Cached
declarations that are not requested by the main program are never
touched.

Roman, is it ok if I commit that cache into the pygccxml directory? The
implementation consists of one single file "directory_cache.py" which I
would put into the pygccxml.parser directory. The new class is called
"directory_cache_t" (because the user has to specify a directory name
instead of a file name). There are also a few internal helper classes,
but those aren't meant to be instantiated by the user. I was using your
naming conventions as far as I could figure them out (lower case
class/method names with underscores between words, and the classes have
a "_t" suffix; private methods have a leading underscore). Doc strings
are available. I haven't modified any other file, so everything will
still work as before (and the file cache is still the default). To
activate the directory cache the user currently has to instantiate the
class himself and pass this instance to the parse() method. I can also
email you the file first if you want to have a look at it before it is
actually added to the repository.

There's still one more question I have regarding the
source_reader_t.__parse_gccxml_created_file() method. What is the exact
meaning of the returned file list? The update() method of a cache class
receives this list as "included_files" argument, so one might think the
list only contains the files that were included from the corresponding
header file. But I noticed that the list also contains the header file
itself.
Is this intentional (and can I rely on that behavior) or is this a bug
and the header file does not belong in this list?

> 2. I updated setup file before release, when I have something like
> "feature freeze" period.
> I suppose, that every one who use CVS will be able to use it
> without setup. I could be
> wrong, but right now I prefer to concentrate my attention on something else

Well, it's only two lines that need to be added to setup_pyplusplus.py:

    , 'pyplusplus.decl_wrappers'
    , 'pyplusplus.module_builder'

If you want I can commit that change back into the repository myself.

>> Well, you could fill in some extra words (such as class_wrapper_t) but
>> as Python already organizes code in a hierarchy I guess it would be a
>> better idea to use decl_wrapper.class_t explicitly instead of importing
>> the classes into the namespace of another module. Then it's also clear
>> to the reader which class is being referred to.
>
> So, basically you would like to stay with name I proposed, but user
> is forced to use "fully qualified" names, am I right?

No. I don't want the user having to deal with all those classes anyway,
so the user should only need *one* such class (of each kind) which he
might import into his namespace if he wishes to. My suggestion to use
the fully qualified names only refers to the internal implementation
(of course, it is entirely up to you which conventions you want to use
as you're the maintainer of the package. I was just thinking that it
might prevent confusion among those people who also want to have a look
at the sources of pyplusplus... :)

>> In the Maya SDK there are a couple of related classes that basically
>> have the same interface (e.g. vector, float vector, point, float point,
>> color, then the same thing for array versions etc). When decorating
>> those classes I can treat every class of such a group the same and apply
>> the same operations. This is where being able to select stuff from
>> several classes at once can be quite handy.
>
> May I give you a small advice? You can combine between power of pyplusplus and
> power of C++. I think that using creating single template for every
> group is better solution.

I'm not sure if I understand what you mean. The above classes aren't
implemented by myself, they are part of the Maya SDK and, of course,
I'm not in the position to change that SDK.

- Matthias -