Thread: [Doxygen-develop] Doxygen taking longer to run? (long msg)
Brought to you by:
dimitri
From: Edmund G. <ed...@gr...> - 2003-08-10 16:51:55
|
Doxygen seems to have gotten quite a lot slower in the more recent versions than it used to - with one of our projects it now takes 3 times as long to document it as to compile it! I'm considering looking in to this, and am posting this to see if I'm heading in the right direction, and if anybody else has already looked at the problem. First of all I've tried downloading an older version of doxygen to check whether it really has got that much slower or not. Both doxygens were configured --english-only and --release, and compiled with gcc3.2.2. doxygen-1.2.10 (./configure --english-only --release) small project big project real 0m55.423s real 10m5.203s user 0m26.060s user 4m40.320s sys 0m3.010s sys 0m33.020s latest doxygen 1.3.3-20030808 (./configure --english-only --release) small project big project real 1m8.084s real 136m3.918s user 0m33.610s user 66m38.160s sys 0m3.710s sys 0m39.490s approximate ratios of (user+sys) time: small project big project 1.3 times longer 13 times longer note that they only got about 50% of the CPU as I'm profiling doxygen on the large project in the background, so the "real" time is not that meaningful. As a (very) rough guide to the sizes of these project, the small one is outputting 363 html files (without verbatim header), the big project 1843 html files (with verbatim headers). Both projects are using nested structs and namespaces everywhere, and the bigger one is also using a lot of templates and typedefs. Both projects are also using dot for heirarchy and collaboration charts. I've profiled doxygen on the small project. | aside : I did this by editing tmake/lib/linux-g++/tmake.cong to set | the following and then configuring/building as --debug | | TMAKE_CFLAGS_DEBUG = -g -fprofile-arcs -ftest-coverage -pg | TMAKE_LFLAGS_DEBUG = -g -pg | is this the easiest way to do it? and a sensible set of options? The relevant looking parts of the output from gprof ("..." where I have cut uninteresting looking bits) are: index % time self children called name <spontaneous> [1] 30.8 0.00 15.24 main [1] 0.00 11.45 1/1 generateOutput() [2] 0.00 3.78 1/1 parseInput() [18] ... ----------------------------------------------- 0.00 11.45 1/1 main [1] [2] 23.1 0.00 11.45 1 generateOutput() [2] 0.00 7.75 1/1 generateNamespaceDocs() [9] 0.00 3.15 1/1 generateFileDocs() [20] 0.00 0.49 1/1 generateClassDocs() [59] ... ----------------------------------------------- <spontaneous> [3] 22.4 11.07 0.00 __fetch_gcov_type [3] ----------------------------------------------- 0.00 0.00 9/30685 codeYYlex() [307] 0.00 0.01 18/30685 generateClassOrGlobalLink(BaseCodeDocInterface&, char*, bool) [478] 0.00 0.11 320/30685 isVarWithConstructor(Entry*) [132] 0.00 0.41 1152/30685 getDefs(QCString const&, QCString const&, char const*, MemberDef*&, ClassDef*&, FileDef*&, NamespaceDef*&, GroupDef*&, bool, FileDef*, bool) [61] 0.00 0.81 2269/30685 findClassRelation(Entry*, ClassDef*, BaseInfo*, QDict<int>*, FindBaseClassRelation_Mode, bool) <cycle 1> [29] 0.00 1.05 2925/30685 findClassWithinClassContext(ClassDef*, QCString const&) [37] 0.00 8.59 23992/30685 linkifyText(TextGeneratorIntf const&, Definition*, char const*, char const*, bool, bool) [7] [4] 22.2 0.00 10.98 30685 getResolvedClass(Definition*, char const*, bool*, QCString*) [4] 0.13 10.82 30685/30685 getResolvedClassRecursive(Definition*, char const*, bool*, QCString*) [5] 0.00 0.03 30685/30687 QDict<Definition>::clear() [256] ----------------------------------------------- 112827 getResolvedClassRecursive(Definition*, char const*, bool*, QCString*) [5] 0.13 10.82 30685/30685 getResolvedClass(Definition*, char const*, bool*, QCString*) [4] [5] 22.1 0.13 10.82 30685+112827 getResolvedClassRecursive(Definition*, char const*, bool*, QCString*) [5] 0.48 7.42 461962/462331 resolveTypeDef(Definition*, QCString const&, Definition**) [8] 0.02 1.35 902261/1038104 QString::QString[in-charge](QArray<char> const&) [30] 0.01 0.43 462255/535733 SDict<ClassDef>::find(char const*) [58] 0.01 0.36 676607/2105537 QDict<Definition>::find(QString const&) const [36] ----------------------------------------------- <spontaneous> [6] 21.2 10.50 0.00 __store_gcov_type [6] ----------------------------------------------- ... [7] 19.2 0.00 9.50 3487 linkifyText(TextGeneratorIntf const&, Definition*, char const*, char const*, bool, bool) [7] 0.00 8.59 23992/30685 getResolvedClass(Definition*, char const*, bool*, QCString*) [4] ... ----------------------------------------------- 1493786 resolveTypeDef(Definition*, QCString const&, Definition**) [8] 0.00 0.01 369/462331 findUsedClassesForClass(Entry*, ClassDef*, ClassDef*, bool, ArgumentList*, QDict<int>*) <cycle 1> [78] 0.48 7.42 461962/462331 getResolvedClassRecursive(Definition*, char const*, bool*, QCString*) [5] [8] 16.0 0.48 7.43 462331+1493786 resolveTypeDef(Definition*, QCString const&, Definition**) [8] 0.11 2.47 2633099/2633099 SDict<MemberName>::find(char const*) [23] 0.02 1.38 1428683/1428930 NamespaceDef::findInnerCompound(char const*) [34] 0.34 0.39 1764027/1777806 getScopeFragment(QCString const&, int, int*) [40] ... (I think my mail program will wrap the above at 80 columns, so apologies if it is mangled somewhat) My interpretation of this is that of the 15.2 seconds spent exectuting doxygen code, 10.8 of those seconds were spent in getResolvedClassRecursive(Definition*, char const*, bool*, QCString*) Of those spent in getResolvedClassRecursive, 7.4 were then spent in resolveTypeDef(Definition*, QCString const&, Definition**) [8] This is the first time I've delved in to the doxygen code, and so am not very familier with any of its data structures yet. Can Dmitri confirm that I'm understanding the overal functionality of these two functions so far, based on this pseodo code/english description (ignoring for now lots of technicalities handled in the code) /* * find the fully qualified class name refered to by the input class * or typedef name against the input scope. * loops through scope and each of its parent scopes looking for a * match against the input name. Also recursivly calls itself to check * against any imported namespaces in each scope being checked. */ ClassDef *getResolvedClassRecursive( Definition *scope, QCString name, ... ) { do { check for typedef with call to resolveTypeDef with name against scope if name was a typedef { if result is itself a typedef { repeatedly call resolveTypeDef with result against scope, until a non-typedef result is found } return result. } else name was not a typedef { try ${scope}::${name} and return that if found. otherwise { check name against all classes imported with a "using CLASSNAME", and return if found for each namespace imported in this scope with a "using namespace", recursivly try getResolvedClassRecursive with the used namespace, returning result if found } } scope = scope->getOuterScope() } while(scope) no match found, return 0 } /* * loop through context and each of its parent contexts looking for a match against qualified name */ QCString resolveTypeDef(Definition *context, const QCString &qualifiedName, Definition **typedefContext) { resName = qualifiedName stripped of any explicit scope prefixes MemberDef *md = 0; while(context && !md) { // step 1: get the right scope setup resScope to be the right scope from context, and any explicit scope in qualifiedName // step 2: get the member lookup resName in resScope in appropriate dictionary, putting result in md context=context->getOuterScope() } if md, then return mf->typeString() otherwise return 0; } Questions/suggestions so far: If "resolveTypeDef" is checking against the input context and all of its parent contexts, then the entire check for a typedef in getResolvedClassRecursive can be taken outside of the "do{ } while(scope)" loop. I've tried this change and re-run doxygen on the small project - in this case it reduced the time spent in resolveTypeDef by over 50%, an alledged 25% improvement on the total execution time spent in doxygen. Is this really a safe change? There are no changes that I can see by diff'ing the output with and without the change for the small project (after filtering out the "Generated on" lines, obviously), but that project is not using many typedefs. Bits I don't understand : I'm still somewhat lost in trying to understand the "resolveTypeDef" function. "step 1: get the right scope" is doing something far more complicated than merely taking the current scope and appending any explicitly specified scope from the qualifiedName to it. Why is this necessary, or why is the code for resolving typedefs so different from that for resolving classes? - I guess maybe I should start again and look at doxygen from the beginnning rather than just diving straight in like this though. why isn't there a "goto found;" at util.cpp:672 instead of a break after successfully matching against an imported class? otherwise can't util.cpp:686 still get exectuted and the result lost? or have I missed something here. Edmund Green. |
From: Edmund G. <ed...@gr...> - 2003-08-11 12:06:15
Attachments:
minorSpeedIncreasePatch.txt
|
Attached is my suggested patch for eliminating a few redundant calls to resolveTypeDef I've also been thinking a little more about the speed problem, and if I've understood it correctly then changing getResolveClassRecursive from iterating through all the visible namespaces searching for matching names, to doing a global lookup of all matching local names (i.e. RHS of the name, against RHS of all full names) and then searching within the list of matching names for the best matching namespace could be a lot faster in the typical case where there is only one instance of the local name. It would also remove the need for recursion. This would involve adding a global map of lists of fully qualified names indexed by local-name - e.g. some pseodo code for getting a matching class name with this mechanism (ignoring typedefs, and other technicalities) ClassDef *getResolvedClass( Definition *scope, QCString name, ... ) { String localName = name with any namespace prefix stripped off; String explicitNs = any namespace prefix from the name; List<ClassDef*> *possibleMatches = globalMapOfNames.find(localName); ClassDef* bestMatch = 0; for possibleName in possibleMatches { if possibleName is better match than bestMatch for name from scope { bestMatch = possibleName; } } return bestMatch; } Admittedely the test inside the loop is slightly awkard when taking in to account "using" namespaces, but it would seem to mostly be a case of matching the left hand side of the possible name against the left-most portion of the scope, and the right hand side against the name, and seeing if any remaining unaccounted for bits in possibleName can be made up from the available imported namespaces. I'd expect this to be a lot faster as possibleMatches would typically only have a single entry for most C++ and Java code - though there are a few cases, such as defining call-back interfaces as inner-classes/structs all with the same local name, which would then give longer lists to linearly search through which would also be expected to grow in length approximately linearly with the size of the project (Looking at one of our projects I only found a couple of duplicated class names out of several hundred classes - with the worst case being used in 13 different scopes) Edmund Green. http://www.greenius.ltd.uk/ |
From: Dimitri v. H. <di...@st...> - 2003-08-11 13:18:26
|
On Mon, Aug 11, 2003 at 12:52:53PM +0100, Edmund Green wrote: > Attached is my suggested patch for eliminating a few redundant calls to > resolveTypeDef Thanks for looking into this. As you have noticed, the resolution of typedefs and classes has evolved into something quite complex, and as you demonstrated that it is/can be a bottleneck, I agree it is time to rethink the design. > I've also been thinking a little more about the speed problem, and if > I've understood it correctly then changing getResolveClassRecursive from > iterating through all the visible namespaces searching for matching > names, to doing a global lookup of all matching local names (i.e. RHS of > the name, against RHS of all full names) and then searching within the > list of matching names for the best matching namespace could be a lot > faster in the typical case where there is only one instance of the local > name. It would also remove the need for recursion. > > This would involve adding a global map of lists of fully qualified names > indexed by local-name - e.g. some pseodo code for getting a matching > class name with this mechanism (ignoring typedefs, and other technicalities) > > ClassDef *getResolvedClass( > Definition *scope, > QCString name, > ... > ) > { > String localName = name with any namespace prefix stripped off; > String explicitNs = any namespace prefix from the name; > > List<ClassDef*> *possibleMatches = globalMapOfNames.find(localName); > > ClassDef* bestMatch = 0; > for possibleName in possibleMatches > { > if possibleName is better match than bestMatch for name from scope > { > bestMatch = possibleName; > } > } > return bestMatch; > } > > Admittedely the test inside the loop is slightly awkard when taking in > to account "using" namespaces, but it would seem to mostly be a case of > matching the left hand side of the possible name against the left-most > portion of the scope, and the right hand side against the name, and > seeing if any remaining unaccounted for bits in possibleName can be made > up from the available imported namespaces. > > I'd expect this to be a lot faster as possibleMatches would typically > only have a single entry for most C++ and Java code - though there are a > few cases, such as defining call-back interfaces as > inner-classes/structs all with the same local name, which would then > give longer lists to linearly search through which would also be > expected to grow in length approximately linearly with the size of the > project (Looking at one of our projects I only found a couple of > duplicated class names out of several hundred classes - with the worst > case being used in 13 different scopes) I do something like this for members now. I have two maps, one for file/namespace members and one for class members. What you suggest sounds like a generalization of this, making it one map for all symbols, which could indeed make a lot of things simpler in doxygen. The downside is that it is a lot of work, so we should find a way to introduce this gradually. Regards, Dimitri |
From: Edmund G. <ed...@gr...> - 2003-08-26 10:14:50
Attachments:
namespaceResolveProblem2.cpp
|
re: problems with "using" and resolving typedefs and classes > The new release I just committed to CVS should resolve the above problems. > Please verify this and let me know of any other problems. It (20030824 from CVS) seems to work O.K on linkifying the annotated source code for my new test-case (attached), with options: EXTRACT_ALL = YES EXTRACT_PRIVATE = YES EXTRACT_STATIC = YES SOURCE_BROWSER = YES GENERATE_LATEX = NO However there are still a few quirks elsewhere: 1) in "classTest1_1_1Test1Class.html", the parameter types for "Test1Class::testGlobalUsingNamespace" and "Test1Class::testGlobalUsingType" are not hyperlinked (even though they are in "namespaceResolveProblem2_8cpp-source.html") 2) Each class and typdef used to have a "referenced by" lists with their detailed documentation - this seems to have disappeared. The only other possible problem, which may be more of a feature rather than a bug is: 3) because typedefs are linked back to the root class which they refer to (which is probably what users want most of the time), typedefs to system/STL classes are not hyperlinked (line 102 of the annotated source) - I assume that I can work around this by forward declaring and documenting the relevant STL classes if needed. Edmund Green. Greenius Ltd. http://www.greenius.ltd.uk/ |