From: Braden M. <br...@en...> - 2008-09-27 06:20:27
The occurrence of a few particularly large source files in the OpenVRML codebase has been a point of annoyance for several (potential) users over the years.

My inclination toward files of such magnitude (~20k lines) had to do with the fact that the only way really to hide a symbol in C++ was to put it in an unnamed namespace, making it local to a particular translation unit. So code that needs to share implementation code all needs to share the same file in order to keep the non-public symbols properly hidden. Inevitably, as code grows and matures under such a scheme, commonalities push related implementation code together. And sometimes there can be a *lot* of such related code.

There are a couple of significant downsides to large translation units:

* They parallelize poorly. While a single-processor machine might process more code in fewer files somewhat faster, parallel builds process multiple smaller files much more efficiently.
* They consume a lot of memory when compiling. gcc's memory demands seem to have gone up significantly in recent years, making this even more of a problem than it once was.

In the last few gcc releases, "symbol visibility" attributes have been introduced. These allow library authors to inform the linker specifically whether a symbol should be publicly exposed, rather than relying on details that may or may not be implied by particular language features. Using these attributes, it's no longer necessary to bury implementation details in unnamed namespaces (or similar) to keep them hidden in the compiled binary.

So I've been taking advantage of this feature to attack the problem. The most egregious offender, vrml97node.cpp, was broken up some months ago before I started converting openvrml-xembed to use D-Bus. More recently I've broken up the other node implementation files and put a significant dent in the second worst offender, browser.cpp.
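As a rough illustration of the visibility attributes described above, here is a minimal sketch, assuming gcc 4 or later. The macro names EXAMPLE_API and EXAMPLE_LOCAL, the namespace example, and both functions are hypothetical and are not taken from the OpenVRML sources; the real macro definitions may well differ. Everything is shown in one block for brevity, with comments marking where each piece would live in a multi-file library.

```cpp
// Hypothetical visibility macros (assumption: gcc >= 4; other compilers
// get empty definitions and fall back to their default behavior).
#if defined(__GNUC__) && __GNUC__ >= 4
#   define EXAMPLE_API   __attribute__((visibility("default")))
#   define EXAMPLE_LOCAL __attribute__((visibility("hidden")))
#else
#   define EXAMPLE_API
#   define EXAMPLE_LOCAL
#endif

// --- would live in a private header that is NOT installed ---
namespace example {
    namespace local {
        // Usable from any translation unit inside the library, but
        // hidden from the shared library's exported symbol table.
        EXAMPLE_LOCAL int clamp_to_byte(int value);
    }
}

// --- would live in an installed public header ---
namespace example {
    // Exported: part of the library's public interface.
    EXAMPLE_API int brightness(int raw);
}

// --- would live in one implementation file ---
int example::local::clamp_to_byte(int value)
{
    if (value < 0)   { return 0; }
    if (value > 255) { return 255; }
    return value;
}

// --- would live in a different implementation file ---
int example::brightness(int raw)
{
    // Calls the hidden helper across translation units; no unnamed
    // namespace is needed to keep the helper out of the public ABI.
    return local::clamp_to_byte(raw * 2);
}
```

With gcc, building with -fvisibility=hidden makes hidden the default, so only symbols explicitly marked with the "default" attribute are exported.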
I've added a namespace openvrml::local as a place to put things that will have hidden symbols (i.e., the OPENVRML_LOCAL macro is applied), yet need to be part of more than one translation unit. Note, though, that the headers associated with this namespace *do not get installed*. That means that no public headers are allowed to include them.

So far I've pared browser.cpp down to a little more than 6000 lines. Compiling on my x86_64 Linux machine, its high-water mark in memory is around 1.3 GB. If that sounds big, consider that before this surgery it was taking at least 2.2 GB to compile. (And recall that for a 32-bit platform, you can expect to cut this memory footprint roughly in half.)

I suspect that the better part of that 1.3 GB footprint has to do with the fact that two big Spirit parsers get instantiated in browser.cpp. Pushing these instantiations out to different translation units would probably be a significant win; and that's something I'll probably pursue before releasing 0.18.

--
Braden McDaniel
e-mail: <br...@en...>
<http://endoframe.com>
Jabber: <br...@ja...>