From: John Whaley <jwhaley@st...> - 2004-09-29 08:31:42
I have a few possible performance updates to buddy that I can check in.
- Optimized cache entry structure. The current version of buddy wastes 1 to
2 words of space on every cache entry. I have a simple patch that fixes
this problem, improving performance by ~5% on programs that make use of the
cache, while also reducing memory usage. It will also allocate the caches
on demand, so operation caches will not be allocated until they are actually
used.
- Hardware prefetch directives. BDD operations are very memory-bound:
profiling the BDD library with VTune shows that over 80% of the time is
spent on L2 cache
misses. Although there are many true dependencies in the code, some amount
of prefetching is possible and beneficial (~5% performance improvement).
The prefetch calls do not change the semantics of the code and are ignored
on architectures that do not support them.
- Specialized code for apply operators. Right now, there is one function
(apply_rec) for all apply operators. This means it must do a switch on the
operator at every node. By splitting operations into separate functions,
performance is improved due to tighter code and more optimization
opportunities. Also, using separate caches for different operations reduces
the cache entry size by a word, speeds up hash computations, improves cache
hit rates and reduces collisions.
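To make the three items above a bit more concrete, here is a rough sketch in
C. It is not the actual buddy code or my patch: the names (GenericEntry,
AndEntry, AndCache, and_cache_*), the hash constants and the -1 "empty"
marker are all made up for illustration. It shows a per-operator entry that
drops the operator word, a table that is only allocated on the first store,
and a prefetch hint that compiles to nothing where it is unsupported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layouts for illustration; not buddy's real structures. */
typedef struct { int a, b, op, res; } GenericEntry; /* shared cache: stores op */
typedef struct { int a, b, res; } AndEntry;         /* AND-only cache: op implied */

typedef struct {
    AndEntry *table; /* NULL until the first store */
    size_t size;     /* number of entries, must be a power of two */
} AndCache;

/* Prefetch hint: a no-op on compilers without the builtin. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Slot index; the operator is not part of the hash (arbitrary constants). */
static size_t and_slot(const AndCache *c, int a, int b) {
    return ((size_t)a * 12582917u + (size_t)b * 4256249u) & (c->size - 1);
}

/* Hint that the slot for (a, b) will be read soon. Never allocates. */
static void and_cache_prefetch(const AndCache *c, int a, int b) {
    if (c->table != NULL)
        PREFETCH(&c->table[and_slot(c, a, b)]);
}

/* Returns 1 and sets *res on a hit; an unallocated cache is always a miss. */
static int and_cache_lookup(const AndCache *c, int a, int b, int *res) {
    if (c->table == NULL)
        return 0;
    const AndEntry *e = &c->table[and_slot(c, a, b)];
    if (e->a == a && e->b == b) {
        *res = e->res;
        return 1;
    }
    return 0;
}

/* Allocates the table on first use. Returns 0 if allocation fails. */
static int and_cache_store(AndCache *c, int a, int b, int res) {
    if (c->table == NULL) {
        assert(c->size != 0 && (c->size & (c->size - 1)) == 0);
        c->table = malloc(c->size * sizeof *c->table);
        if (c->table == NULL)
            return 0;
        /* Mark every slot empty; assumes node indices are never negative. */
        for (size_t i = 0; i < c->size; i++)
            c->table[i].a = -1;
    }
    AndEntry *e = &c->table[and_slot(c, a, b)];
    e->a = a;
    e->b = b;
    e->res = res;
    return 1;
}

static void and_cache_free(AndCache *c) {
    free(c->table);
    c->table = NULL;
}
```

In a specialized AND routine the idea would be to call and_cache_prefetch()
as soon as both operands are known, do the terminal-case checks, and only
then call and_cache_lookup(), so that part of the memory latency overlaps
with useful work.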
I have been using these modifications for a while with no problems. What do
people think about them? The first two are fairly minor. The third is more
involved because it increases the amount of code, but is still
Also, was there any consensus on the other patches I described earlier
(support for automatic generation of BDD traces and preallocation of node
table)? I've been sitting on those in my local version for the last few
months and they seem rock solid.
One option is that I could check in the changes on an experimental branch
until they are proven to be stable and worthwhile, at which point we could
merge them back into the trunk.
PS. The recent lrand48() change broke the Windows build so I'll check in a
small fix that makes it compile cleanly again on Windows.