From: Frost, G. <Gar...@am...> - 2009-08-13 15:11:40
I was profiling some code that uses HSQLDB (with oprofile on Linux) and discovered, by looking at the generated x64 code, that the org.hsqldb.Like.compareAt() method was consuming a lot of CPU cycles. For reference, see:

http://hsqldb.cvs.sourceforge.net/viewvc/hsqldb/hsqldb/src/org/hsqldb/Like.java?revision=1.4&view=markup

Although compareAt() is a clean recursive solution to wildcard matching, some recursive patterns are not optimized particularly well by the JIT. In this case the code is recursive and accesses instance fields, and (as we may know) field accesses are slower than stack accesses. I may try to refactor to a non-recursive solution (which would also avoid the field accesses); however, my first experiment yielded a 10% improvement on the application I was using, so I figured I would pass it along as a suggestion.

If we widen the compareAt() signature so that the instance fields are passed as arguments, the JIT optimizer is likely to assign the array references to registers and keep them there throughout the call sequence. Because the fields are never modified and the JVM no longer has to keep going back to memory for them, this yields an improvement in performance, especially as the code recurses deeper into the match.

So my suggested change is to change

    private boolean compareAt(String s, int i, int j, int jLen)

to

    private boolean compareAt(String s, int i, int j, int jLen, char cLike[], int[] iType)

and then to widen the call site (in compare(String)) from

    return compareAt(s, 0, 0, s.length());

to

    return compareAt(s, 0, 0, s.length(), cLike, iType);

Note that the method body is not changed: we are relying on the fact that the parameters cLike and iType hide the fields of the same name (I was too lazy to rename everything).

On some local microbenchmarks on a 24-core machine (it's nice working at AMD ;) ) I have observed a 57% improvement using this code transformation.
On my laptop I see a few percent, but of course every little bit helps. If you have a performance benchmark or regression suite that you use to measure performance regressions, I would be interested in hearing what kind of performance delta you observe. Of course, I would welcome comments and suggestions.

Gary
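
A minimal, self-contained sketch of the transformation described above may help make it concrete. It is not the HSQLDB source: the class name LikeSketch, the type codes, and the simplified matching logic are all hypothetical stand-ins; only the widened compareAt() signature and the call site in compare(String) follow the message. The parameters cLike and iType shadow the fields of the same name, so the method body reads the same as a field-based version would.

```java
// Hypothetical stand-in for org.hsqldb.Like; NOT the actual HSQLDB code.
final class LikeSketch {
    // Hypothetical type codes, one per pattern position.
    private static final int LITERAL = 0;     // must equal the input char
    private static final int UNDERSCORE = 1;  // '_' matches exactly one char
    private static final int PERCENT = 2;     // '%' matches any run of chars

    private final char[] cLike;  // pattern characters
    private final int[] iType;   // type code for each pattern character

    LikeSketch(String pattern) {
        cLike = pattern.toCharArray();
        iType = new int[cLike.length];
        for (int k = 0; k < cLike.length; k++) {
            iType[k] = cLike[k] == '%' ? PERCENT
                     : cLike[k] == '_' ? UNDERSCORE
                     : LITERAL;
        }
    }

    boolean compare(String s) {
        // Widened call site: the fields are read once here and passed down.
        return compareAt(s, 0, 0, s.length(), cLike, iType);
    }

    // i indexes the pattern, j indexes the input string.
    // The cLike and iType parameters shadow the instance fields, so every
    // access in the body is to a local (stack/register) value.
    private boolean compareAt(String s, int i, int j, int jLen,
                              char[] cLike, int[] iType) {
        for (; i < cLike.length; i++) {
            switch (iType[i]) {
                case LITERAL:
                    if (j >= jLen || cLike[i] != s.charAt(j++)) {
                        return false;
                    }
                    break;
                case UNDERSCORE:
                    if (j++ >= jLen) {
                        return false;
                    }
                    break;
                case PERCENT:
                    if (++i >= cLike.length) {
                        return true;  // trailing '%' matches the rest
                    }
                    // Try the remaining pattern at every remaining offset,
                    // including the empty tail (j == jLen).
                    for (; j <= jLen; j++) {
                        if (compareAt(s, i, j, jLen, cLike, iType)) {
                            return true;
                        }
                    }
                    return false;
            }
        }
        return j == jLen;  // pattern exhausted: input must be too
    }
}
```

For example, new LikeSketch("a%c").compare("abbc") takes the recursive PERCENT branch, and each nested call receives the two array references as arguments instead of re-reading them from the instance. Whether that turns into a measurable gain depends on the JVM and the hardware, as the laptop and 24-core figures above suggest.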