From: Juho S. <js...@us...> - 2007-09-30 23:18:57
|
Update of /cvsroot/sbcl/sbcl/src/compiler/generic In directory sc8-pr-cvs8.sourceforge.net:/tmp/cvs-serv12016/src/compiler/generic Modified Files: early-objdef.lisp genesis.lisp Log Message: 1.0.10.14: remove locking and gc inhibition from hash-tables, power of 2 sizes This commit removes a bunch of bottlenecks from the hash-table implementation. It speeds up GETHASH, (SETF GETHASH) and REMHASH by a factor of 2-4x (on platforms with a real WITH-PINNED-OBJECTS) depending on the operation. On the flip side, no automatic locking is done on tables any more, so multi-threaded applications must do their own locking. (The locking done by SBCL was always just an implementation detail, not a part of the external interface). By popular demand it's also still safe to have multiple readers on the same table without locking. Originally GCs were inhibited during most hash-table operations for two reasons. To prevent the GC from rehashing a table while a Lisp-side operation is going on, and to prevent the GC from moving the key after the hash-value has been calculated. More recently, most hash-tables operations have acquired a lock on the table in order to prevent two concurrent writers from corrupting the chains. While it's never been the intent for the standard data structures to be automatically thread-safe in SBCL, this locking had to be done since corrupt tables could lead to infinite GC loops. Both the locking and the without-gcing are expensive operations relative to the total cost of a hash-table lookup. This commit removes both the gc inhibition and the locks. Additionally we switch to power of two table size, which allows calculating a cheaper hash -> bucket with cheaper operations than MOD. * The GC no longer does the rehashing itself, but just marks the hash-table as needing a rehash, which will then be done Lisp-side when the table is next accessed. While it's possible to find cases where the former behaviour has better performance, they're very contrived. * The hash-table operations that work on the chains now check for loops in the chains, and signal an error if one is found. * The hash-table operations now pin the key before calculating the hash value (needed for EQ-based hash functions). * Add a GC epoch value that GETHASH can use to check whether a GC happened during the lookup. This is needed since another thread calling GETHASH on the same table might have caused it to be rehashed. * Kill the old MUST-REHASH vector header, and replace it with a slot in the HASH-TABLE structure. The overloading of the header caused missed rehashings when both the GC and %%PUTHASH modified it at the same time. * Switch to power of two table sizes, with a slightly more complex hash value -> bucket calculation than just taking the low bits, which in many cases have a very skewed distribution for the existing SBCL hash functions. Still a lot faster than using MOD. * Leave in locking and GC inhibition during rehashing (needed to allow multiple readers to coexist) and for weak hash-tables (they need some GC support, and the code is much simpler when all of the logic is in the GC instead of interleaved in the GC and Lisp-side). Neither of these cases is performance critical. Index: early-objdef.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/compiler/generic/early-objdef.lisp,v retrieving revision 1.30 retrieving revision 1.31 diff -u -d -r1.30 -r1.31 --- early-objdef.lisp 10 Jun 2006 05:07:24 -0000 1.30 +++ early-objdef.lisp 30 Sep 2007 23:18:51 -0000 1.31 @@ -222,5 +222,4 @@ (defenum (:prefix vector- :suffix -subtype) normal unused - valid-hashing - must-rehash) + valid-hashing) Index: genesis.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/compiler/generic/genesis.lisp,v retrieving revision 1.136 retrieving revision 1.137 diff -u -d -r1.136 -r1.137 --- genesis.lisp 1 Sep 2007 23:37:42 -0000 1.136 +++ genesis.lisp 30 Sep 2007 23:18:51 -0000 1.137 @@ -2729,7 +2729,6 @@ (symbol-value c) nil) constants)) - (setf constants (sort constants (lambda (const1 const2) |