Just Launched: You can now import projects and releases from Google Code onto SourceForge

We are excited to release new functionality to enable a 1-click import from Google Code onto the Allura platform on SourceForge. You can import tickets, wikis, source, releases, and more with a few simple steps.

[fd4e91]: impnotes / hash-dict.html Maximize Restore History

Download this file

hash-dict.html    134 lines (133 with data), 31.3 kB

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133``` ``` 18.1. The Hash Tables Dictionary [CLHS-18.2]
18.1. The Hash Tables Dictionary [CLHS-18.2]
Prev Chapter 18. Hash Tables [CLHS-18] Next

18.1. The Hash Tables Dictionary [CLHS-18.2]

18.1.1. Function MAKE-HASH-TABLE

MAKE-HASH-TABLE accepts two additional keyword arguments :INITIAL-CONTENTS and :WEAK:

(MAKE-HASH-TABLE &KEY :TEST :INITIAL-CONTENTS :SIZE                  :REHASH-SIZE :REHASH-THRESHOLD                  :WARN-IF-NEEDS-REHASH-AFTER-GC :WEAK)

The :TEST argument can be, other than one of the symbols EQ, EQL, EQUAL, EQUALP, one of the symbols EXT:FASTHASH-EQ and EXT:STABLEHASH-EQ. Both of these tests use EQ as the comparison function; they differ in their performance characteristics.

EXT:FASTHASH-EQ
This uses the fastest possible hash function. Its drawback is that its hash codes become invalid at every garbage-collection (except if all keys are immediate objects), thus requiring a reorganization of the hash table at the first access after each garbage-collection. Especially when generational garbage-collection is used, which leads to frequent small garbage-collections, large hash table with this test can lead to scalability problems.
EXT:STABLEHASH-EQ
This uses a slower hash function that has the property that its hash codes for instances of the classes SYMBOL, EXT:STANDARD-STABLEHASH (subclass of STANDARD-OBJECT) and EXT:STRUCTURE-STABLEHASH (subclass of STRUCTURE-OBJECT) are stable across GCs. This test can thus avoid the scalability problems if all keys, other than immediate objects, are SYMBOL, EXT:STANDARD-STABLEHASH or EXT:STRUCTURE-STABLEHASH instances.

One can recommend to use EXT:FASTHASH-EQ for short-lived hash tables. For tables with a longer lifespan which can be big or accessed frequently, it is recommended to use EXT:STABLEHASH-EQ, and to modify the objects that are used as its keys to become instances of EXT:STANDARD-STABLEHASH or EXT:STRUCTURE-STABLEHASH.

When the symbol EQ or the function #'eq is used as a :TEST argument, the value of the variable EXT:*EQ-HASHFUNCTION* is used instead. This value must be one of EXT:FASTHASH-EQ, EXT:STABLEHASH-EQ.

Similarly, the :TEST argument can also be one of the symbols EXT:FASTHASH-EQL, EXT:STABLEHASH-EQL, EXT:FASTHASH-EQUAL, EXT:STABLEHASH-EQUAL. The same remarks apply as for EXT:FASTHASH-EQ and EXT:STABLEHASH-EQ. When the symbol EQL or the function #'eql is used as a :TEST argument, the value of the variable EXT:*EQL-HASHFUNCTION* is used instead; this value must be one of EXT:FASTHASH-EQL, EXT:STABLEHASH-EQL. Similarly, when the symbol EQUAL or the function #'equal is used as a :TEST argument, the value of the variable EXT:*EQUAL-HASHFUNCTION* is used instead; this value must be one of EXT:FASTHASH-EQUAL, EXT:STABLEHASH-EQUAL.

The :WARN-IF-NEEDS-REHASH-AFTER-GC argument, if true, causes a WARNING to be SIGNALed when an object is stored into the table which will force table reorganizations at the first access of the table after each garbage-collection. This keyword argument can be used to check whether EXT:STABLEHASH-EQ should be preferred over EXT:FASTHASH-EQ for a particular table. Use HASH-TABLE-WARN-IF-NEEDS-REHASH-AFTER-GC to check and SETF this parameter after the table has been created.

The :INITIAL-CONTENTS argument is an alist that is used to initialize the new hash table.

The :REHASH-THRESHOLD argument is ignored.

The :WEAK argument can take the following values:

NIL (default)
:KEY
:VALUE
:KEY-AND-VALUE
:KEY-OR-VALUE

and specifies whether the HASH-TABLE is weak: if the key, value, either or both are not accessible for the garbage-collection purposes, i.e., if they are only accessible via weak HASH-TABLEs and EXT:WEAK-POINTERs, it is garbage-collected and removed from the weak HASH-TABLE.

The SETFable predicate EXT:HASH-TABLE-WEAK-P checks whether the HASH-TABLE is weak.

Note that the only test that makes sense for weak hash tables are EQ and its variants EXT:FASTHASH-EQ and EXT:STABLEHASH-EQ.

Just like all other weak objects, weak HASH-TABLEs cannot be printed readably.

See also Section 30.7.9, “Weak Hash Tables”.

18.1.1.1. Interaction between HASH-TABLEs and garbage-collection.

When a hash table contains keys to be compared by identity - such as NUMBERs in HASH-TABLEs with the HASH-TABLE-TEST EQ; or CONSes in tables which test with EQ or EQL; or VECTORs in tables which test with EQ, EQL or EQUAL; or STANDARD-OBJECT or STRUCTURE-OBJECT instances in tables which test with EQ, EQL, EQUAL or EQUALP; - the hash code will in general depend on the object's address in memory. Therefore it will in general be invalidated after a garbage-collection, and the hash table's internal structure must be recomputed at the next table access.

While :WARN-IF-NEEDS-REHASH-AFTER-GC can help checking the efficiency of a particular HASH-TABLE, the variable CUSTOM:*WARN-ON-HASHTABLE-NEEDING-REHASH-AFTER-GC* achieves the same effect for all HASH-TABLEs in the system at once: when CUSTOM:*WARN-ON-HASHTABLE-NEEDING-REHASH-AFTER-GC* is true and a HASH-TABLE needs to be rehashed after a garbage-collection, a warning is issued that shows the inefficient HASH-TABLE.

What can be done to avoid the inefficiencies detected by these warnings?

1. In many cases you can solve the problem by using the STABLEHASH variant of the hash test.
2. In other cases, namely STANDARD-OBJECT or STRUCTURE-OBJECT instances, you can solve the problem by making the key object classes inherit from EXT:STANDARD-STABLEHASH or EXT:STRUCTURE-STABLEHASH, respectively.
3. In the remaining cases, you should store a hash key inside the object, of which you can guarantee uniqueness through your application (for example the ID of an object in a database, or the serial number of an object), and use this key as hash key instead of the original object.

You can define a new hash table test using the macro EXT:DEFINE-HASH-TABLE-TEST: (EXT:DEFINE-HASH-TABLE-TEST test-name test-function hash-function), after which test-name can be passed as the :TEST argument to MAKE-HASH-TABLE. E.g.:

(EXT:DEFINE-HASH-TABLE-TEST string STRING= SXHASH)  (MAKE-HASH-TABLE :test 'string)

(which is not too useful because it is equivalent to an EQUAL HASH-TABLE but less efficient).

The fundamental requirement is that the test-function and hash-function are consistent:

(FUNCALL test-function x y) ⇒ (= (FUNCALL hash-function x) (FUNCALL hash-function y))

This means that the following definition:

(EXT:DEFINE-HASH-TABLE-TEST number = SXHASH) ; broken!

is not correct because (= 1 1d0) is T but (= (SXHASH 1) (SXHASH 1d0)) is NIL. The correct way is, e.g.:

(EXT:DEFINE-HASH-TABLE-TEST number = (LAMBDA (x) (SXHASH (COERCE x 'SHORT-FLOAT))))

(note that (COERCE x SHORT-FLOAT) does not cons up fresh objects while (COERCE x DOUBLE-FLOAT) does).

18.1.3. Function HASH-TABLE-TEST

Function HASH-TABLE-TEST returns either one of EXT:FASTHASH-EQ, EXT:STABLEHASH-EQ, EXT:FASTHASH-EQL, EXT:STABLEHASH-EQL, EXT:FASTHASH-EQUAL, EXT:STABLEHASH-EQUAL, EQUALP (but not EQ, EQL nor EQUAL anymore), or, for HASH-TABLEs created with a user-defined HASH-TABLE-TEST (see macro EXT:DEFINE-HASH-TABLE-TEST), a CONS cell (test-function . hash-function).

18.1.4. Macro EXT:DOHASH

For iteration through a HASH-TABLE, a macro EXT:DOHASH, similar to DOLIST, can be used instead of MAPHASH:

(EXT:DOHASH (key-var value-var hash-table-form [resultform])   {declaration}*   {tag|form}*)

EXT:DOHASH forms are iteration forms.

These notes document CLISP version 2.34.Last modified: 2005-07-20
Prev Up Next
Chapter 18. Hash Tables [CLHS-18] Home Chapter 19. Filenames [CLHS-19]
```