[cl-cookbook-contrib] Strings and things for hash table keys
Brought to you by:
jthing
From: Richard M K. <kr...@pr...> - 2006-11-30 03:20:22
|
Hello, In the Hash Tables chapter, it'd be worthwhile to explain a bit about the uses of different tests, especially with an eye toward string-keyed hash tables. Below is a stab at covering this. Regards, Richard M Kreuter --- hashes.html.~1.8.~ 2005-04-17 19:28:34.000000000 -0400 +++ hashes.html 2006-11-29 17:35:13.000000000 -0500 @@ -22,6 +22,7 @@ <li><a href="#del">Deleting from a Hash Table</a> <li><a href="#traverse">Traversing a Hash Table</a> <li><a href="#count">Counting the Entries in a Hash Table</a> +<li><a href="#strings">Strings and Things for Hash Table Keys</a> <li><a href="#size">Performance Issues: The Size of your Hash Table</a> </ul> @@ -44,7 +45,7 @@ <a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_mk_has.htm"><code>MAKE-HASH-TABLE</code></a>. It has no required argument. Its most used optional keyword argument is <code>:TEST</code>, specifying the function used to test the -equality of keys. +equivalence of keys. The default test is <code>EQL</code>. <a name="get"></a> @@ -306,6 +307,86 @@ 0 </pre> +<a name="strings"></a> +<h3>Strings and Things for Hash Table Keys</h3> + +Because the default <code>:TEST</code> argument +to <code>MAKE-HASH-TABLE</code> is <code>EQL</code>, distinct objects +that have equal contents are not equivalent as hash table keys. For +example, + +<pre> +* (defparameter *eql-hash* (make-hash-table)) +*EQL-HASH* +* (setf (gethash "Groucho" *eql-hash*) "Dr. Hugo Hackenbush") +"Dr. Hugo Hackenbush" +* (setf (gethash "Groucho" *eql-hash*) "Rufus T. Firefly") +"Rufus T. Firefly" +* (maphash (lambda (key value) + (format t "~&~A -> ~A~%" key value)) + *eql-hash*) +Groucho -> Dr. Hugo Hackenbush +Groucho -> Rufus T. Firefly +NIL +</pre> + +By passing a more general equality predicate as the <code>:TEST</code> +argument to <code>MAKE-HASH-TABLE</code>, it's possible to treat +distinct objects with similar constituents as equivalent hash table +keys, for certain Lisp types. Of particular note, to construct a hash +table that equates strings whose characters are the same, +use <code>EQUAL</code>: + +<pre> +* (defparameter *equal-hash* (make-hash-table :test #'equal)) +*EQUAL-HASH* +* (setf (gethash "Groucho" *equal-hash*) "Dr. Hugo Hackenbush") +"Dr. Hugo Hackenbush" +* (setf (gethash "Groucho" *equal-hash*) "Rufus T. Firefly") +"Rufus T. Firefly" +* (maphash (lambda (key value) + (format t "~&~A -> ~A~%" key value)) + *equal-hash*) +Groucho -> Rufus T. Firefly +NIL +</pre> + +Performance aside, hash tables whose test is <code>EQUAL</code> are +probably adequate for almost all applications, and are roughly +semantically analogous to the hash tables in several other high level +languages. For a somewhat different equivalence +predicate, <code>EQUALP</code> can be supplied as the hash table test +(but note that EQUALP compares strings and characters +case-insensitively). +<p> + +Because keying on textual data is often important, there's another +approach for using tokens (in an abstract sense) as hash table keys: +to use symbols. + +<pre> +* (defparameter *eq-hash* (make-hash-table :test #'eq)) +*EQ-HASH* +* (setf (gethash :|Groucho| *eq-hash*) "Dr. Hugo Hackenbush") +"Dr. Hugo Hackenbush" +* (setf (gethash :|Groucho| *eq-hash*) "Rufus T. Firefly") +"Rufus T. Firefly" +* (maphash (lambda (key value) + (format t "~&~A -> ~A~%" key value)) + *equal-hash*) +Groucho -> Rufus T. Firefly +NIL +</pre> + +Using symbols instead of strings as hash table keys may result in +faster hashing, which could be of consequence in certain sorts of +programs. However, if the hash table keys are to be derived from the +program's input, then it will be necessary to <code>INTERN</code> +those inputs in order to get symbols to use as hash keys, and to pass +symbols throughout the program, rather than strings. Therefore, this +technique can probably only improve performance in case the domain of +keys for the table changes relatively little during the table's use. + <a name="size"></a> <h3>Performance Issues: The Size of your Hash Table</h3> |