From: Brian H. <bh...@sp...> - 2006-03-08 16:54:57
|
On Wed, 8 Mar 2006, Richard Jones wrote: > > This is a problem I often have - count "things" in an imperative way. > As an example, read a document and count the frequency of each word in > the document. That's not a bad way. If you want to do it functionally, you could use a Map instead of a hash table: module StringMap = Map.Make(String); let count_words (f: unit -> string option) = let rec loop cnt = match (f ()) with | None -> List.rev (StringMap.fold (fun k c l -> (k, c) :: l) cnt []) | Some(word) -> let cnt = try let c = StringMap.find word cnt in StringMap.add word (c+1) cnt with | Not_found -> StringMap.add word 1 cnt in loop cnt in loop StringMap.empty ;; Now, the disadvantage of doing it this way is that it'll probably be somewhat slower than using a hash table- doing the find and add are O(log N) operations vr.s O(1) operations for hash tables, however the Map based implementation has lower constant factors (comparing two strings is generally cheap compared to the cost of calculating a hash), so the performance loss isn't as large as you might think. The advantage of doing things this way is that the strings are returned in sorted order. I don't know if this is an advantage or not. I'd also be tempted to return the StringMap instead of the equivelent list- walking a string map is about as expensive as walking a list (OK, a little more expensive, but not much). And keeping it in the map allows for O(log N) accesses to any given element. Wether I'd do this or not depends upon what uses the calling code has for the table. I don't think this is a common enough problem to be worth putting into a standard library, however. Brian |