From: Martin J. <mar...@em...> - 2006-03-08 19:13:45
On Wed, 8 Mar 2006, Richard Jones wrote:

>
> This is a problem I often have - count "things" in an imperative way.
> As an example, read a document and count the frequency of each word in
> the document.
>
> The way I normally solve it is something like this:
>
>   let results = Hashtbl.create 31 in
>   List.iter (
>     fun word ->
>       try
>         let count = Hashtbl.find results word in
>         Hashtbl.replace results word (count+1)
>       with Not_found ->
>         Hashtbl.add results word 1
>   ) words;
>   let results =
>     Hashtbl.fold (fun word count xs -> (count, word) :: xs) results [] in
>   (* ... *)
>
> It's not particularly elegant ...
>
> Is there a better structure that I should be using, or should we add
> one to Extlib?

Given the implementation of Hashtbl (buckets = immutable lists), it
should be faster not to replace the table entry, but simply to increment
an int ref. It also takes one hash operation instead of two when the
counter already exists.

  let r =
    try Hashtbl.find tbl key
    with Not_found ->
      let r = ref 0 in
      Hashtbl.add tbl key r;
      r in
  incr r

Martin

>
> Rich.
>
> PS. Note that "words" is only an example. In real life I'm processing
> gigabytes of "things", and they don't live in a convenient list in
> memory either -- hence the imperative approach.
>
> --
> Richard Jones, CTO Merjis Ltd.
> Merjis - web marketing and technology - http://merjis.com
> Team Notepad - intranets and extranets for business - http://team-notepad.com
>
> _______________________________________________
> ocaml-lib-devel mailing list
> oca...@li...
> https://lists.sourceforge.net/lists/listinfo/ocaml-lib-devel
>

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Edit http://wikiomics.org, bioinformatics wiki
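
For anyone wanting to try the int ref idea from this message, here is a minimal self-contained sketch. The names count_one and count_all are hypothetical (they are not part of Extlib or the standard library), and the input is a small in-memory list purely for illustration; with a real stream of "things" one would call count_one once per item instead.

```ocaml
(* Hypothetical helper: increment the counter stored under [key],
   creating it (starting from 0) the first time the key is seen.
   When the key already exists this costs a single Hashtbl.find. *)
let count_one tbl key =
  let r =
    try Hashtbl.find tbl key
    with Not_found ->
      let r = ref 0 in
      Hashtbl.add tbl key r;
      r
  in
  incr r

(* Hypothetical helper: count every item of a list and return
   (count, item) pairs in unspecified order. *)
let count_all items =
  let tbl = Hashtbl.create 31 in
  List.iter (count_one tbl) items;
  Hashtbl.fold (fun key r acc -> (!r, key) :: acc) tbl []

let () =
  let words = ["a"; "b"; "a"; "c"; "a"; "b"] in
  count_all words
  |> List.sort (fun x y -> compare y x)   (* most frequent first *)
  |> List.iter (fun (n, w) -> Printf.printf "%d %s\n" n w)
```

The table stores int ref values rather than plain ints, so the final fold has to dereference each counter with ! before building the result list.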