From: Bruno De F. <br...@de...> - 2006-03-09 09:19:05
Hello Richard,

On 08 Mar 2006, at 16:33, Richard Jones wrote:

> It's not particularly elegant ...
>
> Is there a better structure that I should be using, or should we add
> one to Extlib?

There is definitely a way to do it in a more elegant functional style,
provided you have a general group_by function (which I think Extlib
still lacks, and which I therefore try to plug here):

    (*s [group_by f l] creates an associative list that groups the
        elements of l according to their image under f. *)
    val group_by : ('a -> 'b) -> 'a list -> ('b * 'a list) list

For example:

    # group_by String.length ["aa";"bbb";"abc";"bb"] ;;
    - : (int * string list) list = [(3, ["abc"; "bbb"]); (2, ["bb"; "aa"])]

Now, with two more auxiliary functions:

    let identity x = x ;; (* Already present in Std *)
    let map_snd f (a,b) = (a, f b) ;;

A concise solution to your problem can be given as:

    let results =
      List.map (map_snd List.length) (group_by identity words) ;;

While this is a nice prototype, I obviously doubt this is what you want
when counting "gigabytes of things". But then again, it's not entirely
clear what you're asking for...

For reference, this is an implementation of group_by:

    let group_by f list =
      List.fold_left
        (fun accu el ->
           let img = f el
           and found = ref false in
           let new_accu =
             List.rev_map
               (fun grp ->
                  if !found || (fst grp) <> img then grp
                  else begin
                    found := true;
                    (img, el :: (snd grp))
                  end)
               accu
           in
           if !found then new_accu else (img, [el]) :: accu)
        [] list ;;

Perhaps a more general solution would have a signature like:

    val group_by : ('a -> 'b) -> 'a Enum.t -> ('b, 'a Enum.t) Hashtbl.t

Bye,
Bruno
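
P.S. A rough sketch of the hashtable-based direction, for what it's worth. It assumes plain lists and the standard library's Hashtbl rather than Enum.t, and the names group_by_tbl and count_by are made up for illustration (neither is in Extlib as far as I know):

```ocaml
(* Hypothetical helper: groups the elements of l by their image under f,
   storing the groups in a hashtable instead of an associative list.
   Elements within a group end up in reverse order of appearance. *)
let group_by_tbl (f : 'a -> 'b) (l : 'a list) : ('b, 'a list) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  List.iter
    (fun el ->
       let img = f el in
       let grp = try Hashtbl.find tbl img with Not_found -> [] in
       Hashtbl.replace tbl img (el :: grp))
    l;
  tbl

(* Hypothetical helper: counts the elements per image under f directly,
   so no intermediate lists are kept in memory. *)
let count_by (f : 'a -> 'b) (l : 'a list) : ('b, int) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  List.iter
    (fun el ->
       let img = f el in
       let n = try Hashtbl.find tbl img with Not_found -> 0 in
       Hashtbl.replace tbl img (n + 1))
    l;
  tbl
```

With something like this, count_by (fun w -> w) words would give the per-word counts in a single pass. Each lookup is expected constant time, whereas the list-based group_by above rescans all groups for every element.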