From: Peter J. <pe...@jo...> - 2006-08-16 20:24:36
|
Bardur Arantsson wrote:
> Could we simply rename the UTF8 module to something like SimpleUTF8 to
> avoid confusion about whether it's supposed to 'emulate' Camomile's UTF8
> module?

I don't think there's any need to rename it, with all the consequent breakage of existing code; there's no namespace clash since (by default) Camomile's module is called Camomile.UTF8, not just UTF8, and it _is_ supposed to emulate Camomile's module! :)

Note that this incompatibility between ExtLib and Camomile's interfaces has not been caused by a recent change; if I'm reading the CVS logs correctly, Camomile.UTF8.first was added some two years ago, and the Camomile.UTF8 interface has not changed at all since then. Keeping the two modules synchronised (or, more precisely, keeping ExtLib's UTF8 module signature compatible with Camomile.UnicodeString.Type) should not be problematic now that it has been identified as a desirable thing. |
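To make the signature-compatibility point concrete, here is a minimal sketch of how one missing value breaks a signature match and how adding it restores the match without a rename. Everything in it is an assumption for illustration: `STRING_LIKE` is a hypothetical, heavily simplified signature (not the real Camomile.UnicodeString.Type), and `Old` and `Patched` are stand-ins, not ExtLib's actual UTF8 module.

```ocaml
(* Hypothetical, simplified signature; the real Camomile signature has
   many more items. *)
module type STRING_LIKE = sig
  type t
  type index
  val first : t -> index          (* index of the first character *)
  val look : t -> index -> char   (* character at an index *)
end

(* Stand-in for a module written before [first] existed. *)
module Old = struct
  type t = string
  type index = int
  let look s i = s.[i]
end

(* module Broken = (Old : STRING_LIKE)
   would be rejected by the type checker: the value [first] is required
   by the signature but not provided. *)

(* Adding the missing value restores compatibility; existing users of
   [look] are unaffected. *)
module Patched = struct
  include Old
  let first (_ : t) : index = 0
end

(* The constraint below is checked at compile time. *)
module Checked : STRING_LIKE with type t = string = Patched
```

The design point is that a signature match is all-or-nothing: a module with one missing value cannot be passed where the signature is expected, which is why a single added function on one side is enough to make two otherwise identical interfaces unusable together.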
From: Bardur A. <sp...@sc...> - 2006-08-16 19:09:05
|
Richard W.M. Jones wrote:
> On Wed, Aug 16, 2006 at 05:45:47PM +0200, Nicolas Cannasse wrote:
[--snip--]
>> I'm then against removing it from ExtLib. What are the compatibility
>> problems? I think they can be addressed without causing a lot of
>> worries.
>
> The difference, btw, is that my Camomile UTF8 contains UTF8.first
> function, which is not in Extlib UTF8.
>
> We can keep updating ExtLib UTF8 to match Camomile UTF8, and accept
> that it'll be out of synch, or people can just install Camomile when
> they want UTF8 support. On Debian/Ubuntu, it's just a single command
> to install Camomile.
>

I think the original rationale for the ExtLib UTF8 module was that it does not require one to link a huge library if one only requires a few simple functions which understand the basic structure of UTF8. Does this still apply? IOW, is Camomile one huge monolithic library where it is impossible to use the UTF8 module without linking everything else?

Could we simply rename the UTF8 module to something like SimpleUTF8 to avoid confusion about whether it's supposed to 'emulate' Camomile's UTF8 module?

[Of course after such a rename, you would be welcome to propose adding a 'first' function to SimpleUTF8 ;)]

Cheers,

--
Bardur Arantsson <bar...@TH...>

- Erm, once upon a time, there, there was a big forest. And in the
middle of the forest there lived... some trousers. Called... Dave.
Richard Richard, 'Bottom' |
From: Richard W.M. J. <ri...@me...> - 2006-08-16 16:48:53
|
On Wed, Aug 16, 2006 at 05:45:47PM +0200, Nicolas Cannasse wrote:
> > ExtLib's UTF8 isn't precisely compatible with Camomile's UTF8 (or at
> > least not in the version of Debian I'm using). This makes it
> > impossible to use Camomile with ExtLib, which is a bit of a
> > showstopper for me at the moment.
> >
> > I think we should remove UTF8 from ExtLib to avoid this and future
> > incompatibilities.
> I'm currently using ExtLib UTF8 in several projects.

Me too ..

> I'm then against removing it from ExtLib. What are the compatibility
> problems? I think they can be addressed without causing a lot of
> worries.

The difference, btw, is that my Camomile UTF8 contains UTF8.first function, which is not in Extlib UTF8.

We can keep updating ExtLib UTF8 to match Camomile UTF8, and accept that it'll be out of synch, or people can just install Camomile when they want UTF8 support. On Debian/Ubuntu, it's just a single command to install Camomile.

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Amit D. <ami...@gm...> - 2006-08-16 16:17:22
|
I'm not sure, but is this because Camomile also has a module called UTF8?

-Amit

On 8/16/06, Nicolas Cannasse <nca...@mo...> wrote:
>
> > ExtLib's UTF8 isn't precisely compatible with Camomile's UTF8 (or at
> > least not in the version of Debian I'm using). This makes it
> > impossible to use Camomile with ExtLib, which is a bit of a
> > showstopper for me at the moment.
> >
> > I think we should remove UTF8 from ExtLib to avoid this and future
> > incompatibilities.
> >
> > Rich.
>
> I'm currently using ExtLib UTF8 in several projects. I'm then against
> removing it from ExtLib. What are the compatibility problems? I think
> they can be addressed without causing a lot of worries.
>
> Nicolas |
From: Nicolas C. <nca...@mo...> - 2006-08-16 15:47:03
|
> ExtLib's UTF8 isn't precisely compatible with Camomile's UTF8 (or at
> least not in the version of Debian I'm using). This makes it
> impossible to use Camomile with ExtLib, which is a bit of a
> showstopper for me at the moment.
>
> I think we should remove UTF8 from ExtLib to avoid this and future
> incompatibilities.
>
> Rich.

I'm currently using ExtLib UTF8 in several projects. I'm then against removing it from ExtLib. What are the compatibility problems? I think they can be addressed without causing a lot of worries.

Nicolas |
From: Richard W.M. J. <ri...@me...> - 2006-08-16 15:13:18
|
On Wed, Aug 16, 2006 at 05:06:30PM +0200, syl...@po... wrote:
> I am the debian package maintainer of Camomile... Is camomile in debian too
> old or too recent ? ( FYI, i think i will update camomile to 0.7.0 next
> week or so).

Actually it's not Debian, but (cough) Ubuntu, so you'll probably tell me to go away, but anyhow. My Debian system (where we will eventually upload & run this code) is running Debian/stable, and I can't find Camomile there at all, so in fact this may be a moot issue.

Here are the Ubuntu versions anyway:

Package: libcamomile-ocaml-dev
Priority: optional
Section: universe/libdevel
Installed-Size: 2988
Maintainer: Sylvain Le Gall <syl...@po...>
Architecture: amd64
Source: camomile
Version: 0.6.3-3
Depends: ocaml-nox-3.09.1, libcamomile-ocaml-data (= 0.6.3-3)
Filename: pool/universe/c/camomile/libcamomile-ocaml-dev_0.6.3-3_amd64.deb
Size: 577914
MD5sum: a17177fc5c9a0c4682691bd4b043be6a
Description: Unicode library for OCaml
 Camomile is a comprehensive Unicode library for objective caml
 language. The library is currently designed to conform Unicode
 Standard 3.2.
 .
 Normalisers (NFD, NFKD, NFC, NFKC) and collator (string comparison)
 pass the conformance tests defined Unicode Technical Reports.
 Collator is also tested to Canadian, Thai and Japanese standards with
 their locales.
Bugs: mailto:ubu...@li...
Origin: Ubuntu

Package: libextlib-ocaml-dev
Priority: optional
Section: universe/libdevel
Installed-Size: 1828
Maintainer: Stefano Zacchiroli <za...@de...>
Architecture: amd64
Source: extlib
Version: 1.4-5
Depends: ocaml-nox-3.09.1, ocaml-findlib (>= 1.1)
Filename: pool/universe/e/extlib/libextlib-ocaml-dev_1.4-5_amd64.deb
Size: 270454
MD5sum: fefaed5bdc926d1c07bf229903fecdc9
Description: extended standard library for OCaml
 ExtLib is a project aiming at providing a complete - yet small -
 standard library for the OCaml programming language.
 .
 The purpose of this library is to add new functions to OCaml Standard
 Library modules, to modify some functions in order to get better
 performances or more safety (tail-recursive) but also to provide new
 modules which should be useful for the average OCaml programmer.
 .
 ExtLib contains modules implementing: enumeration over abstract
 collection of elements, efficient bit sets, dynamic arrays, references
 on lists, Unicode characters and UTF-8 encoded strings, additional and
 improved functions for hashtables, strings, lists and option types.
Bugs: mailto:ubu...@li...
Origin: Ubuntu

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Richard J. <ri...@an...> - 2006-08-16 14:52:07
|
ExtLib's UTF8 isn't precisely compatible with Camomile's UTF8 (or at least not in the version of Debian I'm using). This makes it impossible to use Camomile with ExtLib, which is a bit of a showstopper for me at the moment.

I think we should remove UTF8 from ExtLib to avoid this and future incompatibilities.

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Christophe P. <chr...@gm...> - 2006-08-05 18:36:04
|
As always when I submit code, I borked it. Here's the actual version, this time checked for correct behaviour.

    let nsplit str sep =
      let try_split s =
        try Some (String.split s sep)
        with Invalid_string -> None
      in
      let rec loop str result =
        match try_split str with
        | Some (a, b) -> loop b (a :: result)
        | None ->
            if result = [] then [str]
            else List.rev (str :: result)
      in
      loop str []

Sorry about the previous version. I should learn to be more careful.

Christophe |
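For readers without ExtLib at hand, the same accumulate-then-reverse idea can be sketched with only the standard library. This is an illustration under stated assumptions, not ExtLib's implementation: the helper `find_from`, the naive substring search, and the choice to raise `Invalid_argument` on an empty separator are all made up for this sketch.

```ocaml
(* Index of the first occurrence of [sep] in [s] at or after [from],
   or None. Naive search: compares a substring at every position. *)
let find_from s sep from =
  let n = String.length s and m = String.length sep in
  let rec go i =
    if i + m > n then None
    else if String.sub s i m = sep then Some i
    else go (i + 1)
  in
  go from

(* Split [str] on every occurrence of [sep]. Pieces are accumulated in
   reverse and the list is reversed once at the end, so [loop] is a
   tail call and stack use does not grow with the number of pieces. *)
let nsplit str sep =
  if sep = "" then invalid_arg "nsplit: empty separator";
  let m = String.length sep in
  let rec loop start acc =
    match find_from str sep start with
    | Some i -> loop (i + m) (String.sub str start (i - start) :: acc)
    | None ->
        (* No separator left: the remainder is the last piece. *)
        List.rev (String.sub str start (String.length str - start) :: acc)
  in
  loop 0 []
```

Note that once the final piece is always consed onto the accumulator, the empty-accumulator special case is no longer needed: reversing a one-element list already yields `[str]`.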
From: Christophe P. <chr...@gm...> - 2006-08-02 16:50:51
|
Greetings list,

I recently tried to use String.nsplit on a rather large text file, which didn't go too well because the current implementation isn't tail recursive.

Following is an implementation that I believe to be tail recursive and that should behave like the original version. Could you consider it for inclusion, if it doesn't break anything?

    let nsplit str sep =
      let try_split s =
        try Some (String.split s sep)
        with Invalid_string -> None
      in
      let rec loop str result =
        match try_split str with
        | Some (a, b) -> loop b (a :: result)
        | None ->
            if result = [] then [str]
            else List.rev result
      in
      loop str []

Comments and criticisms are of course most welcome. :-)

Thanks a lot,

Christophe |
From: Bardur A. <sp...@sc...> - 2006-07-31 17:38:57
|
Nicolas Cannasse wrote:
>> A recent project required me to parse gzipped XML files using Xml-Light.
>> The most convenient way to access the gzipped files was using Extlib's
>> Unzip.inflate on an IO.input stream. However, since Xml-Light does not
>> support IO or the standard OO file interface (it only supports input
>> from files, Pervasives streams, strings, and lexbufs), it was necessary
>> to define a wrapper function:
>>
>> let lexbuf_of_input i =
>>   Lexing.from_function
>>     (fun s n ->
>>       try
>>         IO.input i s 0 n
>>       with IO.No_more_input ->
>>         0);;
>>
>> I just wonder whether it might be worth adding a couple of conversion
>> functions along these lines to the IO module, since I can imagine them
>> being generally useful? The code is trivial, but one of the things I
>> value Extlib for is that it reduces the need for boilerplate like this.
>
> Hi Peter,
>
> The only problem is that this would create an IO -> Lexing dependency.
> OCaml performs linkage on a per-module basis so not sure it's nice to
> add such an overweight for a single simple function.
>

That reminds me... At one point I also wanted to add "input_of_file_descr" and "output_of_file_descr" to the IO module. In fact, I'd still be interested in adding these.

Could we maybe add a new "IO_extras" module for such not-strictly-needed-but-nice-to-have functionality?

--
Bardur Arantsson <bar...@TH...>

- Ouch! That's going to bleed when my heart beats.
Professor Farnsworth, 'Futurama' |
From: Nicolas C. <nca...@mo...> - 2006-07-30 13:43:13
|
> A recent project required me to parse gzipped XML files using Xml-Light.
> The most convenient way to access the gzipped files was using Extlib's
> Unzip.inflate on an IO.input stream. However, since Xml-Light does not
> support IO or the standard OO file interface (it only supports input
> from files, Pervasives streams, strings, and lexbufs), it was necessary
> to define a wrapper function:
>
> let lexbuf_of_input i =
>   Lexing.from_function
>     (fun s n ->
>       try
>         IO.input i s 0 n
>       with IO.No_more_input ->
>         0);;
>
> I just wonder whether it might be worth adding a couple of conversion
> functions along these lines to the IO module, since I can imagine them
> being generally useful? The code is trivial, but one of the things I
> value Extlib for is that it reduces the need for boilerplate like this.

Hi Peter,

The only problem is that this would create an IO -> Lexing dependency. OCaml performs linkage on a per-module basis so not sure it's nice to add such an overweight for a single simple function.

Nicolas |
From: Peter J. <pe...@jo...> - 2006-07-30 13:25:08
|
A recent project required me to parse gzipped XML files using Xml-Light. The most convenient way to access the gzipped files was using Extlib's Unzip.inflate on an IO.input stream. However, since Xml-Light does not support IO or the standard OO file interface (it only supports input from files, Pervasives streams, strings, and lexbufs), it was necessary to define a wrapper function:

    let lexbuf_of_input i =
      Lexing.from_function
        (fun s n ->
          try
            IO.input i s 0 n
          with IO.No_more_input ->
            0);;

I just wonder whether it might be worth adding a couple of conversion functions along these lines to the IO module, since I can imagine them being generally useful? The code is trivial, but one of the things I value Extlib for is that it reduces the need for boilerplate like this. |
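The wrapper's job is to translate an "exception at end of input" convention into the "return 0 at end of input" convention that Lexing.from_function expects. The sketch below shows that translation with the standard library only. The `reader` type, `reader_of_string`, and the local `No_more_input` exception are stand-ins invented for this example, not ExtLib's IO API; note also that in current OCaml the refill function receives a `bytes` buffer, whereas in 2006 it was a `string`.

```ocaml
(* Stand-in for an end-of-input exception such as IO.No_more_input. *)
exception No_more_input

(* Hypothetical reader: [read buf off len] writes at most [len] bytes
   into [buf] starting at [off], returns how many were written, and
   raises [No_more_input] once the source is exhausted. *)
type reader = bytes -> int -> int -> int

(* A reader backed by an in-memory string, for demonstration. *)
let reader_of_string (s : string) : reader =
  let pos = ref 0 in
  fun buf off len ->
    let remaining = String.length s - !pos in
    if remaining = 0 then raise No_more_input;
    let n = min len remaining in
    Bytes.blit_string s !pos buf off n;
    pos := !pos + n;
    n

(* Adapt a reader to the refill convention of Lexing.from_function:
   fill from offset 0, and report end of input as 0 bytes read. *)
let refill_of_reader (read : reader) : bytes -> int -> int =
  fun buf n -> try read buf 0 n with No_more_input -> 0

let lexbuf_of_reader (read : reader) : Lexing.lexbuf =
  Lexing.from_function (refill_of_reader read)
```

Keeping `refill_of_reader` separate from `lexbuf_of_reader` makes the convention change testable on its own, without needing a generated lexer.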
From: Bardur A. <sp...@sc...> - 2006-07-24 07:11:19
|
Bardur Arantsson wrote:
> Bardur Arantsson wrote:
>> Hi all,
>>
>> Doing an
>>
>>   ExtList.remove [1;2;3] 7
>>
>> raises Not_found even though the interface doc says that it shouldn't.
>> For consistency with both ExtList.remove_if and ExtList.remove_assoc no
>> exception should be raised in this case.
>>
>> I can't seem to get in touch with SourceForge CVS right now, so I can't
>> commit a fix, but I've attached the (trivial) patch to this message. I
>> will commit this once SourceForge CVS comes up.
>>
>> Cheers,
>>
>
> ... and of course I forget the attachment. I will commit tomorrow unless
> there are any objections.
>

Committed.

--
Bardur Arantsson <bar...@TH...>

- God is dead.
Friedrich Nietzsche |
From: Bardur A. <sp...@sc...> - 2006-07-22 21:09:31
|
Bardur Arantsson wrote:
> Hi all,
>
> Doing an
>
>   ExtList.remove [1;2;3] 7
>
> raises Not_found even though the interface doc says that it shouldn't.
> For consistency with both ExtList.remove_if and ExtList.remove_assoc no
> exception should be raised in this case.
>
> I can't seem to get in touch with SourceForge CVS right now, so I can't
> commit a fix, but I've attached the (trivial) patch to this message. I
> will commit this once SourceForge CVS comes up.
>
> Cheers,
>

... and of course I forget the attachment. I will commit tomorrow unless there are any objections.

--
Bardur Arantsson <bar...@TH...>

- Robots are just machines built to make people's lives easier.
- I've never made anyone's life easier and you know it!
Fry and Bender / Futurama |
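The patch itself is not included in the archive. As an illustration of the behaviour being asked for (remove the first occurrence, and return the list unchanged when the element is absent), a stand-alone version could look like the sketch below. This is an assumption about the intended semantics, not the committed ExtLib code.

```ocaml
(* [remove l x] returns [l] without the first element equal to [x].
   If [x] does not occur, the original list is returned and no
   exception is raised. Tail-recursive: the prefix is accumulated in
   reverse and re-attached with List.rev_append. *)
let remove l x =
  let rec loop acc = function
    | [] -> l                                  (* [x] absent: unchanged *)
    | y :: rest when y = x -> List.rev_append acc rest
    | y :: rest -> loop (y :: acc) rest
  in
  loop [] l
```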
From: Nicolas C. <nca...@mo...> - 2006-07-12 12:13:07
|
> Is there a reason why ExtString.String doesn't contain/export the
> unsafe_* functions?
>
> Rich.

No reason known. It should.

Nicolas |
From: Richard J. <ri...@an...> - 2006-07-12 09:10:07
|
Is there a reason why ExtString.String doesn't contain/export the unsafe_* functions?

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Bardur A. <sp...@sc...> - 2006-06-16 14:49:14
|
Bardur Arantsson wrote:
> Richard Jones wrote:
>> On Thu, Jun 15, 2006 at 04:46:03PM +0100, Amit Dubey wrote:
>>> I actually have similar code, but I tend to use it over floats more than
>>> ints. Maybe this calls for a functor-based design?
>> I think it's a fine idea. Are there performance implications? For
>> reasons which I can't really fathom, functors don't seem to be fully
>> "inlined" when they are created, presumably resulting in extra
>> overhead.
>>
> [--snip--]

I said

> Even though it may be a tad slower (no inlining), it does allow you to
> create Counters which count objects of an identical type *without* being
> "compatible". This allows a bit more type safety if you want it.

I meant to say:

> I would use the functor approach. Even though it may be [...]

Cheers,

--
Bardur Arantsson <bar...@TH...>

Sticks and stones may break my bones, but hollow-points expand on impact. |
From: Bardur A. <sp...@sc...> - 2006-06-16 14:30:29
|
Richard Jones wrote:
> On Thu, Jun 15, 2006 at 04:46:03PM +0100, Amit Dubey wrote:
>> I actually have similar code, but I tend to use it over floats more than
>> ints. Maybe this calls for a functor-based design?
>
> I think it's a fine idea. Are there performance implications? For
> reasons which I can't really fathom, functors don't seem to be fully
> "inlined" when they are created, presumably resulting in extra
> overhead.
>

Yup, I believe OCaml currently uses dynamic dispatch for functors. I don't think there are any theoretical reasons that this necessarily has to be the case.

> What do other people think about the general concept?

+1, I like it and I think it should be added to ExtLib. I've used similar code a number of times, but have used the code reuse technique known as Copy and Paste, just because I couldn't be bothered to package something up.

Even though it may be a tad slower (no inlining), it does allow you to create Counters which count objects of an identical type *without* being "compatible". This allows a bit more type safety if you want it.

Finally, I don't like the name Counter; it's a bit too generic for my tastes. Maybe call it Histogram instead?

Cheers,

--
Bardur Arantsson <bar...@TH...>

"Mr. T to pity fool."
http://www.theonion.com |
From: Richard J. <ri...@an...> - 2006-06-16 12:09:12
|
On Thu, Jun 15, 2006 at 04:46:03PM +0100, Amit Dubey wrote:
> I actually have similar code, but I tend to use it over floats more than
> ints. Maybe this calls for a functor-based design?

I think it's a fine idea. Are there performance implications? For reasons which I can't really fathom, functors don't seem to be fully "inlined" when they are created, presumably resulting in extra overhead.

What do other people think about the general concept?

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Amit D. <ami...@gm...> - 2006-06-15 15:46:09
|
I actually have similar code, but I tend to use it over floats more than ints. Maybe this calls for a functor-based design?

    module type NUMBER =
    sig
      type t
      val zero : t
      val one : t
      val add : t -> t -> t
      val negate : t -> t
    end

    module Counter( N : NUMBER ) =
    struct
      let get_ref counter thing =
        try
          Hashtbl.find counter thing
        with Not_found ->
          let r = ref N.zero in
          Hashtbl.add counter thing r;
          r

      let incr counter thing =
        let r = get_ref counter thing in
        r := N.add !r N.one

      (...etc...)
    end

On 6/9/06, Richard Jones <ri...@an...> wrote:
>
> FWIW I wrote a simple Counter module, based on Hashtbl, which I find
> useful. I wonder if this would be a good addition to Extlib?
>
> Rich.
>
[--snip--] |
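To show how the functor-based design could be completed and instantiated for both ints and floats, here is a self-contained sketch. It fills in only a few operations; the names `one`, `IntNum`, `FloatNum`, `IntCounter` and `FloatCounter` are assumptions for this example and are not part of any proposed ExtLib interface.

```ocaml
module type NUMBER = sig
  type t
  val zero : t
  val one : t
  val add : t -> t -> t
  val negate : t -> t
end

module Counter (N : NUMBER) = struct
  (* Each counted thing maps to a mutable cell holding its count. *)
  type 'a t = ('a, N.t ref) Hashtbl.t

  let create () : 'a t = Hashtbl.create 13

  (* Find the cell for [thing], creating it at zero if needed. *)
  let get_ref counter thing =
    try Hashtbl.find counter thing
    with Not_found ->
      let r = ref N.zero in
      Hashtbl.add counter thing r;
      r

  let add counter thing n =
    let r = get_ref counter thing in
    r := N.add !r n

  let sub counter thing n = add counter thing (N.negate n)

  let incr counter thing = add counter thing N.one

  (* Reading a count does not create a cell. *)
  let get counter thing =
    try !(Hashtbl.find counter thing) with Not_found -> N.zero
end

module IntNum = struct
  type t = int
  let zero = 0
  let one = 1
  let add = ( + )
  let negate = ( ~- )
end

module FloatNum = struct
  type t = float
  let zero = 0.0
  let one = 1.0
  let add = ( +. )
  let negate = ( ~-. )
end

module IntCounter = Counter (IntNum)
module FloatCounter = Counter (FloatNum)
```

A float instance is handy for weighted counts, for example `FloatCounter.add c "x" 0.5`, which the int-only module cannot express.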
From: Richard J. <ri...@an...> - 2006-06-09 10:27:42
|
FWIW I wrote a simple Counter module, based on Hashtbl, which I find useful. I wonder if this would be a good addition to Extlib?

Rich.

----------------------------------------------------------------------
(** Basic counting module.
  * $Id: counter.mli,v 1.2 2006/06/08 15:24:47 rich Exp $
  *)

type 'a t
(** Count items of type ['a]. *)

val create : unit -> 'a t
(** Create a new counter. *)

val incr : 'a t -> 'a -> unit
(** [incr counter thing] adds one to the count of [thing]s in [counter]. *)

val decr : 'a t -> 'a -> unit
(** [decr counter thing] subtracts one from the count of [thing]s in [counter]. *)

val add : 'a t -> 'a -> int -> unit
(** [add counter thing n] adds [n] to the count of [thing]s in [counter]. *)

val sub : 'a t -> 'a -> int -> unit
(** [sub counter thing n] subtracts [n] from the count of [thing]s in [counter]. *)

val set : 'a t -> 'a -> int -> unit
(** [set counter thing n] sets the count of [thing]s to [n]. *)

val get : 'a t -> 'a -> int
(** [get counter thing] returns the count of [thing]s. *)

val read : 'a t -> (int * 'a) list
(** [read counter] reads the frequency of each thing. They are sorted
  * with the thing appearing most frequently first.
  *)

val length : 'a t -> int
(** Return the number of distinct things. See also {!Counter.total} *)

val total : 'a t -> int
(** Return the number of things counted (the total number of counts).
  * See also {!Counter.length}
  *)

val clear : 'a t -> unit
(** [clear counter] clears the counter. *)
----------------------------------------------------------------------
(* Basic counting module.
 * $Id: counter.ml,v 1.2 2006/06/08 15:24:47 rich Exp $
 *)

type 'a t = ('a, int ref) Hashtbl.t

let create () =
  Hashtbl.create 13

let get_ref counter thing =
  try
    Hashtbl.find counter thing
  with
    Not_found ->
      let r = ref 0 in
      Hashtbl.add counter thing r;
      r

let incr counter thing =
  let r = get_ref counter thing in
  incr r

let decr counter thing =
  let r = get_ref counter thing in
  decr r

let add counter thing n =
  let r = get_ref counter thing in
  r := !r + n

let sub counter thing n =
  let r = get_ref counter thing in
  r := !r - n

let set counter thing n =
  let r = get_ref counter thing in
  r := n

let get counter thing =
  try
    !(Hashtbl.find counter thing)
  with
    Not_found -> 0

let read counter =
  let counts =
    Hashtbl.fold (fun thing r xs -> (!r, thing) :: xs) counter [] in
  List.sort (fun (a, _) (b, _) -> compare (b : int) (a : int)) counts

let length = Hashtbl.length

let total counter =
  let total = ref 0 in
  Hashtbl.iter (fun _ r -> total := !total + !r) counter;
  !total

let clear counter =
  Hashtbl.clear counter
----------------------------------------------------------------------

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Bruno De F. <br...@de...> - 2006-03-09 11:33:57
|
Hello,

On 09 Mar 2006, at 10:48, Richard Jones wrote:
> Specifically we're parsing Apache logfiles from [a very large company
> which you will have heard of]. They produce about 1 GB of raw
> logfiles / day, which we read in, line at a time, and attempt to
> deduce interesting things. There's no possibility of fitting the
> logfiles into memory. Much of the problem involves counting how many
> times certain events happen.

OK, I see. But perhaps I should ask you then why exactly you think the current solution(s) are not elegant?

By the way, notice that you can build something very similar to Brian Hurt's solution basically by using Enum.fold and Enum.from:

    module StringMap = Map.Make(String);;

    let make_histogram : string Enum.t -> int StringMap.t =
      Enum.fold (fun word cnt ->
        try
          let c = StringMap.find word cnt in
          StringMap.add word (c+1) cnt
        with Not_found -> StringMap.add word 1 cnt
      ) StringMap.empty
    ;;

    let map_to_assoc_list m =
      StringMap.fold (fun k c l -> (k, c) :: l) m []
    ;;

    let count_words (f: unit -> string) =
      map_to_assoc_list (make_histogram (Enum.from f))
    ;;

The argument to count_words should raise an exception when it is exhausted:

    # count_words (fun () ->
        try read_line ()
        with End_of_file -> raise Enum.No_more_elements
      ) ;;
    aa
    bb
    abc
    bb
    abc
    <CTRL+D>
    - : (StringMap.key * int) list = [("aa", 1); ("abc", 2); ("bb", 2)]

Bye,
Bruno |
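The same fold-into-a-map approach can be sketched without ExtLib by substituting the standard library's `Seq` for `Enum`. This is a swapped-in technique for illustration, assuming OCaml 4.07 or later; it is not the code from the message, and `StringMap.bindings` stands in for the hand-written `map_to_assoc_list`.

```ocaml
module StringMap = Map.Make (String)

(* Fold a lazy sequence of words into an immutable word -> count map.
   Only the map is held in memory, never the whole input. *)
let make_histogram (words : string Seq.t) : int StringMap.t =
  Seq.fold_left
    (fun cnt word ->
      let c = try StringMap.find word cnt with Not_found -> 0 in
      StringMap.add word (c + 1) cnt)
    StringMap.empty words

(* Association list of (word, count), in increasing key order. *)
let count_words (words : string Seq.t) : (string * int) list =
  StringMap.bindings (make_histogram words)
```

For example, `count_words (List.to_seq ["aa"; "bb"; "abc"; "bb"; "abc"])` evaluates to `[("aa", 1); ("abc", 2); ("bb", 2)]`.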
From: Richard J. <ri...@an...> - 2006-03-09 09:49:16
|
On Thu, Mar 09, 2006 at 10:18:55AM +0100, Bruno De Fraine wrote:
> While this is a nice prototype, I obviously doubt this is what you
> want when counting "gigabytes of things". But then again, it's not
> entirely clear what you're asking for...

Specifically we're parsing Apache logfiles from [a very large company which you will have heard of]. They produce about 1 GB of raw logfiles / day, which we read in, line at a time, and attempt to deduce interesting things. There's no possibility of fitting the logfiles into memory. Much of the problem involves counting how many times certain events happen.

Rich.

--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com |
From: Bruno De F. <br...@de...> - 2006-03-09 09:19:05
|
Hello Richard,

On 08 Mar 2006, at 16:33, Richard Jones wrote:
> It's not particularly elegant ...
>
> Is there a better structure that I should be using, or should we add
> one to Extlib?

There is definitely a way to do it in a more elegant functional style, provided you have a general group_by function (which I think Extlib still lacks, and which I therefore try to plug here):

    (*s [group_by f l] creates an associative list that groups the
        elements of l according to their image under f. *)
    val group_by : ('a -> 'b) -> 'a list -> ('b * 'a list) list

For example:

    # group_by String.length ["aa";"bbb";"abc";"bb"] ;;
    - : (int * string list) list = [(3, ["abc"; "bbb"]); (2, ["bb"; "aa"])]

Now, with two more auxiliary functions:

    let identity x = x ;; (* Already present in Std *)
    let map_snd f (a,b) = (a, f b) ;;

A concise solution to your problem can be given as:

    let results =
      List.map (map_snd List.length) (group_by identity words)
    ;;

While this is a nice prototype, I obviously doubt this is what you want when counting "gigabytes of things". But then again, it's not entirely clear what you're asking for...

For reference, this is an implementation of group_by:

    let group_by f list =
      List.fold_left (fun accu el ->
        let img = f el and found = ref false in
        let new_accu = List.rev_map (fun grp ->
          if !found || (fst grp) <> img then grp
          else begin
            found := true;
            (img,el::(snd grp))
          end
        ) accu in
        if !found then new_accu
        else (img,[el]) :: accu
      ) [] list
    ;;

Perhaps a more general solution would have a signature like:

    val group_by : ('a -> 'b) -> 'a Enum.t -> ('b, 'a Enum.t) Hashtbl.t

Bye,
Bruno |
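The hash-table-returning variant suggested at the end of the message can be sketched as follows. This is an illustration under assumptions: it uses the standard library's `Seq` in place of ExtLib's `Enum`, stores each group as a plain list rather than an enumeration, and the helper `group_sizes` is made up for the example. Each element costs one lookup and one update, instead of a pass over all existing groups.

```ocaml
(* [group_by f seq] groups the elements of [seq] by their image under
   [f]. Elements are consed onto their group, so each group lists its
   members in reverse order of arrival. *)
let group_by (f : 'a -> 'b) (seq : 'a Seq.t) : ('b, 'a list) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  Seq.iter
    (fun x ->
      let key = f x in
      let group = try Hashtbl.find tbl key with Not_found -> [] in
      Hashtbl.replace tbl key (x :: group))
    seq;
  tbl

(* Size of every group, as an unordered association list. *)
let group_sizes tbl =
  Hashtbl.fold (fun key group acc -> (key, List.length group) :: acc) tbl []
```

With the input from the message, `group_by String.length (List.to_seq ["aa"; "bbb"; "abc"; "bb"])` maps 3 to `["abc"; "bbb"]` and 2 to `["bb"; "aa"]`.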
From: Martin J. <mar...@em...> - 2006-03-08 19:13:45
|
On Wed, 8 Mar 2006, Richard Jones wrote:
>
> This is a problem I often have - count "things" in an imperative way.
> As an example, read a document and count the frequency of each word in
> the document.
>
> The way I normally solve it is something like this:
>
>   let results = Hashtbl.create 31 in
>   List.iter (
>     fun word ->
>       try
>         let count = Hashtbl.find results word in
>         Hashtbl.replace results word (count+1)
>       with Not_found ->
>         Hashtbl.add results word 1
>   ) words;
>   let results =
>     Hashtbl.fold (fun word count xs -> (count, word) :: xs) results [] in
>   (* ... *)
>
> It's not particularly elegant ...
>
> Is there a better structure that I should be using, or should we add
> one to Extlib?

Given the implementation of Hashtbl (buckets = immutable lists), it should be faster to not replace the table entry, but simply to increment an int ref. It also needs only one hash operation instead of two when the counter already exists.

    let r =
      try Hashtbl.find tbl key
      with Not_found ->
        let r = ref 0 in
        Hashtbl.add tbl key r;
        r in
    incr r

Martin

>
> Rich.
>
> PS. Note that "words" is only an example. In real life I'm processing
> gigabytes of "things", and they don't live in a convenient list in
> memory either -- hence the imperative approach.
>
> --
> Richard Jones, CTO Merjis Ltd.
> Merjis - web marketing and technology - http://merjis.com
> Team Notepad - intranets and extranets for business - http://team-notepad.com
>

--
Martin Jambon, PhD
http://martin.jambon.free.fr
Edit http://wikiomics.org, bioinformatics wiki |
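Put together, the int-ref technique applied to a stream of items that never sits in memory as a list could look like the sketch below. The producer convention (`next ()` returning `None` when exhausted) and the names `bump` and `count_stream` are assumptions made for this example.

```ocaml
(* Increment the counter for [key], creating the cell on first sight.
   An existing key costs one lookup and no table modification. *)
let bump tbl key =
  let r =
    try Hashtbl.find tbl key
    with Not_found ->
      let r = ref 0 in
      Hashtbl.add tbl key r;
      r
  in
  incr r

(* Consume items one at a time from [next] (hypothetical producer that
   returns None at end of input) and return (count, item) pairs with
   the most frequent item first. *)
let count_stream (next : unit -> 'a option) : (int * 'a) list =
  let tbl = Hashtbl.create 31 in
  let rec loop () =
    match next () with
    | None -> ()
    | Some item -> bump tbl item; loop ()
  in
  loop ();
  Hashtbl.fold (fun item r acc -> (!r, item) :: acc) tbl []
  |> List.sort (fun (a, _) (b, _) -> compare (b : int) a)
```

For log processing, `next` could wrap `input_line` on a channel and map `End_of_file` to `None`, so that only the table of distinct items is ever held in memory.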