You can subscribe to this list here.
2003 |
Jan
|
Feb
(81) |
Mar
(97) |
Apr
(88) |
May
(80) |
Jun
(170) |
Jul
(9) |
Aug
|
Sep
(18) |
Oct
(58) |
Nov
(19) |
Dec
(7) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2004 |
Jan
(22) |
Feb
(9) |
Mar
(28) |
Apr
(164) |
May
(186) |
Jun
(101) |
Jul
(143) |
Aug
(387) |
Sep
(69) |
Oct
(14) |
Nov
(8) |
Dec
(99) |
2005 |
Jan
(10) |
Feb
(34) |
Mar
(24) |
Apr
(7) |
May
(41) |
Jun
(20) |
Jul
(3) |
Aug
(23) |
Sep
(2) |
Oct
(26) |
Nov
(41) |
Dec
(7) |
2006 |
Jan
(6) |
Feb
(3) |
Mar
(11) |
Apr
|
May
|
Jun
(5) |
Jul
(8) |
Aug
(20) |
Sep
|
Oct
(6) |
Nov
(5) |
Dec
|
2007 |
Jan
|
Feb
(1) |
Mar
|
Apr
(3) |
May
(2) |
Jun
|
Jul
|
Aug
(1) |
Sep
(7) |
Oct
(6) |
Nov
(19) |
Dec
(11) |
2008 |
Jan
|
Feb
(7) |
Mar
(9) |
Apr
(21) |
May
(42) |
Jun
(27) |
Jul
(28) |
Aug
(26) |
Sep
(16) |
Oct
(32) |
Nov
(49) |
Dec
(65) |
2009 |
Jan
(35) |
Feb
(20) |
Mar
(36) |
Apr
(42) |
May
(111) |
Jun
(99) |
Jul
(70) |
Aug
(25) |
Sep
(15) |
Oct
(29) |
Nov
(3) |
Dec
(18) |
2010 |
Jan
(10) |
Feb
(4) |
Mar
(57) |
Apr
(63) |
May
(71) |
Jun
(64) |
Jul
(30) |
Aug
(49) |
Sep
(11) |
Oct
(4) |
Nov
(2) |
Dec
(3) |
2011 |
Jan
(1) |
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
(1) |
Jul
(1) |
Aug
(2) |
Sep
|
Oct
|
Nov
|
Dec
(1) |
2012 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2013 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2014 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
(2) |
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
(1) |
Dec
|
2015 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2021 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
|
2022 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2023 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
(1) |
2024 |
Jan
(1) |
Feb
(3) |
Mar
(6) |
Apr
(2) |
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(1) |
2025 |
Jan
(2) |
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(1) |
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
From: Alan P. <ap...@re...> - 2003-06-08 10:25:40
|
Here's a patch adding support for t.fast to Enum.iter*, fold*, and append. Note that I still think that having Enum and Stream as separate types would be a clearer way of doing this. I tried to follow the style being used in the file, though it is my opinion that tab characters are evil evil evil. I was really surprised that ocaml does not allow: for _ = 1 to 5 do print_string "hi\n" done ^^^ --- /usr/local/src/ocaml-extlib/extlib-dev/enum.ml Tue Jun 3 19:21:46 2003 +++ ./enum.ml Sun Jun 8 03:07:45 2003 @@ -24,7 +24,7 @@ mutable fast : bool; } -(* raised by 'next' functions, should NOT goes outside the API *) +(* raised by 'next' functions, should NOT go outside the API *) exception No_more_elements let _dummy () = assert false @@ -50,7 +50,7 @@ f (n - 1 - !count)); clone = (fun () -> init !count f); fast = true; - } + } let force t = let rec clone enum count = @@ -158,90 +158,153 @@ t.clone() let iter f t = - let rec loop () = - f (t.next()); - loop(); - in - try - loop(); - with - No_more_elements -> () + if t.fast then + (* 10% faster than the exception-based implementation. + * A tail-recursive approach was slower than the for loop. + * Unfortunately, "for _ =" is not allowed. + *) + for idx = 1 to t.count () do + f (t.next ()) + done + else + let rec loop () = + f (t.next()); + loop(); + in + try + loop(); + with + No_more_elements -> () let iteri f t = - let rec loop idx = - f idx (t.next()); - loop (idx+1); - in - try - loop 0; - with - No_more_elements -> () + if t.fast then + for idx = 0 to (t.count ()) - 1 do + f idx (t.next ()) + done + else + let rec loop idx = + f idx (t.next()); + loop (idx+1); + in + try + loop 0; + with + No_more_elements -> () let iter2 f t u = - let rec loop () = - f (t.next()) (u.next()); - loop () - in - try - loop () - with - No_more_elements -> () + if t.fast then + for idx = 1 to min (t.count ()) (u.count ()) do + f (t.next ()) (u.next ()) + done + else + let rec loop () = + f (t.next()) (u.next()); + loop () + in + try + loop () + with + No_more_elements -> () let iter2i f t u = - let rec loop idx = - f idx (t.next()) (u.next()); - loop (idx + 1) - in - try - loop 0 - with - No_more_elements -> () + if t.fast then + for idx = 0 to (min (t.count ()) (u.count ())) - 1 do + f idx (t.next ()) (u.next ()) + done + else + let rec loop idx = + f idx (t.next()) (u.next()); + loop (idx + 1) + in + try + loop 0 + with + No_more_elements -> () let fold f init t = - let acc = ref init in - let rec loop() = - acc := f (t.next()) !acc; - loop() - in - try - loop() - with - No_more_elements -> !acc + if t.fast then begin + (* Runs in 61% of the time required by the exception-based + * implementation. + * A tail-recursive approach was 5% slower than the for loop. + *) + let acc = ref init in + for idx = 1 to (t.count ()) do + acc := f (t.next ()) !acc + done; + !acc + end + else + let acc = ref init in + let rec loop() = + acc := f (t.next()) !acc; + loop() + in + try + loop() + with + No_more_elements -> !acc let foldi f init t = - let acc = ref init in - let rec loop idx = - acc := f idx (t.next()) !acc; - loop (idx + 1) - in - try - loop 0 - with - No_more_elements -> !acc + if t.fast then begin + let acc = ref init in + for idx = 0 to (t.count ()) - 1 do + acc := f idx (t.next ()) !acc + done; + !acc + end + else + let acc = ref init in + let rec loop idx = + acc := f idx (t.next()) !acc; + loop (idx + 1) + in + try + loop 0 + with + No_more_elements -> !acc let fold2 f init t u = - let acc = ref init in - let rec loop() = - acc := f (t.next()) (u.next()) !acc; - loop() - in - try - loop() - with - No_more_elements -> !acc + if t.fast then begin + let acc = ref init in + for idx = 0 to (min (t.count ()) (u.count ())) - 1 do + acc := f (t.next ()) (u.next ()) !acc + done; + !acc + end + else + let acc = ref init in + let rec loop() = + acc := f (t.next()) (u.next()) !acc; + loop() + in + try + loop() + with + No_more_elements -> !acc let fold2i f init t u = - let acc = ref init in - let rec loop idx = - acc := f idx (t.next()) (u.next()) !acc; - loop (idx + 1) - in - try - loop 0 - with - No_more_elements -> !acc + if t.fast then begin + let acc = ref init in + for idx = 0 to (min (t.count ()) (u.count ())) - 1 do + acc := f idx (t.next ()) (u.next ()) !acc + done; + !acc + end + else + let acc = ref init in + let rec loop idx = + acc := f idx (t.next()) (u.next()) !acc; + loop (idx + 1) + in + try + loop 0 + with + No_more_elements -> !acc let find f t = + (* Using count is slower on long enums, and only 5-10% faster on very + * short ones, so don't bother. + *) let rec loop () = let x = t.next() in if f x then x else loop() @@ -293,29 +356,62 @@ from2 next (fun () -> filter f (t.clone())) let rec filter_map f t = - let rec next () = - match f (t.next()) with - | None -> next() - | Some x -> x - in + let rec next () = + match f (t.next()) with + | None -> next() + | Some x -> x + in from2 next (fun () -> filter_map f (t.clone())) +(* + * Note that e.next can change, but only from false to true. + * + * TODO: given that ta.count is called O(N) times, perhaps its result should be + * cached locally instead. This could be significant in the case where + * ta is a deeply nested append. + *) let rec append ta tb = - let append_next = ref _dummy in - append_next := (fun () -> - try - ta.next() - with - No_more_elements -> - append_next := tb.next; - !append_next ()); - { - count = (fun () -> ta.count() + tb.count()); - next = (fun () -> !append_next ()); - clone = (fun () -> append (ta.clone()) (tb.clone())); + let e = { + count = _dummy; + next = _dummy; + clone = _dummy; fast = ta.fast && tb.fast; } + and append_count = ref (fun () -> ta.count() + tb.count()) + and append_clone = ref (fun () -> append (ta.clone()) (tb.clone())) + and append_next = ref _dummy + in + append_next := begin + if ta.fast then + (fun () -> + match ta.count () with + | 0 -> + append_count := tb.count; + append_clone := tb.clone; + append_next := tb.next; + e.fast <- tb.fast; + !append_next () + | _ -> ta.next ()) + else + (fun () -> + try + ta.next() + with + No_more_elements -> + append_count := tb.count; + append_next := tb.next; + append_clone := tb.clone; + e.fast <- tb.fast; + !append_next ()) + end; + e.count <- (fun () -> !append_count ()); + e.next <- (fun () -> !append_next ()); + e.clone <- (fun () -> !append_clone ()); + e +(* + * TODO: use t.fast + *) let rec concat t = let concat_ref = ref _dummy in let rec concat_next() = @@ -329,4 +425,4 @@ !concat_ref () in concat_ref := concat_next; - from2 (fun () -> !concat_ref ()) (fun () -> concat (t.clone())) \ No newline at end of file + from2 (fun () -> !concat_ref ()) (fun () -> concat (t.clone())) |
From: Brian H. <bh...@sp...> - 2003-06-06 18:12:43
|
On Fri, 6 Jun 2003, Nicolas Cannasse wrote: > Hi list, > > Here's a module proposal for adding to the ExtLib. > This is a binary search tree ( module BinTree ). > Operations add, remove, find are done in Log(n) when the tree is optimized. > This is a mutable version of the data structure. > Is there a reason you did this as a mutable tree and not the "standard" applicative version? Brian |
From: Amit D. <ad...@Co...> - 2003-06-06 11:22:46
|
> > is there any advantage of > > > > > t : ('a ,'b) tree ref; > > over: > > > > mutable t : ('a, 'b) tree; > It's for recursion, since are left and right trees are both ('a , 'b ) tree > ref. The problem is that you cannot pass a mutable field to a function > like a ref. I'm a bit confused... unless you want to make a change to a subtree from two places, this shouldn't be necessary. Taking a quick peek at the code, it doesn't seem like this happens (but maybe I missed something). If this isn't the case, you could just as easily pass around the unmutable structure, and assign it where you need. > > > Also, what are the relative advantages of having this vs. > > putting a mutable wrapper around Map? Map, if I'm not mistaken, > > makes some guarantees of the trees being balanced (both add > > and remove in Map call the balancing function). > > Map does not allow to have several items with the same key in it. > Calling add is replacing the current binded item if any. > I think it's more convenient to have a structure that can "shadow" items > just like Hashtbl do. Ok, makes sense! However, the main thrust behind my suggestion was more about the worst-case performance of insertion and removal... Map already has the balancing code written, even if a wrapper isn't appropriate, perhaps you could steal'n'hack the bal, add and remove functions from there? -Amit |
From: Nicolas C. <war...@fr...> - 2003-06-06 10:27:06
|
> is there any advantage of > > > type ('a,'b) t = { > > cmp : ('a -> 'a -> int); > > t : ('a ,'b) tree ref; > > } > > over: > > type ('a,'b) t = { > cmp : ('a -> 'a -> int); > mutable t : ('a, 'b) tree; > } > > Doesn't the first version require a little bit more memory? > I don't think it's a big deal, but... It's for recursion, since are left and right trees are both ('a , 'b ) tree ref. The problem is that you cannot pass a mutable field to a function like a ref. > Also, what are the relative advantages of having this vs. > putting a mutable wrapper around Map? Map, if I'm not mistaken, > makes some guarantees of the trees being balanced (both add > and remove in Map call the balancing function). Map does not allow to have several items with the same key in it. Calling add is replacing the current binded item if any. I think it's more convenient to have a structure that can "shadow" items just like Hashtbl do. Nicolas Cannasse |
From: Nicolas C. <war...@fr...> - 2003-06-06 10:26:10
|
> is there any advantage of > > > type ('a,'b) t = { > > cmp : ('a -> 'a -> int); > > t : ('a ,'b) tree ref; > > } > > over: > > type ('a,'b) t = { > cmp : ('a -> 'a -> int); > mutable t : ('a, 'b) tree; > } > > Doesn't the first version require a little bit more memory? > I don't think it's a big deal, but... It's for recursion, since are left and right trees are both ('a , 'b ) tree ref. The problem is that you cannot pass a mutable field to a function like a ref. > Also, what are the relative advantages of having this vs. > putting a mutable wrapper around Map? Map, if I'm not mistaken, > makes some guarantees of the trees being balanced (both add > and remove in Map call the balancing function). Map does not allow to have several items with the same key in it. Calling add is replacing the current binded item if any. I think it's more convenient to have a structure that can "shadow" items just like Hashtbl do. Nicolas Cannasse |
From: Amit D. <ad...@Co...> - 2003-06-06 09:35:20
|
Just a quick question, is there any advantage of > type ('a,'b) t = { > cmp : ('a -> 'a -> int); > t : ('a ,'b) tree ref; > } over: type ('a,'b) t = { cmp : ('a -> 'a -> int); mutable t : ('a, 'b) tree; } Doesn't the first version require a little bit more memory? I don't think it's a big deal, but... Also, what are the relative advantages of having this vs. putting a mutable wrapper around Map? Map, if I'm not mistaken, makes some guarantees of the trees being balanced (both add and remove in Map call the balancing function). -Amit > > This is a multi-part message in MIME format. > > ------=_NextPart_000_0015_01C32C45.52938130 > Content-Type: text/plain; > charset="iso-8859-1" > Content-Transfer-Encoding: 7bit > > Hi list, > > Here's a module proposal for adding to the ExtLib. > This is a binary search tree ( module BinTree ). > Operations add, remove, find are done in Log(n) when the tree is optimized. > This is a mutable version of the data structure. > > Nicolas Cannasse > > ------=_NextPart_000_0015_01C32C45.52938130 > Content-Type: application/octet-stream; > name="binTree.ml" > Content-Transfer-Encoding: 7bit > Content-Disposition: attachment; > filename="binTree.ml" > > > type ('a,'b) tree = > | Empty > | Tree of ('a * 'b * ('a,'b) tree ref * ('a,'b) tree ref) > > type ('a,'b) t = { > cmp : ('a -> 'a -> int); > t : ('a ,'b) tree ref; > } > > let empty() = > { > cmp = compare; > t = ref Empty; > } > > let make cmp = > { > cmp = cmp; > t = ref Empty; > } > > let rec add t key item = > let rec loop r = > match !r with > | Empty -> r := Tree(key,item,ref Empty,ref Empty) > | Tree(k,i,tl,tr) -> > let n = t.cmp k key in > if n = 0 then r := Tree(key,item,tl,ref (Tree(k,i,ref Empty,tr))) > else if n > 0 then loop tl > else loop tr > in > loop t.t > > let rec length t = > let rec loop r = > match !r with > | Empty -> 0 > | Tree(_,_,tl,tr) -> 1+(loop tl)+(loop tr) > in > loop t.t > > let rec depth t = > let rec loop r = > match !r with > | Empty -> 0 > | Tree(_,_,tl,tr) -> 1+(max (loop tl) (loop tr)) > in > loop t.t > > let rec find t key = > let rec loop r = > match !r with > | Empty -> raise Not_found > | Tree(k,i,tl,tr) -> > let n = t.cmp k key in > if n = 0 then i > else if n > 0 then loop tl > else loop tr > in > loop t.t > > let iter f t = > let rec loop r = > match !r with > | Empty -> () > | Tree(k,i,tl,tr) -> > loop tl; > f k i; > loop tr > in > loop t.t > > let copy t = > let rec loop r = > match !r with > | Empty -> ref Empty > | Tree (k,i,tl,tr) -> > ref (Tree(k,i,loop tl,loop tr)) > in > { > cmp = t.cmp; > t = loop t.t; > } > > let clear t = t.t := Empty > > let rec remove t key = > let p = ref Empty in > let rec remove_aux r = > match !r with > | Empty -> raise Not_found > | Tree(k,i,tleft,tright) -> > let n = t.cmp k key in > if n = 0 then r else begin > p := !r; > if n > 0 then remove_aux tleft else remove_aux tright > end > in > let found = remove_aux t.t in > match !found with > | Empty -> assert false > | Tree(k,i,fleft,fright) -> > let attach subtree = > match !p with > | Empty -> t.t := subtree > | Tree(_,_,pleft,pright) -> > if found == pleft then > pleft := subtree > else > pright := subtree > in > if !fright = Empty then attach !fleft > else begin > let lp = ref Empty in > let rec mostleft_aux t = > match !t with > | Empty -> assert false > | Tree(_,_,left,_) -> > if !left = Empty then t else begin > lp := !t; > mostleft_aux left; > end > in > let mostleft = mostleft_aux fright in > let mkey,mitem,mleft,mright = > (match !mostleft with > | Empty -> assert false > | Tree infos -> infos) in > match !lp with > | Empty -> > attach (Tree(mkey,mitem,fleft,mright)) > | Tree(_,_,lpleft,lpright) -> > attach (Tree(mkey,mitem,fleft,fright)); > lpleft := !mright; > end > > type ('a,'b) istack = > | SItem of ('a * 'b) > | STree of ('a , 'b) tree ref > > let rec make_enum f s = > let rec next () = > match !s with > | [] -> raise Enum.No_more_elements > | (SItem x) :: h -> > s := h; > f x > | (STree t) :: h -> > let rec loop t = > match !t with > | Empty -> > s := h; > next() > | Tree (k,i,left,right) -> > s := (SItem (k,i)) :: (STree right) :: !s; > loop left > in > loop t > in > let rec tcount t = > match !t with > | Empty -> 0 > | Tree(_,_,tl,tr) -> 1 + tcount tl + tcount tr > in > let count() = > List.fold_left (fun acc c -> > match c with > | SItem _ -> acc + 1 > | STree t -> acc + tcount t > ) 0 !s > in > let clone() = > make_enum f (ref !s) > in > Enum.make ~next ~count ~clone > > let pairs t = > make_enum (fun x -> x) (ref [STree t.t]) > > let enum t = > make_enum snd (ref [STree t.t]) > > let keys t = > make_enum fst (ref [STree t.t]) > > let optimize t = assert false > > let to_list t = > let rec loop dst r = > match !r with > | Empty -> dst > | Tree(k,i,tl,tr) -> > let dst = loop dst tl in > let r = [ (k,i) ] in > Obj.set_field (Obj.repr dst) 1 (Obj.repr r); > loop r tr > in > let x = [Obj.magic()] in > ignore(loop x t.t); > List.tl x > > let optimize t = > let l = ref (to_list t) in > let len = List.length !l in > let rec opt_list len = > if len = 0 then > Empty > else if len = 1 then begin > match !l with > | [] -> assert false > | (k,i) :: q -> > l := q; > Tree(k,i,ref Empty,ref Empty) > end else begin > let left = opt_list (len/2) in > let k, i = (match !l with > | [] -> assert false > | x :: q -> > l := q; > x) in > let right = opt_list ((len-1)/2) in > Tree(k,i,ref left,ref right) > end > in > t.t := opt_list len > ------=_NextPart_000_0015_01C32C45.52938130 > Content-Type: application/octet-stream; > name="binTree.mli" > Content-Transfer-Encoding: 7bit > Content-Disposition: attachment; > filename="binTree.mli" > > > type ('a,'b) t > > val empty : unit -> ('a,'b) t > val make : ('a -> 'a -> int) -> ('a,'b) t > > (* O(N) *) > val length : ('a,'b) t -> int > val depth : ('a,'b) t -> int > val copy : ('a,'b) t -> ('a,'b) t > val optimize : ('a,'b) t -> unit > > (* log N *) > val add : ('a,'b) t -> 'a -> 'b -> unit > val find : ('a,'b) t -> 'a -> 'b (* raise Not_found *) > val remove : ('a,'b) t -> 'a -> unit (* raise Not_found *) > > val enum : ('a,'b) t -> 'b Enum.t > val keys : ('a,'b) t -> 'a Enum.t > val pairs : ('a,'b) t -> ('a * 'b) Enum.t > val to_list : ('a,'b) t -> ('a * 'b) list > > val iter : ('a -> 'b -> unit) -> ('a,'b) t -> unit > > val clear : ('a,'b) t -> unit > ------=_NextPart_000_0015_01C32C45.52938130-- > > > > ------------------------------------------------------- > This SF.net email is sponsored by: Etnus, makers of TotalView, The best > thread debugger on the planet. Designed with thread debugging features > you've never dreamed of, try TotalView 6 free at www.etnus.com. > _______________________________________________ > ocaml-lib-devel mailing list > oca...@li... > https://lists.sourceforge.net/lists/listinfo/ocaml-lib-devel > |
From: Nicolas C. <war...@fr...> - 2003-06-06 09:30:01
|
> Tree node here takes 3 (for Tree+tuple, and two refs) allocations where > it can take 2 (or 1?) with: > > | Tree of ('a, 'b) tree_rec > > and type ('a, 'b) tree_rec = > { > a : 'a; > b : 'b; > mutable left : ('a, 'b) tree; > mutable right : ('a, 'b) tree; > } > > and this is more readable IMHO. True, thanks. That piece of code is pretty old, I just added the enum features without thinking about the internal representation. I'll update it this way when I have some time. Nicolas Cannasse |
From: Nicolas C. <war...@fr...> - 2003-06-06 07:06:03
|
Hi list, Here's a module proposal for adding to the ExtLib. This is a binary search tree ( module BinTree ). Operations add, remove, find are done in Log(n) when the tree is optimized. This is a mutable version of the data structure. Nicolas Cannasse |
From: MikhailFedotov <mi...@ki...> - 2003-06-05 07:08:56
|
Hi! > This is what I was talking about when I mentioned "makefile hacks". > Unfortunately, I know enough about the build environment of Windows > (despite several years busily repressing memories) that I am > wary of doing and scripting hacks. The only scripting language I am > sure exists is Ocaml. > > Hmm. That might be enough. All I need to do is: > > - move the original code foo.ml to a backup file. > > - concatenate the original code and the unit test code into > the new foo.ml I thought that the only problem is that internal functions are not present in the external interface, and can be solved by extending external inteface to include everything for the test version of the library. This way you'll end up with "testing the library", not just the code, as well as library compiling process, etc. More clean approach, without unwanted surprises. Mikhail |
From: Nicolas C. <war...@fr...> - 2003-06-05 02:33:01
|
> Here's an interesting debate point- what does a 1.0 release of ExtLib look > like? How close are we? > > Starting this discussion is tantaumont to saying "I think it's ready now", > which I'm not. I'm just thinking of maybe getting a rough sketch of what > people want/are working on. > > Here's my short list in approximate order of priority: > - fix the bug in psqueue > - unit tests at least some > - generic red-black tree library > - doubly linked list library > - bit fields/bit sets > > Comments? I think we should first review the current code, implement missing functions, add documentation everywhere, and have a look at several past projects ( CDK and friends ) to see if we've been including all needed "standard" functions. Then we should be able to make a first release, and then continue adding useful modules. Nicolas Cannasse |
From: Nicolas C. <war...@fr...> - 2003-06-05 02:10:09
|
[...] > Asserts get used *in addition to* strong type checking and unit tests. > And stay in the final "production" code. Yes, this is way overkill. As > the guy said when asked what do with his mother-in-law's mortal remains, > "embalm, cremate, then bury. Take no chances." There is a flag in ocamlc that does what you want : -noassert : Don't compile assertion checks. That way you may have a "debug" version and "release" version of the library. Note that the "assert false" are not removed, because they're handled as special cases. Nicolas Cannasse |
From: Brian H. <bri...@ql...> - 2003-06-04 22:35:25
|
Here's an interesting debate point- what does a 1.0 release of ExtLib look like? How close are we? Starting this discussion is tantaumont to saying "I think it's ready now", which I'm not. I'm just thinking of maybe getting a rough sketch of what people want/are working on. Here's my short list in approximate order of priority: - fix the bug in psqueue - unit tests at least some - generic red-black tree library - doubly linked list library - bit fields/bit sets Comments? Brian |
From: Brian H. <bh...@sp...> - 2003-06-04 22:02:57
|
On Thu, 5 Jun 2003, Mikhail Fedotov wrote: > Hi! > > This won't be unit testing in the usual sense then. Unit > testing covers external interface of the library; the main > point is to figure out that something is broken; as long as > it is known, it is easy to find exact error line, even by > putting "let () = print_string error_here in" in the code. > > Including tests for internal functions does not buy much > but makes the whole thing more complex. This kind of > testing is usually implemented by inserting asserts in C > code and by raising Invalid_argument in some ocaml > functions. My general experience has been that the smaller the hunk of code you test, the more likely you are to find problems and the easier it is to get full coverage (all possible conditions are tested). Yes, this makes things more complex. What I want is classic C conditional compilation. I want to be able to do "make test" which builds a copy of the libraries with unit testing enabled, and runs the unit tests. "make" would then build copies of the library sans the unit test code for "production". Asserts get used *in addition to* strong type checking and unit tests. And stay in the final "production" code. Yes, this is way overkill. As the guy said when asked what do with his mother-in-law's mortal remains, "embalm, cremate, then bury. Take no chances." > > If you really need to test internal function I think (with > my limited ocaml expirience) that it is possible to compile > a special version of the library by copying library.ml to > testlibrary.ml and creating the specil version of the > library out of it which would include all the functions. I > don't remember if .mli file is strictly required (I've > never tried to build a library without it), but if it is > not, then it is even easier. > This is what I was talking about when I mentioned "makefile hacks". Unfortunately, I know enough about the build environment of Windows (despite several years busily repressing memories) that I am wary of doing and scripting hacks. The only scripting language I am sure exists is Ocaml. Hmm. That might be enough. All I need to do is: - move the original code foo.ml to a backup file. - concatenate the original code and the unit test code into the new foo.ml - do the same with foo.mli - Compile foo.ml[i] - run the unit tests - delete the new foo.ml[i] - move the originals back Brian |
From: Mikhail F. <mi...@ki...> - 2003-06-04 21:00:07
|
Hi! > > > > My preference would be to be able to compile the > > > > library *without* the unit testing code in it. > > > > This will require some thought, and probably > > > > some makefile hacking. > > > > > > Why Brian ? > > > I mean, having a big ExtLib.cma is not a problem, > since > > > we're using different modules, > > > > Unit testing code is usually implemented as a seperate > > program which uses the library, not as part of the > library. > > > > This actually limits the ability of the unit testing to > nail down what > just broke. For example, the PSQueue module has internal > functions > "rotate_right" and "rotate_left". I'd like the unit > tests to test those > functions independently- and balance as seperate from > rotate. Allowing me > to tell if I broke insert, balance, or a rotate. > > My inclination is to agree with Nicolas and just toss the > test functions > into the library and worry about size when it becomes a > problem. This won't be unit testing in the usual sense then. Unit testing covers external interface of the library; the main point is to figure out that something is broken; as long as it is known, it is easy to find exact error line, even by putting "let () = print_string error_here in" in the code. Including tests for internal functions does not buy much but makes the whole thing more complex. This kind of testing is usually implemented by inserting asserts in C code and by raising Invalid_argument in some ocaml functions. If you really need to test internal function I think (with my limited ocaml expirience) that it is possible to compile a special version of the library by copying library.ml to testlibrary.ml and creating the specil version of the library out of it which would include all the functions. I don't remember if .mli file is strictly required (I've never tried to build a library without it), but if it is not, then it is even easier. Mikhail |
From: Brian H. <bh...@sp...> - 2003-06-04 19:12:08
|
On Wed, 4 Jun 2003, Mikhail Fedotov wrote: > Hi! > > > > My preference would be to be able to compile the > > > library *without* the unit testing code in it. > > > This will require some thought, and probably > > > some makefile hacking. > > > > Why Brian ? > > I mean, having a big ExtLib.cma is not a problem, since > > we're using different modules, > > Unit testing code is usually implemented as a seperate > program which uses the library, not as part of the library. > This actually limits the ability of the unit testing to nail down what just broke. For example, the PSQueue module has internal functions "rotate_right" and "rotate_left". I'd like the unit tests to test those functions independently- and balance as seperate from rotate. Allowing me to tell if I broke insert, balance, or a rotate. My inclination is to agree with Nicolas and just toss the test functions into the library and worry about size when it becomes a problem. Brian |
From: Nicolas C. <war...@fr...> - 2003-06-04 05:56:06
|
> > > My preference would be to be able to compile the > > > library *without* the unit testing code in it. > > > This will require some thought, and probably > > > some makefile hacking. > > > > Why Brian ? > > I mean, having a big ExtLib.cma is not a problem, since > > we're using different modules, > > Unit testing code is usually implemented as a seperate > program which uses the library, not as part of the library. Mea Culpa ! I really misunderstood the topic of this thread :) I was thinking that we could add an "Unit test library" to the ExtLib, which would of course be compiled with it. As for "writing unit tests for ExtLib" , if someone have enough will to so so, then why not... but I don't see myself covering all Enum cases that can occur with unit tests :-) I think we can make releases/bugfix enough often to wait for the users to find bugs. Nicolas Cannasse |
From: Mikhail F. <mi...@ki...> - 2003-06-04 05:43:16
|
Hi! > > My preference would be to be able to compile the > > library *without* the unit testing code in it. > > This will require some thought, and probably > > some makefile hacking. > > Why Brian ? > I mean, having a big ExtLib.cma is not a problem, since > we're using different modules, Unit testing code is usually implemented as a seperate program which uses the library, not as part of the library. Mikhail |
From: Nicolas C. <war...@fr...> - 2003-06-04 02:50:32
|
> let default_has_next next count = > let lookahead = ref None in > let has_next' () = > match lookahead with > Some x -> true > | None -> try > lookahead := Some (next ()); > true > with No_more_elements -> false > and next' () = > match lookahead with > Some x -> > lookahead := None; > x > | None -> > next() > and count' () -> > match lookahead with > Some _ -> 1 + (count ()) > | None -> count() This is a little more complicated than that... Actually calling count() can make next being called ( if this is a count calling force ), and then the return of count() will be the total number of elements, including the lookhead, so you'll count 1 more element since you're adding 1. You could first call count and then match again the lookahead to see if still Some.... but that's pretty inneficient. You can have a look at current Enum.peek implementation, I think it resolves the problem in a nice and elegant way. Nicolas Cannasse |
From: Nicolas C. <war...@fr...> - 2003-06-04 02:48:41
|
> > > I think it would be a good idea to add unit tests to all the library > > > functions using OUnit: > > > http://home.wanadoo.nl/maas/ocaml/ > > > > > > I think unit tests are *less* necessary in Ocaml due to our wonderful type > > > system, but they're still a good idea. > > > > I agree, but I'm not familiar enough with others langages Unit tests library > > to see if this implementation is a convenient way of doing unit tests. Maybe > > we could write a library mixing unit and benchmarks tests ? ( something > > similar to Doug Bagley's one ). Theses two should stick well together. > > > > My preference would be to be able to compile the library *without* the > unit testing code in it. This will require some thought, and probably > some makefile hacking. Why Brian ? I mean, having a big ExtLib.cma is not a problem, since we're using different modules, the ocaml linker will find itself which ones to choose. Of course, you also need the CMIs since there is currently no way of packaging several CMIs. But compilation speed shouldn't be a problem. BTW, I'm planning to add soon XmlLight to the ExtLib , and I think the best is to link XmlLight modules with others, even if we have separate compilation. About compilation and makefiles issues, I'm also planning to add some options to ocamake so you can write some kind of Ocamakefile and you don't need make any more (with all the problems of calling RM or DEL for cleaning, / against \ in paths , and so on...). Then the ExtLib will be simply compiled (installed ?) by running "ocamake" in its directory. Nicolas Cannasse |
From: Nicolas C. <war...@fr...> - 2003-06-04 02:38:39
|
Hi list, After the recent talks about enum features, I have made a major update. First, I started to think about the "clone" implementation : it was actually very complicated and pretty inefficient : in fact, cloning an enumeration on a list or on an array should be O(1) since we only need to get a different pointer on the list of copy the current index for an array. Clone was trying his best by using a shared list as cache between the clone and the original, but this was costly in the end, and actually doing several clones was adding more and more overload. So I decided to add clone:(unit -> 'a t) as argument for Enum.make function. Like this, we can have an efficient clone when available. Enum.from is then having a default clone which is calling "force" and the clone() again. Since force is building the list of elements, he can provide an O(1) clone function. Now that the clone issues has been resolved, I have been adding peek (the real one, doesn't fetch any data) and "empty : 'a t -> bool". In order to satisfy people who are interested in count() cost, I have been adding a boolean flag "fast_count" that give an hint about count cost. So we have for exemple : let empty t = if fast_count t then t.count() = 0 else peek t = None Please note that fast_count doesn't mean O(1) , since it can depends on the underlying implementation of the count function given to make as parameter. After that, I've been updated the *.enum functions of DynArray , ExtList and ExtString to provide a good clone implementation. I also improved ExtList.enum count() function by using a little hack so List.length will be called only once. Now my next though is about having "clone" being an optional parameter in Enum.make. Actually most of the time when you create a data structure with Enum.make, providing a next() and count() function, implementing clone() can be done in an efficient way, but is requiring a little thinking and some recursion, which make Enum.make a bit more (too much?) difficult to use for the average user. Then clone should perhaps be implemented as "inneficient by default" an the user would care providing an efficient primitive only if he's actually using it. Nicolas Cannasse |
From: Alan P. <ap...@re...> - 2003-06-03 02:10:18
|
In article <Pin...@ea...>, Brian Hurt wrote: > > That depends upon the program. I'd be most worried about 10+ maps on > 10_000_000 elements myself. With non-trivial map functions. As that's > where Enum really starts to shine- as you do wavefront processing. And > where tail recursion stops being an optimization and starts being required > for correctness. This is the problem with microbenchmarks. > > I'll see if I can't whip something up. Yes, real-world applications are better than benchmarks. I suggest, however, that benchmarks are better than speculation. |
From: Nicolas C. <war...@fr...> - 2003-06-03 01:21:43
|
> > Thanks, I get the same results on the windows MSVC version of OCaml. The > > None/Some approach is really slow. > > (BTW, I resolved Brian problem with having Enum exported next different from > > imported one by renaming exported "next" to "peek") > > Nope. Just moved it around. "peek" to me doesn't have side effects. > I'll think on names for a bit. True. I was thinking that the peek function of Stream was actualy removing it but after re-reading the documentation, it doesn't. What about "get" then ? Nicolas |
From: Brian H. <bri...@ql...> - 2003-06-02 22:24:11
|
On Mon, 2 Jun 2003, Nicolas Cannasse wrote: > > I think it would be a good idea to add unit tests to all the library > > functions using OUnit: > > http://home.wanadoo.nl/maas/ocaml/ > > > > I think unit tests are *less* necessary in Ocaml due to our wonderful type > > system, but they're still a good idea. > > I agree, but I'm not familiar enough with others langages Unit tests library > to see if this implementation is a convenient way of doing unit tests. Maybe > we could write a library mixing unit and benchmarks tests ? ( something > similar to Doug Bagley's one ). Theses two should stick well together. > My preference would be to be able to compile the library *without* the unit testing code in it. This will require some thought, and probably some makefile hacking. Brian |
From: Brian H. <bri...@ql...> - 2003-06-02 22:23:41
|
On Mon, 2 Jun 2003, Nicolas Cannasse wrote: > Thanks, I get the same results on the windows MSVC version of OCaml. The > None/Some approach is really slow. > (BTW, I resolved Brian problem with having Enum exported next different from > imported one by renaming exported "next" to "peek") Nope. Just moved it around. "peek" to me doesn't have side effects. I'll think on names for a bit. > > Please notice that you're here testing for the best case, where count is > O(1) , but sometimes it can be O(N) or even worse since sometimes we need to > create an intermediate list and do a lot of I/O (reading a file for > example). We can always implement an O(1) has_next given an O(1) next. This is only an advantage if we avoid creating an intermediate data structure, however. My personal opinion is that count should be able to return "I don't know"/"to find that out requires the creation of an intermediate datastructure". Assuming count returns int option, this would allow for code like: let rec use_enum e = match Enum.count e with Some c -> (* we have a good count *) ... | None -> (* explicitly create the intermediate datastructure *) use_enum (List.enum (List.of_enum e)) Alternatively, the user code might have more efficient ways of dealing with unknown length enumerations. For example, Dynarray.append_enum might look something like: let append_enum arr e = match Enum.count e with Some c -> (* Normal code, that prereserves c elements in the array, so the data is only copied once *) | None -> (* Just repeatedly call add *) Enum.iter (Dynarray.add arr) e > Interesting. In the first time, I would say this is a good idea, since with > your benchs we can see that there is little improvement when iterating using > a O(1) counter against using an exception . But how can we tell that > e.count() is fast ? We can surely tells when it is slow : Enum.from could > put a boolean to true saying that the "worst count" will be called. But what > about Enum.make ? For example , ExtList.enum has an O(N) count, since we > don't want to count the elements we creating an enum ( this is the lazy > approach : do it only when we need it ). We only call the O(N) count when we're about to do an O(N) function anyways (we never need to call count when doing a map). I see three possibilities here (assuming we're calling it on a list): 1) The list is short enough to fit into cache and is already in cache. This seems to be what we're benchmarking most of the time- in which case calling List.length will be very cheap. It will probably be about the same cost as N has_next calls, maybe even less. 2) The list is short enough to fit into cache, and is not already (entirely) in cache. In which case, the main cost (unless we're doing a very expensive function on each element) will be the cost of loading the list into cache. Remember that on modern CPUs, a cache miss can be 300+ clock cycles. 3) The list is too long to fit into cache. Here, has_next really shines. As if we use count, we have to load the entire list into cache twice. hash_next, however, will have the effect of simply loading the next element into cache (which we were going to have to do anyways). Case #3 is when performance matters the most- but paradoxically, it's when the various approachs we are discussing change performance the least. The cost is dominated by the memory latency, or the user functions. > When is the O(N) count faster than the exception catching ? I would say > "depends on the size of the list"... but we don't know the size :-) So if we > can classify "bad" counts we can't do it for "good" counts, and this > optimization will not work... No matter which way we pick to do things, I can probably come up with a scenario where "it'd be faster to do it this other way". I'd rate the performance of all the approaches as "acceptable" and go on. > > > So I think the question is: is it worth a 10-20% performance hit in > > the library to unify the stream and enum types? Myself: yes. If I'm worrying about every single clock cycle, I'm not writting in Ocaml, I'm writting in assembler. If I'm writting in Ocaml I'm looking for power and expressivity (this is, in fact, the normal case) and giving up much more than 10-20% in performance. > > My question is : when do we need speed ? For small lists when few operations > or for big lists with lots of operations ? With 5 maps on a 1000 elements > list, I got 200% speed using exceptions instead of None/Some. That depends upon the program. I'd be most worried about 10+ maps on 10_000_000 elements myself. With non-trivial map functions. As that's where Enum really starts to shine- as you do wavefront processing. And where tail recursion stops being an optimization and starts being required for correctness. This is the problem with microbenchmarks. I'll see if I can't whip something up. Brian |
From: Brian H. <bri...@ql...> - 2003-06-02 20:57:34
|
On Mon, 2 Jun 2003, Nicolas Cannasse wrote: > But we can't rewrite fold/iter using has_next because, has you mentionned, > the default has_next will test count > 0 and the default count will force > the enumeration of the data structure, resulting the building of an > intermediate list, which is what we actually don't want to do. Actually, we could have the default do a lookahead, like: let default_has_next next count = let lookahead = ref None in let has_next' () = match lookahead with Some x -> true | None -> try lookahead := Some (next ()); true with No_more_elements -> false and next' () = match lookahead with Some x -> lookahead := None; x | None -> next() and count' () -> match lookahead with Some _ -> 1 + (count ()) | None -> count() in Enum.make ~has_next:has_next' ~next:next' ~count:count' And then override it when count is O(1). Note that a map always gets the count and has_next functions, so we only ever allocate a single Some block (note that next doesn't allocate a some block). > Of course we could add the has_next function as a parameter to make but : > - when we call make, we usually have a good count , so testing count > 0 is > not so costly I agree that this is the default case- a fast count. Which is why I was making it the default case. On the other hand, handling the case when count is expensive is more complex code, so maybe the library should be supplying that. > - as the next function is returning a 'a option (and I still think that's > the best way of doing it) the user will not need an has_next to loop. No. I was thinking of doing it for the library. Another question I have: how often are we going to get "deep maps" (i.e. more than 1 layer deep)? I expect the vast majority of cases are going to be either: 1) no maps at all (converting from one data structure to another), or 2) exactly one map (doing a single conversion). So that, on average, we have less than 1 map we're going through. Now, I want to encourage deep maps, as it's a form of wavefront processing and thus more efficient in general. But I don't think they will be common. In optimizing, you optimize for the common case. Brian |