From: Alan P. <ap...@re...> - 2003-06-02 09:15:00

In article <000d01c328df$e687e6b0$2713f9ca@WARP>, Nicolas Cannasse wrote:

>> Another possibility would be to be clever about having fold and iter
>> use (e.count ()) when it is fast, and use exception catching when
>> (e.count ()) is slow, without actually having separate types.
>
> Interesting. At first I would say this is a good idea, since your
> benchmarks show there is little improvement when iterating with an O(1)
> counter compared to using an exception. But how can we tell that
> e.count() is fast? We can certainly tell when it is slow: Enum.from
> could set a boolean saying that the "worst count" will be called. But
> what about Enum.make? For example, ExtList.enum has an O(N) count,
> since we don't want to count the elements while creating the enum
> (this is the lazy approach: do it only when we need it).
> When is the O(N) count faster than catching the exception? I would say
> "it depends on the size of the list"... but we don't know the size :-)
> So while we can classify "bad" counts, we can't classify "good" counts,
> and this optimization will not work...

It doesn't have to be perfect to be an improvement. Enums from arrays or
dynarrays, for instance, have fast counts. Enum.map would inherit the
fast_count flag, whereas most other enum transforms (e.g. Enum.filter)
would clear it.

This could also be done by keeping the stream and enum types separate,
with the stream type having no count. Either way gives the performance
benefit; it's really a matter of making a nice library API for people.
Doing it with types, there would be:

  Enum.map:      ('a -> 'b) -> 'a enum -> 'b enum
  Stream.map:    ('a -> 'b) -> 'a stream -> 'b stream
  Enum.filter:   ('a -> bool) -> 'a enum -> 'a stream   (* note: returns a stream *)
  Stream.filter: ('a -> bool) -> 'a stream -> 'a stream
  Enum.iter:     ('a -> unit) -> 'a enum -> unit
  Stream.iter:   ('a -> unit) -> 'a stream -> unit

where Enum.iter and Enum.filter use count, and Stream.iter and
Stream.filter use a try-catch.

  List.stream:       'a list -> 'a stream
  List.from_stream:  'a stream -> 'a list
  List.from_enum:    'a enum -> 'a list
  Array.enum:        'a array -> 'a enum
  Array.from_stream: 'a stream -> 'a array
  Array.from_enum:   'a enum -> 'a array

There is no List.enum, since you can't count a list cheaply.
List.from_stream does a try-catch, and List.from_enum uses count. Of
course, reading lines from a file handle would produce a stream rather
than an enum.

There could be a function

  Enum.from_stream: 'a stream -> 'a enum

with a warning on it that it may be rather expensive, since it has to do
the Enum.force magic. Array.from_stream could be implemented as:

  let from_stream s = from_enum (Enum.from_stream s);;

I don't see how a Stream.from_enum function would be useful.

>> So I think the question is: is it worth a 10-20% performance hit in
>> the library to unify the stream and enum types?
>
> Which unification? You mean, using a None/Some next function?

I am not at all suggesting using options for the end-of-stream
notification. Certainly I think that using exceptions is a much better
way to go. The "unification" I was referring to was the decision about
whether to have distinct enum and stream types. The "performance hit" I
was referring to was the cost of using exceptions to iterate over an
enum when you could have used the count.
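
A minimal sketch of the fast_count idea discussed above; the record
layout, field names, and fold strategy here are illustrative assumptions,
not the actual ExtLib code:

exception No_more_elements

type 'a t = {
  mutable next : unit -> 'a;      (* raises No_more_elements when exhausted *)
  mutable count : unit -> int;
  mutable fast_count : bool;      (* true when count () is known to be O(1) *)
}

(* fold picks the cheaper strategy: drive the loop off the counter when it
   is known to be fast, otherwise fall back to catching No_more_elements *)
let fold f init e =
  if e.fast_count then begin
    let rec loop n acc =
      if n <= 0 then acc else loop (n - 1) (f (e.next ()) acc)
    in
    loop (e.count ()) init
  end else begin
    let acc = ref init in
    try
      while true do acc := f (e.next ()) !acc done;
      !acc
    with No_more_elements -> !acc
  end

(* map preserves the flag; filter cannot know its output size, so it clears it *)
let map f e =
  { next = (fun () -> f (e.next ()));
    count = e.count;
    fast_count = e.fast_count }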
From: Nicolas C. <war...@fr...> - 2003-06-02 08:22:26

> I wrote a little program to benchmark the three approaches I've seen
> discussed on the list for implementing streams/enums:
>
>   1. (e.next ()) returns an option
>   2. for-loop style, relying on the count (not suitable for
>      streams, including filtered enums)
>   3. exception-catching style (the current implementation)
>
> It tests folds over maps of enums of constant ints.
[...]

Thanks, I get the same results on the Windows MSVC version of OCaml. The
None/Some approach is really slow. (BTW, I resolved Brian's problem of
the exported Enum next differing from the imported one by renaming the
exported "next" to "peek".)

Please note that you are testing the best case here, where count is
O(1), but sometimes it can be O(N) or even worse, since sometimes we
need to create an intermediate list and do a lot of I/O (reading a file,
for example). That leads to your next question...

> Another possibility would be to be clever about having fold and iter
> use (e.count ()) when it is fast, and use exception catching when
> (e.count ()) is slow, without actually having separate types.

Interesting. At first I would say this is a good idea, since your
benchmarks show there is little improvement when iterating with an O(1)
counter compared to using an exception. But how can we tell that
e.count() is fast? We can certainly tell when it is slow: Enum.from
could set a boolean saying that the "worst count" will be called. But
what about Enum.make? For example, ExtList.enum has an O(N) count,
since we don't want to count the elements while creating the enum
(this is the lazy approach: do it only when we need it).

When is the O(N) count faster than catching the exception? I would say
"it depends on the size of the list"... but we don't know the size :-)
So while we can classify "bad" counts, we can't classify "good" counts,
and this optimization will not work...

> So I think the question is: is it worth a 10-20% performance hit in
> the library to unify the stream and enum types?

Which unification? You mean, using a None/Some next function?

As expected, the smaller the list and the fewer map operations applied
to it, the better None/Some performs compared to exceptions (note that
you need 0 maps and a list of fewer than 50 elements for None/Some to be
faster than exceptions). My question is: when do we need speed? For
small lists with few operations, or for big lists with lots of
operations? With 5 maps on a 1000-element list, I got 200% of the speed
using exceptions instead of None/Some.

Nicolas Cannasse
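
For reference, a sketch of why a count can be arbitrarily expensive: with
nothing but a next function (as with Enum.from), the only way to learn how
many elements remain is to drain them first, which is roughly what the
Enum.force machinery mentioned elsewhere in this thread has to do. The
helper name here is hypothetical:

exception No_more_elements

(* worst-case count: drain the enumeration into a list just to measure it.
   After this the original producer is exhausted; the real force also has
   to keep the drained elements around and serve them back out. *)
let count_by_forcing next =
  let elements = ref [] in
  (try
     while true do elements := next () :: !elements done
   with No_more_elements -> ());
  List.length !elements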
From: Alan P. <ap...@re...> - 2003-06-02 06:58:03

I wrote a little program to benchmark the three approaches I've seen
discussed on the list for implementing streams/enums:

  1. (e.next ()) returns an option
  2. for-loop style, relying on the count (not suitable for
     streams, including filtered enums)
  3. exception-catching style (the current implementation)

It tests folds over maps of enums of constant ints. Runs of it (ocamlopt
compiled, of course) demonstrate to me that:

  a) The option-based approach is slowest, especially when doing lots of
     nested maps.
  b) The for-loop approach is fastest.
  c) The exception-based approach is in the middle, performance-wise. In
     the pathological case (only one map, with only five elements in the
     enum), this approach takes 1.4 times as much CPU as the for-loop
     approach. With three nested maps, and 500 elements in the enum, it
     requires 1.13 times the CPU.

So I think the question is: is it worth a 10-20% performance hit in the
library to unify the stream and enum types?

Another possibility would be to be clever about having fold and iter use
(e.count ()) when it is fast, and use exception catching when
(e.count ()) is slow, without actually having separate types.

Below is the program listing. Please let me know if I screwed up
somewhere.

Alan

------------------------------------------------------------
(*
 * bench.ml
 *
 * Author: Alan Post <ap...@re...>
 * Usage: test performance of various implementations of Enum
 *
 * 1: option-based (also suitable for streams)
 * 2: count-based (not suitable for streams)
 * 3: exception-based (also suitable for streams)
 *)

type 'a enum_1 = { mutable next: unit -> 'a option; mutable count: unit -> int };;
type 'a enum_2 = { mutable next: unit -> 'a; mutable count: unit -> int };;

exception No_more_elements;;

let rec fold_1 f x e =
  match e.next () with
    None -> x
  | Some v -> fold_1 f (f v x) e

and map_1 f e =
  { count = e.count;
    next = (fun () ->
      match e.next () with
        None -> None
      | Some v -> Some (f v))}

and init_1 value num =
  let rnum = ref num in
  { count = (fun () -> !rnum);
    next = (fun () ->
      match !rnum with
        0 -> None
      | _ -> Some (decr rnum; value))}

and fold_2 f x e =
  let rec loop c accu =
    match c with
      0 -> accu
    | _ -> loop (c-1) (f (e.next ()) accu)
  in
  loop (e.count ()) x

and map_2 f e =
  { count = e.count;
    next = (fun () -> f (e.next ()))}

and init_2 value num =
  let rnum = ref num in
  { count = (fun () -> !rnum);
    next = (fun () ->
      match !rnum with
        0 -> raise No_more_elements
      | _ -> (decr rnum; value))}

and fold_3 f x e =
  let result = ref x in
  let rec loop () =
    result := f (e.next ()) !result;
    loop ()
  in
  try loop () with No_more_elements -> !result
in
let map_3 = map_2
and init_3 = init_2 in
let nest f n =
  let rec loop f g n =
    match n with
      0 -> g
    | _ -> loop f (fun e -> f (g e)) (n-1)
  in
  loop f f n
and which = Sys.argv.(1)
and items = (int_of_string Sys.argv.(2))
and nests = (int_of_string Sys.argv.(3))
and iters = (int_of_string Sys.argv.(4)) in
let once =
  match which with
    "1" -> (fun () -> ignore (fold_1 (+) 0 ((nest (map_1 (( * ) 2)) nests) (init_1 1 items))))
  | "2" -> (fun () -> ignore (fold_2 (+) 0 ((nest (map_2 (( * ) 2)) nests) (init_2 1 items))))
  | "3" -> (fun () -> ignore (fold_3 (+) 0 ((nest (map_3 (( * ) 2)) nests) (init_3 1 items))))
  | _ -> raise (Invalid_argument which)
in
let rec loop n =
  if n == 0 then ()
  else (once (); loop (n-1))
in
loop iters;
print_string "done\n";;
From: Nicolas C. <war...@fr...> - 2003-06-02 03:09:28

> I think it would be a good idea to add unit tests to all the library
> functions using OUnit:
> http://home.wanadoo.nl/maas/ocaml/
>
> I think unit tests are *less* necessary in Ocaml due to our wonderful
> type system, but they're still a good idea.

I agree, but I'm not familiar enough with other languages' unit test
libraries to tell whether this implementation is a convenient way of
doing unit tests. Maybe we could write a library mixing unit tests and
benchmark tests? (Something similar to Doug Bagley's.) These two should
fit well together.

Nicolas Cannasse
From: Nicolas C. <war...@fr...> - 2003-06-02 02:52:22

> Sorry, forgot to mention: the default has_next calls count() and
> compares it against 0. Which means only in places where count() would
> be significantly more expensive than has_next() do we need to (or can)
> supply a separate has_next().

Thanks Brian. I had already had some thoughts about adding this has_next
function, which can be useful. But we can't rewrite fold/iter using
has_next because, as you mentioned, the default has_next will test
count > 0, and the default count will force the enumeration of the data
structure, resulting in the building of an intermediate list, which is
exactly what we don't want to do.

Seen like this, has_next should only be used on the user side, and could
be implemented as:

  let has_next t = t.count() > 0

with some documentation saying that it can be pretty inefficient (on the
first call only). Of course we could add the has_next function as a
parameter to make, but:

- when we call make, we usually have a good count, so testing count > 0
  is not so costly;
- when we don't call make (we call from instead) we don't have a
  count... I can see a few cases where we could peek at the data
  (without actually reading it) to see if some is available, but I'm not
  sure they're worth it;
- as the next function returns an 'a option (and I still think that's
  the best way of doing it), the user will not need a has_next to loop.

Nicolas Cannasse
From: Brian H. <bri...@ql...> - 2003-06-01 19:49:12

On Thu, 29 May 2003, Alan Post wrote:

> > let iter f e =
> >   let c = e.count() in
> >   for i = 1 to c do
> >     f (e.next())
> >   done
> >
> This undoes the lazy nature of Enum.filter, Enum.filter_map, and even
> more seriously, input_enum and input_char_enum. Basically, anywhere
> the code uses Enum.from will be a problem when relying on Enum.count.

This doesn't change the lazy nature of map; map remains the same.
Calling count on the result of Enum.from does force the entire
enumeration into memory, granted. Which is why I thought of the
has_next() idea. Which means we only ever need to be one element ahead.

Personally, I'd also like a way for count to be able to say "I don't
know how many elements are in the enumeration". But I got shot down on
that last time around.

Brian
From: Alan P. <ap...@re...> - 2003-05-31 21:43:06

In article <Pin...@ea...>, Brian Hurt wrote:
>
> Sorry, I just had an epiphany on this issue. With enums we always have
> a good count, right? I mean, we're not implementing my "-1 means we
> don't know how many elements are in the enumeration".
>
> The enum code should use this and never cause an exception to be
> thrown. Therefore it doesn't need to catch the exceptions either.
>
> let iter f e =
>   let c = e.count() in
>   for i = 1 to c do
>     f (e.next())
>   done
>
> let fold f x e =
>   let rec loop c accu =
>     if c <= 0 then accu
>     else loop (c-1) (f (e.next ()) accu)
>   in
>   loop (e.count ()) x

This undoes the lazy nature of Enum.filter, Enum.filter_map, and even
more seriously, input_enum and input_char_enum. Basically, anywhere the
code uses Enum.from will be a problem when relying on Enum.count.
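
To make the laziness point concrete, here is roughly what a filter built
on Enum.from looks like: it cannot know how many elements will pass the
predicate without consuming its input, so any count-driven iteration has
to force it first. This is a sketch under the next/count conventions used
in this thread, not the library's exact code:

(* lazily skip elements that fail the predicate; the resulting enum has no
   cheap count, because counting it means running the predicate over
   everything that remains *)
let filter p e =
  let rec next () =
    let x = Enum.next e in
    if p x then x else next ()
  in
  Enum.from next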
From: Brian H. <bri...@ql...> - 2003-05-30 21:39:59

Sorry, forgot to mention: the default has_next calls count() and
compares it against 0. Which means only in places where count() would be
significantly more expensive than has_next() do we need to (or can)
supply a separate has_next().

Brian
From: Brian H. <bri...@ql...> - 2003-05-30 21:33:53

Count is called, generally, for two reasons:

1) To pre-reserve space for all the elements in the enumeration. This is
   how DynArray uses it.

2) To see if there are any more elements left in the enumeration
   (count > 0).

The second case is the interesting one: it can always be O(1). And it's
the information we most often need. So why not provide it directly?
Iter and fold then start looking like:

let iter f t =
  let rec loop () =
    if t.has_next() then begin
      f (t.next());
      loop ()
    end
  in
  loop ()

let fold f init t =
  let rec loop accu =
    if t.has_next() then loop (f (t.next()) accu)
    else accu
  in
  loop init

In ExtList, we'd have:

let enum l =
  let lr = ref l in
  Enum.make
    ~next:(fun () ->
      match !lr with
      | [] -> raise Enum.No_more_elements
      | h :: t ->
          lr := t;
          h)
    ~count:(fun () ->
      length !lr)
    ~has_next:(fun () ->
      match !lr with
      | [] -> false
      | _ -> true)

This gets rid of the annoying O(N) count cost.

Brian
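
With a has_next like this in place, user code can drain an enum without
touching exceptions or count at all. A small hypothetical example, using
Enum.has_next and Enum.next as proposed above (not an existing API):

(* collect an enum into a list using only has_next/next *)
let to_list e =
  let rec loop acc =
    if Enum.has_next e then loop (Enum.next e :: acc)
    else List.rev acc
  in
  loop []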
From: Brian H. <bri...@ql...> - 2003-05-30 21:16:01

I think it would be a good idea to add unit tests to all the library
functions using OUnit:

http://home.wanadoo.nl/maas/ocaml/

I think unit tests are *less* necessary in Ocaml due to our wonderful
type system, but they're still a good idea.

Brian
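
A minimal sketch of what such a test might look like with OUnit's test
combinators; the List.enum / List.of_enum conversions follow the names
used elsewhere in this thread, and the whole thing is illustrative rather
than an actual test from the library:

open OUnit

(* mapping over an enum of a list should behave like List.map *)
let test_enum_map () =
  assert_equal [2; 4; 6]
    (List.of_enum (Enum.map (( * ) 2) (List.enum [1; 2; 3])))

let suite =
  "ExtLib" >::: [ "enum_map" >:: test_enum_map ]

let _ = run_test_tt suite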
From: Brian H. <bri...@ql...> - 2003-05-30 20:22:27

On Fri, 30 May 2003, Diego Olivier Fernandez Pons wrote:

> Bonjour,
>
> > Because I have this dream of someday going through and changing the
> > node definitions so that instead of:
>
> Of course I understand... but having a data structure that might be
> slower but exists and works is better than a fast data structure which
> does not exist yet or is buggy.

Agreed. But the bug occurred because I misremembered how red-black trees
worked, and didn't work through the logic enough (and obviously didn't
test enough). Not because I picked too complex a data structure.

Brian
From: Diego O. F. P. <Die...@et...> - 2003-05-30 19:37:51

Bonjour,

> Because I have this dream of someday going through and changing the
> node definitions so that instead of:

Of course I understand... but having a data structure that might be
slower but exists and works is better than a fast data structure which
does not exist yet or is buggy.

Diego Olivier
From: Brian H. <bri...@ql...> - 2003-05-30 19:13:42

On Fri, 30 May 2003, Diego Olivier Fernandez Pons wrote:

> Why don't you try the AVL balancing scheme? As stated by Ralf Hinze, it
> does not have the property of a finite number of rebalancing operations
> in the delete process (logarithmic instead), but it should be much
> easier to implement. Ralf Hinze uses the weight-balanced scheme in his
> paper, which is quite similar.

Because I have this dream of someday going through and changing the node
definitions so that instead of:

type 'a node_t = Node of color_t * dominance_t * ...

(i.e. using a word each for color and dominance), we instead do:

type 'a node_t =
    RedLeft of ...    (* red color, left dominance *)
  | RedRight of ...   (* red color, right dominance *)
  | BlackLeft of ...
  | BlackRight of ...

Color and dominance then effectively become 1-bit values. The pattern
matching, however, then becomes *ahem* interesting. I can't simply throw
in a _ in the proper place to ignore color or dominance. I may at that
point switch away from pattern matching to branching conditionals, I
don't know. But this is the sort of thing which a library can do.

Brian
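
Fleshing out the two encodings with hypothetical payload fields (child
subtrees and a key), just to make the space versus pattern-matching
trade-off concrete; none of this is the actual PSQueue code:

type color_t = Red | Black
type dominance_t = LeftDom | RightDom

(* encoding 1: color and dominance are separate words in every node *)
type 'a node1 =
  | Leaf1
  | Node of color_t * dominance_t * 'a node1 * 'a * 'a node1

(* encoding 2: color and dominance are folded into the constructor tag,
   costing no extra words, but every match must now spell out (or
   or-pattern over) four constructors instead of wildcarding a field *)
type 'a node2 =
  | Leaf2
  | RedLeft of 'a node2 * 'a * 'a node2
  | RedRight of 'a node2 * 'a * 'a node2
  | BlackLeft of 'a node2 * 'a * 'a node2
  | BlackRight of 'a node2 * 'a * 'a node2

(* "is this node red?" in each encoding *)
let is_red1 = function Node (Red, _, _, _, _) -> true | _ -> false
let is_red2 = function RedLeft _ | RedRight _ -> true | _ -> false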
From: Diego O. F. P. <Die...@et...> - 2003-05-30 19:00:02

Bonjour,

On Fri, 30 May 2003, Brian Hurt wrote:

> Basically, what I need to do is break my single modify function into
> three different functions: one for insert (number of nodes increases),
> one for delete (number of nodes decreases), and while I'm at it one for
> update (number of nodes and structure of the tree remains the same).

Why don't you try the AVL balancing scheme? As stated by Ralf Hinze, it
does not have the property of a finite number of rebalancing operations
in the delete process (logarithmic instead), but it should be much
easier to implement. Ralf Hinze uses the weight-balanced scheme in his
paper, which is quite similar.

Diego Olivier
From: Brian H. <bri...@ql...> - 2003-05-30 18:49:56

There is a balancing problem with PSQueue on delete: deletes do not
preserve the balancing constraints and produce illegal red-black trees.
The problem was that until now I didn't know how to fix it. I think I've
figured out what needs to happen. Unfortunately, the fix is going to be
somewhat ugly. It'll happen this weekend, and I should get working code
in next week.

Basically, what I need to do is break my single modify function into
three different functions: one for insert (number of nodes increases),
one for delete (number of nodes decreases), and while I'm at it one for
update (number of nodes and structure of the tree remains the same).

Brian
From: Nicolas C. <war...@fr...> - 2003-05-30 07:33:06

Corrected the implementations of append, concat, and clone, which were
causing strange behaviors when mixed with calls to force(). The main
problem is that only force should be allowed to modify the next and
count functions after creation. I modified append and concat so they
don't, but clone needs to modify the enum passed as parameter, and needs
to store the current next and count functions when called; this was
giving buggy behavior, especially when counting elements.

The current implementation seems to resolve all the problems. I ran some
tests cloning twice and then printing/counting in different orders to
check that the results are the same.

Nicolas Cannasse
From: Nicolas C. <war...@fr...> - 2003-05-30 03:59:36

> let enum_dup c v =
>   (* returns an enumeration which is just v repeated c times *)
>   let cref = ref c in
>   let next () =
>     if !cref > 0 then begin
>       decr cref;
>       v
>     end else
>       raise No_more_elements
>   and count () = !cref in
>   Enum.make ~next:next ~count:count

Added to the CVS:

  Enum.init : int -> (int -> 'a) -> 'a t

Similar to List.init and Array.init.

Nicolas Cannasse
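
A minimal sketch of how an Enum.init with that signature could be written
in terms of the Enum.make ~next ~count usage shown in earlier messages;
this is an illustration, not necessarily the implementation that went
into CVS:

(* init n f enumerates f 0, f 1, ..., f (n-1); count stays O(1) *)
let init n f =
  let i = ref 0 in
  Enum.make
    ~next:(fun () ->
      if !i >= n then raise Enum.No_more_elements
      else begin
        let x = f !i in
        incr i;
        x
      end)
    ~count:(fun () -> n - !i)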
From: Brian H. <bri...@ql...> - 2003-05-29 18:44:30

Sorry, I just had an epiphany on this issue. With enums we always have a
good count, right? I mean, we're not implementing my "-1 means we don't
know how many elements are in the enumeration".

The enum code should use this and never cause an exception to be thrown.
Therefore it doesn't need to catch the exceptions either.

let iter f e =
  let c = e.count() in
  for i = 1 to c do
    f (e.next())
  done

let fold f x e =
  let rec loop c accu =
    if c <= 0 then accu
    else loop (c-1) (f (e.next ()) accu)
  in
  loop (e.count ()) x

Calling next when count == 0, when there are no more elements, is an
*error*. Of course it throws an exception. But the Enum code shouldn't
do that. And if next doesn't have any more elements while count > 0,
then something exceptional did happen.

The above code strikes me as being clean and easy to understand. It
solves your problem of an exception from next being "accidentally"
caught at a higher level, since enum doesn't catch exceptions. And
Enum.next has the same interface as the next passed into Enum.make.

Two comments on this:

1) On some data structures, calling count will be expensive; for
   example, on lists it's O(N).

2) Your example changes behavior:

> > example:
> >
> > let e1 = List.enum [1;2;3]
> > let e2 = List.enum []
> >
> > let combine x =
> >   (x , Enum.next e2)
> >
> > let l = Enum.of_list (Enum.map combine e1)
> >
> > => []

Here, instead of returning the empty list, the above code throws an
uncaught exception. Which is actually, I think, the correct thing to do.
What you wanted to express above is, I think:

let combine a b = a, b in
let l = List.of_enum (Enum.map2 combine e1 e2)

But this also allows you to do:

let combine x =
  try (x, Enum.next e2) with No_more_elements -> x, 0 in
let l = List.of_enum (Enum.map combine e1)

As a general note, data structures we add in the future should have fast
(O(1)) length routines, even at the cost of a little more bookkeeping.
Not a lot we can do with lists at this point, but everything else should
be fast. And in the worst case we spin through the data structure twice.
This actually encourages piling multiple maps onto a single list, and
not actually instantiating the intermediate lists. Sort of the cheap
version of lazy evaluation.

Brian
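
Enum.map2 is used above as if it already existed; a sketch of what it
might look like under the exception protocol (hypothetical, built on the
Enum.from and Enum.next names that appear elsewhere in the thread):

(* pair up two enums element by element; the result ends as soon as either
   input raises No_more_elements (note that an element may already have
   been consumed from e1 when e2 turns out to be exhausted) *)
let map2 f e1 e2 =
  Enum.from (fun () ->
    let a = Enum.next e1 in
    let b = Enum.next e2 in
    f a b)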
From: Brian H. <bri...@ql...> - 2003-05-29 17:53:51

On Thu, 29 May 2003, William D. Neumann wrote:

> On Thu, 29 May 2003, Brian Hurt wrote:
>
> Yeah, exposing phony nulls was one of the philosophical issues I was
> wrestling with as I added my expansion hack. At the time I decided that
> the ability to do this outweighed the potential to do something silly
> (and I promised myself to always look out for null values; the null I'm
> using in this case would never be a valid value, so I can ignore them
> when I run across them).

One of the reasons I didn't want to use options as the array elements:
for the cases where they weren't needed, there really was a safe null.

> And you're right, expanding the array the way I described it won't work
> like I wanted; you have to do something along the lines of:
>
> let null = "";;
> let j = DynArray.make 2 null;;
> DynArray.add j "blah";;
> DynArray.add j "flah";;
>
> (* Now I want to add something to slot 9 *)
>
> ignore (DynArray.append j (DynArray.init 8 8 null (fun x -> null)));;
> DynArray.set j 9 "flee";;

Given:

let enum_dup c v =
  (* returns an enumeration which is just v repeated c times *)
  let cref = ref c in
  let next () =
    if !cref > 0 then begin
      decr cref;
      v
    end else
      raise No_more_elements
  and count () = !cref in
  Enum.make ~next:next ~count:count

you don't actually need to create the intermediate array. You can just
do:

DynArray.set_enum j (DynArray.length j) (enum_dup 8 "");;
DynArray.set j 9 "flee";;

Mental note to self: need to add DynArray.append_enum.

> Sigh... why does doing things the right way always interfere with doing
> what I think I want...

Because what you think you wanted to do is not what you really wanted to
do. Perhaps instead of a dynamic array, you wanted a hash table mapping
int -> 'a?

Brian
From: William D. N. <wne...@cs...> - 2003-05-29 17:07:43

On Thu, 29 May 2003, Brian Hurt wrote:

> I don't have a problem with a pre-expand, but adding elements out of
> order doesn't work (at least the way I think they work). If the
> dynarray has n elements, the elements are numbered 0..n-1 inclusive.
>
> The solution is to use an 'a option DynArray.t. Your null element then
> becomes None. I dislike exposing the null elements as "valid entries";
> to my mind, they're just filler. This is especially important if you
> consider an int DynArray.t: is that 0 a real, important 0, or a null
> element? So I would fake up an enum that returns n Nones and append it
> to the dynarray to increase the size. DynArray.set can then change the
> Nones to Some x.

Yeah, exposing phony nulls was one of the philosophical issues I was
wrestling with as I added my expansion hack. At the time I decided that
the ability to do this outweighed the potential to do something silly
(and I promised myself to always look out for null values; the null I'm
using in this case would never be a valid value, so I can ignore them
when I run across them).

And you're right, expanding the array the way I described it won't work
like I wanted; you have to do something along the lines of:

let null = "";;
let j = DynArray.make 2 null;;
DynArray.add j "blah";;
DynArray.add j "flah";;

(* Now I want to add something to slot 9 *)

ignore (DynArray.append j (DynArray.init 8 8 null (fun x -> null)));;
DynArray.set j 9 "flee";;

where the length field is artificially pushed out to the end of the
array, so stuff can be added out of order. But this does expose all
those not-really-null nulls in spots 2 - 8. Now that I think about it,
this might be best left out. However, if it is added, it should be
called unsafe_expand or something similar so people realize that they're
willfully doing something "incorrect".

Sigh... why does doing things the right way always interfere with doing
what I think I want...

William D. Neumann
From: Brian H. <bri...@ql...> - 2003-05-29 16:34:28

On Thu, 29 May 2003, William D. Neumann wrote:

> I have a situation where I need to expand the size of the array to fit
> the data I will be adding, but I might be adding the elements out of
> order, so I can't just call DynArray.add

I don't have a problem with a pre-expand, but adding elements out of
order doesn't work (at least the way I think they work). If the dynarray
has n elements, the elements are numbered 0..n-1 inclusive.

The solution is to use an 'a option DynArray.t. Your null element then
becomes None. I dislike exposing the null elements as "valid entries";
to my mind, they're just filler. This is especially important if you
consider an int DynArray.t: is that 0 a real, important 0, or a null
element? So I would fake up an enum that returns n Nones and append it
to the dynarray to increase the size. DynArray.set can then change the
Nones to Some x.

We might want to consider adding a function in Enum to create an enum
that returns so many copies of a given value, to make this easier.

Brian
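
A small sketch of the option-based approach being suggested, reusing the
DynArray.make/add/set calls exactly as they appear in this thread (the
signatures shown here are the ones used in these messages and may not
match the final library):

(* grow with None as filler, then fill slots in whatever order they arrive *)
let j : string option DynArray.t = DynArray.make 2 None

let () =
  DynArray.add j (Some "blah");
  DynArray.add j (Some "flah");
  (* pad out to slot 9 with an explicit "not set yet" marker *)
  for _i = DynArray.length j to 9 do
    DynArray.add j None
  done;
  DynArray.set j 9 (Some "flee")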
From: William D. N. <wne...@cs...> - 2003-05-29 16:18:53

I currently need some of the DynArrays that I'm working with to have the
ability to expand to a given size without adding a series of elements
onto the end of the array (I have a situation where I need to expand the
size of the array to fit the data I will be adding, but I might be
adding the elements out of order, so I can't just call DynArray.add).

Now, I can easily do this externally with a call to make and then blit
(or append if the DynArray is a non-mutable record field or class
member), or I could add it to my local copy of the DynArray module
itself via something like:

let expand darr new_length =
  if new_length > darr.length then
    changelength darr (new_length - darr.length)

and adding the appropriate blurb to the interface. I'm just wondering if
anyone else thinks this would be useful enough to add to the DynArray
module proper... Discuss.

William D. Neumann
From: Nicolas C. <war...@fr...> - 2003-05-29 02:11:00

> Enum.fold has to set up an exception handler every time it calls next.
> It only catches an exception once, but it has to set up the stack frame
> every single time. And that costs.

Your benchmark is just comparing one try...catch with one None/Some
match. When you chain several O(1) functional operations, you pay for
one try...catch but N None/Some matches. Still, following your results I
modified the implementation of fold so that it does not set up a stack
frame every time:

let fold f init t =
  let acc = ref init in
  let rec loop() =
    acc := f (t.next()) !acc;
    loop()
  in
  try loop() with No_more_elements -> !acc

It should now be very efficient.

> In either case, interface harmonization is a more important issue IMHO.
> The exported definition of next should not be any more complicated
> than:
> let next t = t.next ()

I have already explained this several times: if we let the
No_more_elements exception escape the API, then an iteration higher up
the call stack may catch it and break its own loop. So if the user fails
to catch it properly when calling next(), the result is very error-prone
behavior, since they won't be notified that the exception was raised.

Example:

let e1 = List.enum [1;2;3]
let e2 = List.enum []

let combine x =
  (x , Enum.next e2)

let l = Enum.of_list (Enum.map combine e1)

=> []

So we have to keep the current next implementation.

Nicolas Cannasse
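
A sketch of the kind of exported wrapper being argued for here, where the
exception stays internal to the library and user code sees an option
instead, so it can never be swallowed by an enclosing iteration by
accident (the function name is hypothetical, and the code is written as
it would appear inside the Enum module, where the record fields and the
exception are visible):

(* exported accessor: convert the internal No_more_elements signal into an
   option at the API boundary *)
let next_opt e =
  try Some (e.next ()) with No_more_elements -> None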
From: Brian H. <bri...@ql...> - 2003-05-28 16:12:28

On Wed, 28 May 2003, Nicolas Cannasse wrote:

> > > I don't agree here :
> > > - BUT the internal next is most of the time "concatenating" the
> > > next functions in a purely functional way, so for example after two
> > > Enum.maps and doing an Enum.iter, you'll have to check the None/Some
> > > three times !
> >
> > For Enum.iter this is correct. However, for doing just about anything
> > else you need to try/catch the exception the same number of times you
> > need to check for None/Some. And I would expect map and fold to be at
> > least as important as iter.
>
> I don't think so... here's the current Enum.map implementation :

Oh. You're right; I forgot Enum.map was O(1). I was mainly thinking of
fold, and tossed map in as well without thinking about it.

> In this sequence, there is only 1 catch for the exception if you're
> using Enum.iter, or 2 if you're using Enum.fold, and it does not depend
> on the number of maps / filters you have done ! Here is what map would
> look like with next returning 'a option :

Enum.fold has to set up an exception handler every time it calls next.
It only catches an exception once, but it has to set up the stack frame
every single time. And that costs.

> Exceptions are really cheap in ocaml, better than pattern matching
> since the OCaml stack unwinding is very efficient.

Attached are a pair of simple benchmarks I whipped up. Basically, they
compare the relative costs of raising an exception versus returning
None/Some. I checked the output, and compiling with
"ocamlopt -o test test1.ml" (or 2, as the case may be) doesn't
"overoptimize" the test.

The file called test1.ml uses exceptions, and runs in 2.18 seconds on my
system. The file called test2.ml uses Some/None, and runs in 1.3
seconds.

Exceptions are fast. Just not as fast as allocating a word of memory and
taking a branch. Playing around with the code a little bit leads me to
believe that 0.7 seconds are program overhead, so as an approximation,
catching an exception is 2.5x as expensive as Some/None. A rough
approximation, but it's actually a lot closer than I would have guessed.

In either case, interface harmonization is a more important issue IMHO.
The exported definition of next should not be any more complicated than:

let next t = t.next ()

Brian
From: Nicolas C. <war...@fr...> - 2003-05-28 09:06:56

> > I don't agree here :
> > - BUT the internal next is most of the time "concatenating" the next
> > functions in a purely functional way, so for example after two
> > Enum.maps and doing an Enum.iter, you'll have to check the None/Some
> > three times !
>
> For Enum.iter this is correct. However, for doing just about anything
> else you need to try/catch the exception the same number of times you
> need to check for None/Some. And I would expect map and fold to be at
> least as important as iter.

I don't think so... here's the current Enum.map implementation :

let map f t =
  {
    count = t.count;
    next = (fun () -> f (t.next()));
  }

Each time, we only catch the exception once at the very end, since map
doesn't check for it and fold is only used to "finalize" an Enum. Common
usage of enums is the following :

let e = DataStructure.enum d in
let e = Enum.map f e in
... several maps / filters ...
let d = DataStructure.of_enum e   (* this one will call Enum.fold or Enum.iter *)

In this sequence, there is only 1 catch for the exception if you're
using Enum.iter, or 2 if you're using Enum.fold, and it does not depend
on the number of maps / filters you have done ! Here is what map would
look like with next returning 'a option :

let map f t =
  {
    count = t.count;
    next = (fun () ->
      match t.next() with
      | None -> None
      | Some e -> Some (f e));
  }

and thus matching and allocating blocks all around...

Exceptions are really cheap in ocaml, better than pattern matching since
the OCaml stack unwinding is very efficient.

Nicolas Cannasse