From: Brian H. <bri...@ql...> - 2003-05-30 21:33:53
|
Count is called, generally, for two reasons: 1) To pre-reserve space for all the elements in the enumeration. This is how dynArray uses it. 2) To see if there are any more elements left in the enumeration (count > 0). The second case is the interesting one- it can always be O(1). And it's the information we most often need. So why not provide it directly? Iter and map then start looking like: let iter f t = let rec loop () = if t.has_next() then f (t.next()); loop () else () in loop () let fold f init t = let rec loop accu = if t.has_next() then loop (f (t.next()) accu) else accu in loop init In extString, we'd have: let enum l = let lr = ref l in Enum.make ~next:(fun () -> match !lr with | [] -> raise Enum.No_more_elements | h :: t -> lr := t; h ) ~count:(fun () -> length !lr ) ~has_next:(fun () -> match !lr with | [] -> false | _ -> true ) This gets rid of the annoying O(N) count cost. Brian |
From: Brian H. <bri...@ql...> - 2003-05-30 21:39:59
|
Sorry, forgot to mention: the default has_next calls count() and compares it against 0. Which means only in places where count() would be signifigantly more expensive than has_next() do we need to (or can) supply a seperate has_next(). Brian |
From: Nicolas C. <war...@fr...> - 2003-06-02 02:52:22
|
> Sorry, forgot to mention: the default has_next calls count() and compares > it against 0. Which means only in places where count() would be > signifigantly more expensive than has_next() do we need to (or can) supply > a seperate has_next(). Thanks Brian, I already had some thoughs about adding this has_next function , which can be useful. But we can't rewrite fold/iter using has_next because, has you mentionned, the default has_next will test count > 0 and the default count will force the enumeration of the data structure, resulting the building of an intermediate list, which is what we actually don't want to do. Seen like this, the has_next should be only used on the user side, and could be implemented by : let has_next t = t.count() > 0 with some documentation telling that it can be pretty ineficient (the first call only). Of course we could add the has_next function as a parameter to make but : - when we call make, we usually have a good count , so testing count > 0 is not so costly - when we don't call make (call from) we don't have a count... I can see few samples when we can peek data (without actually reading it) to see if there is some available, but I'm not sure they're worth it. - as the next function is returning a 'a option (and I still think that's the best way of doing it) the user will not need an has_next to loop. Nicolas Cannasse |
From: Brian H. <bri...@ql...> - 2003-06-02 20:57:34
|
On Mon, 2 Jun 2003, Nicolas Cannasse wrote: > But we can't rewrite fold/iter using has_next because, has you mentionned, > the default has_next will test count > 0 and the default count will force > the enumeration of the data structure, resulting the building of an > intermediate list, which is what we actually don't want to do. Actually, we could have the default do a lookahead, like: let default_has_next next count = let lookahead = ref None in let has_next' () = match lookahead with Some x -> true | None -> try lookahead := Some (next ()); true with No_more_elements -> false and next' () = match lookahead with Some x -> lookahead := None; x | None -> next() and count' () -> match lookahead with Some _ -> 1 + (count ()) | None -> count() in Enum.make ~has_next:has_next' ~next:next' ~count:count' And then override it when count is O(1). Note that a map always gets the count and has_next functions, so we only ever allocate a single Some block (note that next doesn't allocate a some block). > Of course we could add the has_next function as a parameter to make but : > - when we call make, we usually have a good count , so testing count > 0 is > not so costly I agree that this is the default case- a fast count. Which is why I was making it the default case. On the other hand, handling the case when count is expensive is more complex code, so maybe the library should be supplying that. > - as the next function is returning a 'a option (and I still think that's > the best way of doing it) the user will not need an has_next to loop. No. I was thinking of doing it for the library. Another question I have: how often are we going to get "deep maps" (i.e. more than 1 layer deep)? I expect the vast majority of cases are going to be either: 1) no maps at all (converting from one data structure to another), or 2) exactly one map (doing a single conversion). So that, on average, we have less than 1 map we're going through. Now, I want to encourage deep maps, as it's a form of wavefront processing and thus more efficient in general. But I don't think they will be common. In optimizing, you optimize for the common case. Brian |
From: Nicolas C. <war...@fr...> - 2003-06-04 02:50:32
|
> let default_has_next next count = > let lookahead = ref None in > let has_next' () = > match lookahead with > Some x -> true > | None -> try > lookahead := Some (next ()); > true > with No_more_elements -> false > and next' () = > match lookahead with > Some x -> > lookahead := None; > x > | None -> > next() > and count' () -> > match lookahead with > Some _ -> 1 + (count ()) > | None -> count() This is a little more complicated than that... Actually calling count() can make next being called ( if this is a count calling force ), and then the return of count() will be the total number of elements, including the lookhead, so you'll count 1 more element since you're adding 1. You could first call count and then match again the lookahead to see if still Some.... but that's pretty inneficient. You can have a look at current Enum.peek implementation, I think it resolves the problem in a nice and elegant way. Nicolas Cannasse |