From: Alan P. <ap...@re...> - 2003-06-02 09:15:00
|
In article <000d01c328df$e687e6b0$2713f9ca@WARP>, Nicolas Cannasse wrote: > >> Another possibility would be to be clever about having fold and iter >> use (e.count ()) when it is fast, and use exception catching when >> (e.count ()) is slow, without actually having separate types. > > Interesting. In the first time, I would say this is a good idea, since with > your benchs we can see that there is little improvement when iterating using > a O(1) counter against using an exception . But how can we tell that > e.count() is fast ? We can surely tells when it is slow : Enum.from could > put a boolean to true saying that the "worst count" will be called. But what > about Enum.make ? For example , ExtList.enum has an O(N) count, since we > don't want to count the elements we creating an enum ( this is the lazy > approach : do it only when we need it ). > When is the O(N) count faster than the exception catching ? I would say > "depends on the size of the list"... but we don't know the size :-) So if we > can classify "bad" counts we can't do it for "good" counts, and this > optimization will not work... It doesn't have to be perfect to be an improvement. Enums from arrays or dynarrays, for instance, have fast counts. Enum.map would inherit the fast_count flag, whereas most other enum transforms (e.g. Enum.filter) would set the fast_count flag to zero. This could also be done by keeping the stream and enum types separate, with the stream type having no count. Either way would give the performance benefit; it's really a matter of making a nice library API for people. Doing it with types, there would be: Enum.map: ('a -> 'b) -> 'a enum -> 'b enum Stream.map: ('a -> 'b) -> 'a stream -> 'b stream Enum.filter: ('a -> bool) -> 'a enum -> 'a stream *** Note *** Stream.filter: ('a -> bool) -> 'a stream -> 'a stream Enum.iter: ('a -> unit) -> 'a enum -> unit Stream.iter: ('a -> unit) -> 'a stream -> unit where Enum.iter and Enum.filter use count, and Stream.iter and Stream.filter use a try-catch. List.stream: 'a list -> 'a stream List.from_stream: 'a stream -> 'a list List.from_enum: 'a enum -> 'a list Array.enum: 'a array -> 'a enum Array.from_stream: 'a stream -> 'a array Array.from_enum: 'a enum -> 'a array There is no List.enum, since you can't count a list easily. List.from_stream does a try-catch, and List.from_enum uses count. Of course reading lines from a file handle would be a stream, rather than an enum. There could be a function: Enum.from_stream: 'a stream -> 'a enum with warnings on it that it may be rather expensive, to do the Enum.force magic. Array.from_stream could be implemented as: let from_stream s = from_enum (Enum.from_stream s);; I don't see how a Stream.from_enum function would be useful. >> So I think the question is: is it worth a 10-20% performance hit in >> the library to unify the stream and enum types? > > Which unification ? You mean, using None/Some next function ? I am not at all suggesting using options to do the end-of-stream notification. Certainly I think that using exceptions is a much better way to go. The "unification" I was referring to was the decision about whether to have distinct enum and stream types. The "performance hit" I was referring to was the cost of using exceptions to iterate over an enum when you could have used the count. |