From: skaller <sk...@us...> - 2004-05-27 20:52:02
|
Idea for partially fixing enum semnatics. (1) Move 'count()' up with the iterators. Count is specified to consume the enumeration. It doesn't actually have to. The current implementation is unchanged. (2) Add a function 'length: t->int option' This function guarrantees NOT to consume the enumeration. It returns the count (Some i) if it can do so without consuming the enumeration, otherwise it returns None. It is necessary to spell out the conditions under which you'll get a value here. For any 'in memory' data structure it will give Some result: basically any forward enumeration. Streams etc that have to be consumed return None. (3) Optionally deprecate fast_count. Technically 'length' above can be O(n) (or worse?) that is, not 'fast'. I'd probably prefer size: t -> int option to return Some result if it can be done O(1) otherwise None, but I'm not sure. It isn't clear why you'd ever bother to check fast_count anyhow .. if you need the count, you have to count it. If you don't, you won't even try. However, length is useful: suppose you want to make an array from an Enum: you have to know the length in advance, so you call length and throw an exception if you get None because it is impossible in that case. [OK, it isn't impossible .. but the idea is still there .. :] Comments? -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2004-05-28 01:49:15
|
OK, first of all, having worked with the current Enum's again in my extMap module (has that hit the list yet?), they're not as unwieldly to work with as I recall them being. As a side note, I wish Enum.make had an option to say count was slow. On 28 May 2004, skaller wrote: > Idea for partially fixing enum semnatics. > > (1) Move 'count()' up with the iterators. > Count is specified to consume the enumeration. > It doesn't actually have to. > The current implementation is unchanged. This is totally worthless. "this is how many items I just threw away". > > (2) Add a function 'length: t->int option' > > This function guarrantees NOT to consume the > enumeration. It returns the count (Some i) > if it can do so without consuming the enumeration, > otherwise it returns None. > > It is necessary to spell out the conditions > under which you'll get a value here. > For any 'in memory' data structure it will > give Some result: basically any forward enumeration. > Streams etc that have to be consumed return None. > > (3) Optionally deprecate fast_count. > Technically 'length' above can be O(n) (or worse?) > that is, not 'fast'. I'd probably prefer > size: t -> int option to return Some result > if it can be done O(1) otherwise None, > but I'm not sure. It isn't clear why you'd > ever bother to check fast_count anyhow .. > if you need the count, you have to count it. > If you don't, you won't even try. I actually changed the extMap implementation on thinking about this. Consider the enumeration of a list. You can make count O(1) by taking the O(n) hit when you create the enum. However, I can see many situations where you don't need count, and thus don't need to take the hit at all. Generally, I think count is only worthwhile if you're creating a data structure that needs to know how many items are in the enum- arrays (including, IIRC, dynarrays), I use it in extMap to turn an O(n log n) algorithm into an O(n) algorithm, etc. Try these semantics on for size: instead of fast_count, you get a has_count, which returns true if the enumeration has a count, false otherwise. If the enumeration does not have a count, it returns -1. In any case, count is gaurenteed to be no worse than O(n) cost. It is gaurenteed that Enum.to_list works without a count, so if you absolutely need a count and can not do without it (like Enum.to_array), you first convert the the enum to a list, and then get the count on the list and go from there. If you do not absolutely need a count but can find one usefull (like dynarray or extMap), then the function can exhibit different behavior if a count exists or not- for example, a function can be O(n) if a count exists, or O(n log n) if it doesn't. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: skaller <sk...@us...> - 2004-05-28 06:38:57
|
On Fri, 2004-05-28 at 11:56, Brian Hurt wrote: > > (1) Move 'count()' up with the iterators. > > Count is specified to consume the enumeration. > > It doesn't actually have to. > > The current implementation is unchanged. > > This is totally worthless. "this is how many items I just threw > away". Yes, that's what it does, but it isn't worthless. If the enumeration is forward, you can use a copy of it, which has not been consumed. If the enumeration is input, it makes no difference whether it is worthless or not because there is no other way to count such an enumeration than permanently consume it. Counting the number of lines in a file isn't entirely worthless .. unix wc does it :) > Try these semantics on for size: instead of fast_count, you get a > has_count, which returns true if the enumeration has a count, false > otherwise. If the enumeration does not have a count, it returns -1. In > any case, count is gaurenteed to be no worse than O(n) cost. > It is gaurenteed that Enum.to_list works without a count, so if you > absolutely need a count and can not do without it (like Enum.to_array), > you first convert the the enum to a list, and then get the count on the > list and go from there. If you do not absolutely need a count but can > find one usefull (like dynarray or extMap), then the function can exhibit > different behavior if a count exists or not- for example, a function can > be O(n) if a count exists, or O(n log n) if it doesn't. What I'd like to do is eliminate dynamic behaviour. I hate fast_count for that reason. I'd like to move more of the semantics into the static typing. C++ has no problem with this, although the type check doesn't happen until link time which is a pain. True, we should not copy the C++ technique for doing it. But anyhow my goal is to try and build a more static model of the forward/input iterator distinction. The reason is the same as why we all love Ocaml FP in the first place: the ability to actually reason about your code. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2004-05-28 17:39:35
|
On 28 May 2004, skaller wrote: > On Fri, 2004-05-28 at 11:56, Brian Hurt wrote: > > > > (1) Move 'count()' up with the iterators. > > > Count is specified to consume the enumeration. > > > It doesn't actually have to. > > > The current implementation is unchanged. > > > > This is totally worthless. "this is how many items I just threw > > away". > > Yes, that's what it does, but it isn't worthless. > If the enumeration is forward, you can use a copy > of it, which has not been consumed. > > If the enumeration is input, it makes no difference > whether it is worthless or not because there is no > other way to count such an enumeration than > permanently consume it. > > Counting the number of lines in a file isn't > entirely worthless .. unix wc does it :) OK, I'll backtrack on "totally useless". But "less usefull than it could be" definately applies. My current #1 gripe with enums is their imperitive behavior. Which, as near as I can tell, buys us nothing but trouble- take a long hard look at Enum.force and Enum.clone for what I mean. > > > Try these semantics on for size: instead of fast_count, you get a > > has_count, which returns true if the enumeration has a count, false > > otherwise. If the enumeration does not have a count, it returns -1. In > > any case, count is gaurenteed to be no worse than O(n) cost. > > It is gaurenteed that Enum.to_list works without a count, so if you > > absolutely need a count and can not do without it (like Enum.to_array), > > you first convert the the enum to a list, and then get the count on the > > list and go from there. If you do not absolutely need a count but can > > find one usefull (like dynarray or extMap), then the function can exhibit > > different behavior if a count exists or not- for example, a function can > > be O(n) if a count exists, or O(n log n) if it doesn't. > > What I'd like to do is eliminate dynamic behaviour. > I hate fast_count for that reason. I'd like to move > more of the semantics into the static typing. We have three choices: 1) We can have dynamic behavior- the same function is O(n) for some enums, O(n log n) for others, and have some way to tell which to do. 2) We can lose count, and take O(n log n) hits when we could do the same job in O(n) 3) We can lose "destructive" enums like reading in a file. If you want an enum of the lines of file, read them into an in-memory list, and create an enum on the list. I will comment that the biggest argument I've heard against a class-based implementation of enums is performance- calling a function on and object is slower (by some small, constant, number of clock cycles) than calling a function directly. Using an O(n log n) algorithm when we might have been able to use an O(n) algorithm is going to be a much bigger performance hit. Notice that the current algorithm is choice 3 (as I understand things)- if you try to do a count on a destructive enum, it calls Enum.force, which basically loads the enum into a list and then changes the enum to be an enum of that list. Notice that this means we have an imperitive semantic (enums) layered on top of an applicative semantic (lists) layered on top of an imperitive semantic (files). > > C++ has no problem with this, although the type check > doesn't happen until link time which is a pain. C++ uses overloading to deal with this. The if is still there, it's just hidden. > > True, we should not copy the C++ technique for doing it. > But anyhow my goal is to try and build a more static > model of the forward/input iterator distinction. One needs to fake the other, IMHO. Either can fake being the other. Choosing choice 2 (no count) means we assume all enumerations are imperitive and destructive. Choosing choice 3 (applicative) means we assume all enumerations have not-slow counts (no worse than O(n) time and O(log n) stack/memory to return a count of n), and we don't need a clone. Choosing choice 1 means we've choosen not to choose (which is always a choice), and expect everyone to deal with both types of enumerator. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Nicolas C. <war...@fr...> - 2004-05-28 18:40:47
|
> One needs to fake the other, IMHO. Either can fake being the other. > Choosing choice 2 (no count) means we assume all enumerations are > imperitive and destructive. Choosing choice 3 (applicative) means we > assume all enumerations have not-slow counts (no worse than O(n) time and > O(log n) stack/memory to return a count of n), and we don't need a clone. > Choosing choice 1 means we've choosen not to choose (which is always a > choice), and expect everyone to deal with both types of enumerator. I agree with Brian here. Since we don't want the forward/input iterator distinction to be enforced by the type of the enumeration itself, we have two choices left : having a flag that can tell about it - that's somehow what's fast_count is doing now - or enforcing it into the count function directly, with an int option return type. The problem is, what can do the people that want to count an input iterator ? Enum.count is nice because it's working all the time, fast_count is here to help if you care about performances. Of course, the complexity of a call to Enum.count can be specified such as taking at more O(n) time and O(log n) stack that looks reasonable. Specifying an O(n) count function is not nice, you should rely on Enum to do the work for you, and then treat your data as an input enumerator. Regards, Nicolas Cannasse |
From: Brian H. <bh...@sp...> - 2004-05-28 20:56:34
|
On Fri, 28 May 2004, Nicolas Cannasse wrote: > The problem is, what can do the people that want to count an input > iterator? There are three sorts of algorithms I can think of that use iterators: 1) Algorithms which don't need count, and wouldn't use it even if it existed. The canonical example here is ExtList.of_enum. 2) Algorithms which can use count, but don't need it. The canonical example here is building a balanced tree- I can build a balanced tree in O(n) if I know in advance how many elements I'm working with. Otherwise, it takes me O(n log n). 3) Algorithms which absolutely require a count, and can't be implemented without one. The canonical example here is Enum.to_array. Type 3 algorithms clash with "indefinate" enums (that don't have counts). So we have two choices- we either "fix" the indefinate enums so they have counts, or we "fix" the algorithms so they don't need a count from the enum. Interestingly enough, the basic strategy is exactly the same- convert the enum into a linked list (reading the entire enum into memory), and then count the elements in the linked list. The only question is which particular peice of code is responsible for doing the conversion? Personally, I think it's cleaner to solve the problem in Enum.to_array. Using the current imperitive semantics, to_array would look like: let rec to_array e = if has_count e then (* Normal, count-based implementation *) let c = count e in if c = 0 then [| |] else let init = next e in let retval = Array.make c init in for i = 1 to (c-1) do retval.(i) <- next e done; retval else (* convert the enum to a list, and then enumerate the list. *) to_array (ExtList.enum (ExtList.of_enum e)) ;; > Specifying an O(n) count function is not nice, you should rely on Enum > to do the work for you, and then treat your data as an input enumerator. I said "at most". If you can do a count in O(log n), or even O(1), then bully for you. But there are a number of times when I'll quite happily take an O(n) count, and still have it speed me up. Note that in the general case you should never have to call count more than once per enum- whoever is walking the enum can call count precisely once, just before starting to walk the enum, and then just keep a running count of how many items are left. The problem is that requiring an O(1) count (which is doable) can turn creating the enum from an O(1) algorithm to an O(n) algorithm. And then you might never use the count- there are a lot of uses for an enum which don't need a count. Don't take the O(n) hit unless you have to. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Nicolas C. <war...@fr...> - 2004-05-29 07:59:20
|
> let rec to_array e = > if has_count e then > (* Normal, count-based implementation *) > let c = count e in > if c = 0 then > [| |] > else > let init = next e in > let retval = Array.make c init in > for i = 1 to (c-1) do > retval.(i) <- next e > done; > retval > else > (* convert the enum to a list, and then enumerate the > list. *) > to_array (ExtList.enum (ExtList.of_enum e)) > ;; simply replace has_count by fast_count if it actually works :) Now if John want it so bad, I'm not against adding the following to enum : let poll_length e = if fast_count e then Some (count e) else None And actually I think we should rename fast_count or change its documentation to specify it more in terms of the enum "being consumed" (there is already a reference to a call to "force" being made, but that's maybe not clear enough). Regards, Nicolas Cannasse |
From: skaller <sk...@us...> - 2004-05-29 08:54:58
|
On Sat, 2004-05-29 at 17:57, Nicolas Cannasse wrote: > Now if John want it so bad, I'm not against adding the following to enum : > > let poll_length e = > if fast_count e then Some (count e) else None No, i don't want it 'so bad' -- at this point I'd urge you not to actually change anything. I'm merely doing 'thought experiments' in order to try to find a way to preserve the good features of Enum but also solve the forwrd/input iteration problem. I do not see an easy solution at the moment. I believe you don't either. So it would be wrong I think to destabilise the library doing 'half-assed' changes. > And actually I think we should rename fast_count or change its documentation > to specify it more in terms of the enum "being consumed" (there is already a > reference to a call to "force" being made, but that's maybe not clear > enough). The way I think about count is: really count is just an instantiation of fold where the function argument is just (fun acc _ -> acc + 1). In addition for *forward* enumerations the fold is actually applied to a clone so the original enumeration is not consumed. >From this viewpoint, count can be implement by the client, and shouldn't be in the library at all. However we all know that for some containers you can calculate count more quickly, and, for many input enumerations it is *also* possible to equip the enumeration with a non-consuming count. For example a enum of the chars in a file might be equipped with a 'stat' to count the chars without reading the file. A generator function which supplies 0,1,2,....n-1 is constructed to produce n values so we know its count() in advance. However 'enum' and 'enum with count' are two distinct data types. 'forward enum' can always be converted to 'forward enum with count'. The other issue arising here is: why are we making the 'count' function a special case? It is special, because the canonical Enum is '0 1 2 3 .. n-1' but that isn't quite enough of an argument: as I pointed out count is just a fold. Other folds can also be optimised. An important one is 'all elements satisfy P' for some predicate P, where there is a well known boolean shortcut to speed up the fold. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-05-28 18:54:44
|
On Sat, 2004-05-29 at 03:46, Brian Hurt wrote: > On 28 May 2004, skaller wrote: > OK, I'll backtrack on "totally useless". But "less usefull than it could > be" definately applies. This isn't really right though: in my proposal I also added a function 'length' which counts the length without consuming the enumeration. So you could write: match length x with Some i -> i | None -> count x to get the current semantics. The difference is that with the split up I suggested, it is *statically* assured length doesn't consume the enumeration. > My current #1 gripe with enums is their imperitive behavior. Which, as > near as I can tell, buys us nothing but trouble- take a long hard look at > Enum.force and Enum.clone for what I mean. This is part of the problem too, I agree. I don't have a complete solution. However the imperative behaviour is inescapable when dealing with input iterators over streams, etc: destruction is inevitable. My guess is that Enums and Iterators are dual. Dual doesn't mean 'the same'. So using a single interface to handle both is bound to create nasty problems. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2004-05-28 21:12:52
|
On 29 May 2004, skaller wrote: > On Sat, 2004-05-29 at 03:46, Brian Hurt wrote: > > On 28 May 2004, skaller wrote: > > > OK, I'll backtrack on "totally useless". But "less usefull than it could > > be" definately applies. > > This isn't really right though: in my proposal I also > added a function 'length' which counts the length without > consuming the enumeration. If you have a length function, why supply a count function? The rare occassion I want to line-count a file, writting a few lines of code to do so destructively isn't that hard: let linecount e = let rec loop c = if Enum.has_next e then let _ = e.next() in loop (c+1) else c in loop 0 ;; > > My current #1 gripe with enums is their imperitive behavior. Which, as > > near as I can tell, buys us nothing but trouble- take a long hard look at > > Enum.force and Enum.clone for what I mean. > > This is part of the problem too, I agree. I don't have a complete > solution. However the imperative behaviour is inescapable > when dealing with input iterators over streams, etc: destruction > is inevitable. In the OO Enum code I sent around a while ago, I "faked" applicative semantics over imperative semantics by implicitly making a linked list. Basically, an enum had two major functions- current, which returned the current/head/car value of the enum (of type 'a), and next, which returned an enum of the rest of the values- the tail or cdr value of the enum (of type 'a enum). For destructive enums, the object implicitly cached both the current and the next return values (only creating them as it needed to). Thus, if you walked the entire enum while keeping a reference to the head element, you effectively created a linked list of the entire enum. If you didn't keep a reference to the head element, elements became garbage after you moved past them. Note that this scheme worked exceptionally well if you only needed to "peak" at a couple of elements downstream, as I only instantiated those objects I needed to. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Bardur A. <oca...@sc...> - 2004-05-29 00:21:38
|
On Fri, May 28, 2004 at 04:19:43PM -0500, Brian Hurt wrote: > > > My current #1 gripe with enums is their imperitive behavior. Which, as > > > near as I can tell, buys us nothing but trouble- take a long hard look at > > > Enum.force and Enum.clone for what I mean. > > > > This is part of the problem too, I agree. I don't have a complete > > solution. However the imperative behaviour is inescapable > > when dealing with input iterators over streams, etc: destruction > > is inevitable. > > In the OO Enum code I sent around a while ago, I "faked" applicative > semantics over imperative semantics by implicitly making a linked list. > Basically, an enum had two major functions- current, which returned the > current/head/car value of the enum (of type 'a), and next, which returned > an enum of the rest of the values- the tail or cdr value of the enum (of > type 'a enum). For destructive enums, the object implicitly cached both > the current and the next return values (only creating them as it needed > to). Thus, if you walked the entire enum while keeping a reference to the > head element, you effectively created a linked list of the entire enum. > If you didn't keep a reference to the head element, elements became > garbage after you moved past them. > > Note that this scheme worked exceptionally well if you only needed to > "peak" at a couple of elements downstream, as I only instantiated those > objects I needed to. > To my mind it seems pretty clear that using the non-OO approach is a dead end because there are all these tradeoffs to consider (and where there is basically no single "right" way to do things). Now, I can't recall seeing your OO enum code on the list, but I had envisaged something like this: --- class virtual enum ['a] = object (self:'self) method virtual get : option 'a; method virtual clone : 'self; (* ...etc... *) end; class virtual countable = object (self:'self) method virtual count : int; end; class virtual fast_count = object (self:'self) inherit countable; method virtual count : int; method fast_count_tag = (); (* Prevent equivalence to slow_count *) end; class virtual slow_count = object (self:'self) inherit countable; method virtual count : int; method slow_count_tag = (); (* Prevent equivalence to fast_count *) end; (* Shortcuts *) class virtual countable_enum ['a] = object (self:'self) inherit enum ['a]; inherit countable; end; class virtual fast_countable_enum ['a] = object (self:'self) inherit enum ['a]; inherit fast_count; end; class virtual slow_countable_enum ['a] = object (self:'self) inherit enum ['a]; inherit slow_count; end; (* more if necessary *) (* Function using a somewhat specialized enumerator: *) value demo_function (e:countable_enum 'a) = (match e#get with [ None -> ... | Some x -> ... ]); --- (revised syntax, but the meaning should be pretty obvious). You get static validation of minimum required enum capabilities by constraining parameter types to the "least" qualities you need (like in the "demo_function" above). For each method in "enum" which one could imagine having different semantics, you separate it out into a separate interface -- e.g. if we want a "fast_clone" and "slow_clone", we can just add a "clonable" virtual class with the "clone" method, add the "fast_clonable" and "slow_clonable" virtual classes with appropriate tags. This, of course, creates O(n^2) "shortcuts" like the above, but you don't actually have to declare the less commonly used shortcuts, the implementor of a function which uses the less commonly used combinations can just declare them directly in their modules (all it requires is a few "inherit" lines). Creating enums for an ADT/File/whatever is also easy, you just create a class inheriting from the specialized virtual classes whose "contracts" you can fulfill and implement all the methods that you need to implement. I can only see three potential problems with this approach: - Slightly worse performance than when using pure functions. - The "cascaded maps" problem (as brought up in an old thread), i.e. not being able to collapse cascaded enum map applications. Please correct me, if I'm mistaken, but I believe this problem exists only because of a limitation in the OCaml type system (which could, in principle at least, be removed in future). - Manual 'upcasts' needed when calling functions which can (I can't see why OCaml can't automagically do these, but that's probably just my ignorance of the OCaml type system revealing itself) The only one of these problems which _I_ would consider a serious problem is that last one. In practical terms it means that (let e : fast_countable_enum 'a = ... something which creates an enum ... in demo_function e); becomes (let e : fast_countable_enum 'a = ... something which creates an enum ... in demo_function (e :> countable_enum 'a)); Is it too much to hope for that this can be fixed/worked around somehow? Of course, if you really want run-time-selectable behaviour then this approach is not that useful, but personally, I would much prefer that the characteristics of "clone", "count", etc. were compile-time fixed in cases where you actually need a specific behaviour -- and in cases were don't actually need a specific behaviour you could just specify a less specific behaviour as the requirement. -- Bardur Arantsson <ba...@im...> <ba...@sc...> - Be ot or bot ne ot, taht is the nestquoi. | Monty Python's Flying Circus |
From: skaller <sk...@us...> - 2004-05-29 06:59:15
|
On Sat, 2004-05-29 at 10:21, Bardur Arantsson wrote: > On Fri, May 28, 2004 at 04:19:43PM -0500, Brian Hurt wrote: > To my mind it seems pretty clear that using the non-OO > approach is a dead end because there are all these > tradeoffs to consider [Class factorisation exhibited] Why can't this also be done with functors, modules, and constraints? It may be harder to write the code than using classes, but is it possible? Is there anything with classes that you cant do here with functors, if so what? -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Bardur A. <oca...@sc...> - 2004-05-29 07:32:41
|
On Sat, May 29, 2004 at 04:59:10PM +1000, skaller wrote: > On Sat, 2004-05-29 at 10:21, Bardur Arantsson wrote: > > On Fri, May 28, 2004 at 04:19:43PM -0500, Brian Hurt wrote: > > > To my mind it seems pretty clear that using the non-OO > > approach is a dead end because there are all these > > tradeoffs to consider > > [Class factorisation exhibited] > > Why can't this also be done with functors, > modules, and constraints? > > It may be harder to write the code than using classes, > but is it possible? Is there anything with classes > that you cant do here with functors, if so what? Actually, I have no idea. :) I haven't really dabbled with the whole functor/constraints thing... I just thought classes were an obvious way to implement this. :) -- Bardur Arantsson <ba...@im...> <ba...@sc...> - A poignant plea, sir. Enough to melt the stoniest of hearts, but the answer, I'm afraid, must remain, "You're going to die, fat pig". Edmund Blackadder | Blackadder III |
From: Brian H. <bh...@sp...> - 2004-05-29 15:13:57
|
On 29 May 2004, skaller wrote: > On Sat, 2004-05-29 at 10:21, Bardur Arantsson wrote: > > On Fri, May 28, 2004 at 04:19:43PM -0500, Brian Hurt wrote: > > > To my mind it seems pretty clear that using the non-OO > > approach is a dead end because there are all these > > tradeoffs to consider > > [Class factorisation exhibited] > > Why can't this also be done with functors, > modules, and constraints? > > It may be harder to write the code than using classes, > but is it possible? Is there anything with classes > that you cant do here with functors, if so what? > > I want to be able to call a generic "next" function on any enumeration- a next function that needs state bound to it. The current implementation uses partial application to bind state to the function, my implementation used objects. I don't know how to bind state to a function using modules. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Nicolas C. <war...@fr...> - 2004-05-29 07:55:07
|
> To my mind it seems pretty clear that using the non-OO > approach is a dead end because there are all these > tradeoffs to consider (and where there is basically no > single "right" way to do things). > > Now, I can't recall seeing your OO enum code on the list, > but I had envisaged something like this: [...] The main advantages of using OO are : - subtyping - types are not module-bound (see RFC for CommonIO on caml-list) Since we already have an Enum module, we don't care about the second. And for the first , we actually don't need OO, please consider the following sample : --- type countable type not_countable type ('a,'b) enum val get : ('a,'b) enum -> 'a option val count : ('a,countable) enum -> int --- enum.ml type countable = unit type not_countable = unit --- We're using here an abstract type to type-check the count function usage . But I personaly think that's not worth it, and that we don't need to enforce the "countable" property directly into the type of the enum (or the return type of count). KISS is the rule for enums. Regards, Nicolas Cannasse |
From: Bardur A. <oca...@sc...> - 2004-05-29 08:13:03
|
On Sat, May 29, 2004 at 09:53:39AM +0200, Nicolas Cannasse wrote: > > To my mind it seems pretty clear that using the non-OO > > approach is a dead end because there are all these > > tradeoffs to consider (and where there is basically no > > single "right" way to do things). > > > > Now, I can't recall seeing your OO enum code on the list, > > but I had envisaged something like this: > [...] > > The main advantages of using OO are : > - subtyping > - types are not module-bound (see RFC for CommonIO on caml-list) > > Since we already have an Enum module, we don't care about the second. > And for the first , we actually don't need OO, please consider the following > sample : > > --- > > type countable > type not_countable > > type ('a,'b) enum > > val get : ('a,'b) enum -> 'a option > val count : ('a,countable) enum -> int > > --- enum.ml > > type countable = unit > type not_countable = unit > > --- > We're using here an abstract type to type-check the count function usage . But this isn't really extensible in that client code needs changes when the enum type is "expanded" with, say, "clonable". That's essentially what OO buys us in this case. With the OO approach no such changes are necessary (unless the code actually _needs_ to distinguish between, say, fast_clonable and slow_clonable). > [...] and that we don't need to enforce the "countable" > property directly into the type of the enum (or the > return type of count). I'm not really sure what you mean by this, could you clarify? > KISS is the rule for enums. > I think the suggested OO scheme *is* keeping it simple (well modulo the necessary upcast, I'm still wondering why OCaml thinks that it is necessary...) while still achieving static guarantees which are much stronger that run-time "has_count", "has_fast_count", "is_fast_clonable", etc. -- Bardur Arantsson <ba...@im...> <ba...@sc...> - This is Mr. Death. He's a reaper. | Monty Python's The Meaning of Life |
From: skaller <sk...@us...> - 2004-05-29 09:28:40
|
On Sat, 2004-05-29 at 18:12, Bardur Arantsson wrote: > On Sat, May 29, 2004 at 09:53:39AM +0200, Nicolas Cannasse wrote: > I think the suggested OO scheme *is* keeping it simple > (well modulo the necessary upcast, I'm still wondering why > OCaml thinks that it is necessary...) The reason is that Ocaml is quite strict about matching: unlike with generative classes (nominal typing?) where you are chosing between a fixed set of known base types for an upcast, in Ocaml the set of possible 'upcasts' is unbounded. So sometimes, even when you see only one possibility for an upcast, you need to remember that Ocaml isn't infering types in the same order as you do. This problem also exists for polymorphic variants. It is occasionally annoying, but that's it: a bit annoying is a small price for what you get. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Bardur A. <oca...@sc...> - 2004-05-29 09:35:45
|
On Sat, May 29, 2004 at 07:28:31PM +1000, skaller wrote: > On Sat, 2004-05-29 at 18:12, Bardur Arantsson wrote: > > On Sat, May 29, 2004 at 09:53:39AM +0200, Nicolas Cannasse wrote: > > > I think the suggested OO scheme *is* keeping it simple > > (well modulo the necessary upcast, I'm still wondering why > > OCaml thinks that it is necessary...) > > The reason is that Ocaml is quite strict about matching: > unlike with generative classes (nominal typing?) > where you are chosing between a fixed set of known base types > for an upcast, in Ocaml the set of possible 'upcasts' > is unbounded. Ah, yes, of course. -- Bardur Arantsson <ba...@im...> <ba...@sc...> - Ah, but you're forgetting Rimmer directive 273 which states just as clearly, "No chance you metal bastard!" Arnold Rimmer | Red Drarf |
From: skaller <sk...@us...> - 2004-05-29 09:10:19
|
On Sat, 2004-05-29 at 17:53, Nicolas Cannasse wrote: > The main advantages of using OO are : > - subtyping Modules also admit subtyping, and the subtyping is both more expressive and stronger I think: every constraint: module x = y : t is a subtyping coercion. The *disadvantage* here is that the subtyping is compile time only: OO provides dynamic binding at the *expense* of losing full subtyping. Proof: covariance problem. Modules have no problem with that. Class typing cannot handle it. EG: modules have no problem adding two kinds of integer up for arbitrary kinds, classes cannot ever do this correctly. Clearly the module case works because the type system enforces the need for the client to supply the addition method for each pair of integer kinds .. for methods it is impossible. For IO, covariance isn't an issue usually because the arguments of methods are usually invariant types like characters or counts. So classes are good for IO. However Enums are *algebraic* data types in principle, and one certainly expects covariance to be important for functions like map, map2, concat, etc.. > - types are not module-bound (see RFC for CommonIO on caml-list) Yes, that's a bit of technical difficulty with Ocaml. > KISS is the rule for enums. As for all good software :) However one must consider the simplicity of the library, and the simplicity of code that uses it too: in the latter case 'simplicity' includes 'ability to reason about the code'. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-05-29 06:24:52
|
On Sat, 2004-05-29 at 07:19, Brian Hurt wrote: > If you have a length function, why supply a count function? Well, Nicolass original specified that to handle both forward and input enums he (a) wanted to use a modular interface (not a class) (b) wanted minimal changes to the existing interface So the rationale behind the proposal is: what we really should have is two distinct interfaces, and we really don't want to have two functions 'count' and 'length', but having them both might allow a single interface to act as two interfaces if you use it appropriately. You can do that now perhaps with dynamic checks, but the use with 'count' and 'length' is easier to reason about because when you see an algorithm calling 'length' you can judge it won't consume the enum, and therefore is appropriate for either input or forward enums. When you see a call to count, even with a dynamic check before calling it, it is harder to reason that subsequent processing makes any sense for input Enums. The thing is if you have already forced the input enum, you have an internal forward enum, and count() then isn't destructive (the force was instead). But again that's a dynamic condition. Ability to reason about correctness of code and whether certain invariants are preserved is intrinsically vital when dealing with input enums: you really must know exactly when they're consumed. So having 'length()' function is supposed to help by promising that it won't consume an enumeration. Whether this actually works in real algorithms I don't know: it was just an idea, motivated by he concept that we desire to obtain *static* assurances of correctness and performance wherever possible. I note that for C++ iterators, it is partly possible to do such static reasoning, but still quite difficult: there is no type checking of input/forward distinction. However forward/bidirectional/random *are* checked in the sense that bidrectional and random iterators must be equipped with additional functions (--i for bidirectional and i[index] for random) Of course this is a weak test, but quite effective in practice at catching errors. The input/forward distinction is actually easier to check with Enum because in Ocaml cloning is an explicit operation, whereas in C++ the copy constructor is needed even for input iterators, and the C++ Standard has to talk about when such copies are invalidated. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |