From: John G. <jgo...@co...> - 2004-05-10 12:57:47
|
Hello, I have a question about Enums in Extlib. They appear to be quite similar in concept to OCaml Streams. Streams have the added benefit of being supported by parsers when camlp4 is used. Glancing at the Enum interface, it appears that most/all of it could also be implemented as functions operating on Streams. Is there a particular reason that a new data type was invented? Is it still possible to switch to streams? -- John |
From: Nicolas C. <war...@fr...> - 2004-05-11 07:37:54
|
> Hello, > > I have a question about Enums in Extlib. They appear to be quite > similar in concept to OCaml Streams. Streams have the added benefit of > being supported by parsers when camlp4 is used. Glancing at the Enum > interface, it appears that most/all of it could also be implemented as > functions operating on Streams. > > Is there a particular reason that a new data type was invented? Is it > still possible to switch to streams? > > -- John Stream are used for parsers, and only have a minimal API. Their specifications are different. In particular, the count function is returning the number of readed elements on stream and enum are returning the number of available elements. Enum permit also the cloning of an enumeration at any time, and lazy operations such as map and filter can be stacked and then you can write fully lazy algorithms a la Haskell. More generaly, stream are input iterators while Enum can be both input and forward iterators. Regards, Nicolas Cannasse |
From: skaller <sk...@us...> - 2004-05-11 09:10:11
|
On Tue, 2004-05-11 at 17:36, Nicolas Cannasse wrote: > > Is there a particular reason that a new data type was invented? Is it > > still possible to switch to streams? > Enum permit also the cloning of an enumeration at any > time, This isn't quite correct. IF the enum represents a forward iterator, THEN the enum is clonable, otherwise it isn't. Clone is the key distinction between input and forward enums. Clone can be called on either, but the semantics are radically different: when you advance an input iterator all its siblings are invalidated, when you advance a forward iterator its siblings remain valid. Part of the problem with the Enum API at the moment is that the very design *converts* input iterators to forward ones (when you call 'force' input streams are loaded into an internal data structure). So weirdly you start with 'weak cloning', but after the forcing, your cloning is 'strong' meaning it produces forward iterators. STL does this more cleanly using adaptors. > More generaly, stream are > input iterators while Enum can be both input and forward iterators. Yes. However, what works and what doesn't in the two cases isn't clearly distinguished. Count function being the most obvious example. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-05-11 09:31:14
|
> > > Is there a particular reason that a new data type was invented? Is it > > > still possible to switch to streams? > > > Enum permit also the cloning of an enumeration at any > > time, > > This isn't quite correct. IF the enum represents a forward > iterator, THEN the enum is clonable, otherwise it isn't. > > Clone is the key distinction between input and forward enums. > Clone can be called on either, but the semantics are radically > different: when you advance an input iterator all its siblings > are invalidated, when you advance a forward iterator > its siblings remain valid. > > Part of the problem with the Enum API at the moment > is that the very design *converts* input iterators > to forward ones (when you call 'force' input streams > are loaded into an internal data structure). > > So weirdly you start with 'weak cloning', but > after the forcing, your cloning is 'strong' meaning > it produces forward iterators. > > STL does this more cleanly using adaptors. > > > More generaly, stream are > > input iterators while Enum can be both input and forward iterators. > > Yes. However, what works and what doesn't in the two > cases isn't clearly distinguished. Count function being > the most obvious example. I agree with you, any API returning an iterator should tells if it's an input or forward iterator. The thing I'm not sure yet is should we enforce that difference directly into the type itself . Having two different enum types, yet compatible with common functions, is subtyping. This can be done in two ways : - using objects, but they are not mainstream ocaml (the ocaml stdlib actually doesn't use them - neither most of the applications) - using polymorphic variants as an addtionnal type parameter : this might work but is a little bit complicated to use for users that might be interested in using enums (having type alias might help). Regards, Nicolas Cannasse |
From: skaller <sk...@us...> - 2004-05-11 13:23:12
|
On Tue, 2004-05-11 at 19:30, Nicolas Cannasse wrote: > I agree with you, any API returning an iterator should tells if it's an > input or forward iterator. Thought experiment. In STL, there are also output iterators, bidirectional iterators, and random iterators. In addition, forward, bidirectional, and random iterators may return mutable lvalues or immutable values. I am not suggesting all this classification be provided by Enums at the moment or even ever, but posing the question: how *would* these distinctions be represented? > The thing I'm not sure yet is should we enforce > that difference directly into the type itself . Agree. I suggest by default the answer is NO. At least initially, it would be a good step to simply *document* the semantic distinctions. > Having two different enum > types, yet compatible with common functions, is subtyping. I'm not so sure. Subtyping is basically a functor. The relationship here is between inductive and coinductive types (lists vs. streams). I don't think this is a subtyping problem. Perhaps an adjunction is closer to the right model. > This can be done > in two ways : > - using objects, but they are not mainstream ocaml (the ocaml stdlib > actually doesn't use them - neither most of the applications) The 'O' in Ocaml does mean 'object' :-) > - using polymorphic variants as an addtionnal type parameter I suspect this is 'phantom type trick' and I think I'm against it because it's just that: a trick. The true answer should be in the module system, but unfortunately we must have dynamic binding to simulate the lack of functorial polymorphism. This suggests, as you seem to be noting, that some kind of dynamic dispatch may be needed. Yet notice there is none in C++ STL. I think you also said you wanted a uniform interface, and I think this is what you meant. So we appear to have a contradiction. I have some guesses that it simply can't be done in Ocaml. It only works in C++ by allowing the typing to become unsound. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-05-11 13:36:44
|
one more thought. Consider the question: What does the laziness in Enum's buy you? There are two quite distinct answers: (1) Optimisation. For example you can 'stack up maps' and have the mappings applied 'when you need them' which could save memory and processing time. (2) Something like 'control inversion'. What I mean is: you can make a compound structure from, for example, streams, without having to unravel them to do so. For example 'make a stream of pairs from a pair of streams'. Point (2) isn't just a matter of optimistion, and this is the key problem with enums. It *can* make a differences *when* streams are unravelled. Partial solution: we MUST have the ability of the enum package to choose unravelling time to some extent (buffering isn't something we can do without). So a partial solution might be to BAN any use of coupled enumerations, to try to ensure that whilst the unravelling time is indeterminate, that indeterminacy will not change the specified behaviour of the package. For example 'make a stream of pairs from a pair of streams' is well defined if and only if the two streams are behaviourally independent -- successfully excluding stream_of_pairs file_handle file_handle from predictable behaviour because the two arguments -- being the same -- are indeed coupled (moving file pointer on one moves it on both ..) This is a constraint on the *client* expressed as a precondition for the post conditions of the enums package. Idea: try to rewrite specs of all functions in the Enum package using pre- and post-conditions. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2004-05-12 04:16:38
|
On 11 May 2004, skaller wrote: > On Tue, 2004-05-11 at 17:36, Nicolas Cannasse wrote: > > > > Is there a particular reason that a new data type was invented? Is it > > > still possible to switch to streams? > > > Enum permit also the cloning of an enumeration at any > > time, > > This isn't quite correct. IF the enum represents a forward > iterator, THEN the enum is clonable, otherwise it isn't. A side note: alterative implementations don't necessarily have this restriction. The OO iterator code I posted a couple of days ago doesn't have this restriction. I've been spending time working in C++ again recently. And I'm returning to my old philosophy of simply assuming if C++ does something, it's wrong. In this specific case, I don't see the advantage of having 18 gazillion different types of iterators/enums. You very rapidly need to know what data structure the iterator came from- at which point the prime benefit of iterators (genericity) has vanished. As an example, I *can* implement a "back" function on Ocaml's singly linked lists: let rec back orig_list curr_loc = match orig_list, curr_loc with | car :: ((foo :: _) as cdr), x :: _ -> if foo == x then orig_list else back cdr curr_loc | _ -> assert false ;; Only problem is- that function is O(n). Opps! So we don't provide it. But at what point is a function to expensive to provide? What if the back function is O(log n) cost- do you provide it or not? What if it's an expensive O(1) cost (lots of lseeks and small reads to find where the last line began, for example)? No. Keep things simple. If you need to do something more complicated with the data, either put it in a more advanced data structure, or require it already be in one (by taking an argument with a type of that data structure, not a enum). Personally, I think I'd ditch the count function before I'd introduce a second class of iterators. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: skaller <sk...@us...> - 2004-05-12 05:14:21
|
On Wed, 2004-05-12 at 14:22, Brian Hurt wrote: > > I've been spending time working in C++ again recently. And I'm returning > to my old philosophy of simply assuming if C++ does something, it's wrong. > In this specific case, I don't see the advantage of having 18 gazillion > different types of iterators/enums. In C++ it is simply a matter of documenting behaviour. The algorithm interfaces are quite uniform (independent of iterator type and kind). -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: John G. <jgo...@co...> - 2004-05-11 13:56:40
|
On Tue, May 11, 2004 at 11:36:31PM +1000, skaller wrote: > one more thought. Consider the question: > > What does the laziness in Enum's buy you? I'm still not quite sure I understand why Enum has laziness but streams don't. If you do a stream of a function, the functio ndoesn't evaluate anything until next() is called on the stream to retrieve the next valie. -- John > > There are two quite distinct answers: > > (1) Optimisation. For example you can > 'stack up maps' and have the mappings > applied 'when you need them' which could > save memory and processing time. > > (2) Something like 'control inversion'. > What I mean is: you can make a compound > structure from, for example, streams, > without having to unravel them to do so. > For example 'make a stream of pairs from > a pair of streams'. > > Point (2) isn't just a matter of optimistion, > and this is the key problem with enums. > It *can* make a differences *when* streams > are unravelled. > > Partial solution: we MUST have the ability > of the enum package to choose unravelling time > to some extent (buffering isn't something > we can do without). > > So a partial solution might be to BAN any > use of coupled enumerations, to try to > ensure that whilst the unravelling time > is indeterminate, that indeterminacy > will not change the specified behaviour > of the package. > > For example 'make a stream of pairs from > a pair of streams' is well defined > if and only if the two streams are > behaviourally independent -- successfully > excluding > > stream_of_pairs file_handle file_handle > > from predictable behaviour because > the two arguments -- being the same -- > are indeed coupled (moving file pointer > on one moves it on both ..) > > This is a constraint on the *client* > expressed as a precondition for the > post conditions of the enums package. > > Idea: try to rewrite specs of all functions > in the Enum package using pre- and post-conditions. > > -- > John Skaller, mailto:sk...@us... > voice: 061-2-9660-0850, > snail: PO BOX 401 Glebe NSW 2037 Australia > Checkout the Felix programming language http://felix.sf.net > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by Sleepycat Software > Learn developer strategies Cisco, Motorola, Ericsson & Lucent use to deliver > higher performing products faster, at low TCO. > http://www.sleepycat.com/telcomwpreg.php?From=osdnemail3 > _______________________________________________ > ocaml-lib-devel mailing list > oca...@li... > https://lists.sourceforge.net/lists/listinfo/ocaml-lib-devel |
From: skaller <sk...@us...> - 2004-05-11 14:34:56
|
On Tue, 2004-05-11 at 23:57, John Goerzen wrote: > On Tue, May 11, 2004 at 11:36:31PM +1000, skaller wrote: > > one more thought. Consider the question: > > > > What does the laziness in Enum's buy you? > > I'm still not quite sure I understand why Enum has laziness but streams > don't. > > If you do a stream of a function, the functio ndoesn't evaluate anything > until next() is called on the stream to retrieve the next valie. All good programmers and streams are intrinsically lazy :) The problem here is just when 'next' can be called. That's really the bottom line. If the enum uses buffering .. and they must, that's the whole point of the design, to use 'an efficient internal data structure' .. then 'next' is going to be called at unpredictable times. At least *some* indeterminacy is *required* to give the implementor freedom to arrange the buffering. As an example, two 'stacked maps' might be implemented by: let b = map f a in let c = map g b in or by: let c = map (g * f) a in The sequence of calls in the first case are: f f f f f f f f f g g g g g g g g g but in the second case are f g f g f g f g f g f g f g f g f g and I'm sure you can imagine functions 'f' and 'g' for which this makes a difference. This example is only meant to be an illustration of how an implementation detail can make a huge difference in stateful programming. The contradiction in the current design is that it tries to be 'purely functional' but at the same time also handle 'input iterators' in a seamless manner. To make this work, some constraints on both the client and Enum package are needed to ensure semantics are maintained. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |