From: Jesse G. <je...@wi...> - 2004-08-22 04:30:35
|
http://www.wingnet.net/~jesse/ocaml/dllist/index.html Read the page above for what I've changed. I can't wait to hear everyone's thoughts. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: Nicolas C. <war...@fr...> - 2004-08-22 10:05:13
|
> http://www.wingnet.net/~jesse/ocaml/dllist/index.html > > Read the page above for what I've changed. I can't wait > to hear everyone's thoughts. Could someone resume shortly the advantages of using a mutable double linked list against other existing ocaml data structures ? ( I guess it has to be mutable since immutable dllist sounds efficientless). Regards, Nicolas Cannasse |
From: Jesse G. <je...@wi...> - 2004-08-22 13:00:32
|
Nicolas Cannasse wrote: >> http://www.wingnet.net/~jesse/ocaml/dllist/index.html >> >> Read the page above for what I've changed. I can't wait >> to hear everyone's thoughts. > > Could someone resume shortly the advantages of using a mutable double > linked list against other existing ocaml data structures ? ( I guess it > has to be mutable since immutable dllist sounds efficientless). My particular application is to create a linked hashtbl, where the order of insertion is preserved during looping operations. I'm sure there are more though. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: Brian H. <bh...@sp...> - 2004-08-22 13:09:27
|
On Sun, 22 Aug 2004, Nicolas Cannasse wrote: > > http://www.wingnet.net/~jesse/ocaml/dllist/index.html > > > > Read the page above for what I've changed. I can't wait > > to hear everyone's thoughts. > > Could someone resume shortly the advantages of using a mutable double linked > list against other existing ocaml data structures ? ( I guess it has to be > mutable since immutable dllist sounds efficientless). O(1) removal of an element with a known location as opposed to O(log N). -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: skaller <sk...@us...> - 2004-08-22 15:31:26
|
On Sun, 2004-08-22 at 23:17, Brian Hurt wrote: > On Sun, 22 Aug 2004, Nicolas Cannasse wrote: > > > > http://www.wingnet.net/~jesse/ocaml/dllist/index.html > > > > > > Read the page above for what I've changed. I can't wait > > > to hear everyone's thoughts. > > > > Could someone resume shortly the advantages of using a mutable double linked > > list against other existing ocaml data structures ? ( I guess it has to be > > mutable since immutable dllist sounds efficientless). > > O(1) removal of an element with a known location as opposed to O(log N). More: dlists have the almost unique advantage of allowing well defined behaviour when iterating and simultaneously inserting and deleting. Another, more powerful data structure that allows this is the 'sorted map': Bascially, you can use a key as an iterator with the operation: get_key_after h key (which works even if key is not found). This is the right operation to use on B-trees and sorted maps like a linked hashtble if you're updating as you're iterating (its expensive but if there's only one or two cursors you can add caching .. ) Ocaml Hashtbl is a real pain here -- I have hashtbles of objects i need to process one after the other, and the processing can add new entries.. its the wrong data structure, but I need O(1) lookup too .. i actually have to build sets of keys to get any kind of well defined behaviour at all -- Hashtbl.iter isn't useful. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2004-08-22 15:37:54
|
On 23 Aug 2004, skaller wrote: > Another, more powerful data structure that allows this > is the 'sorted map': > > Bascially, you can use a key as an iterator with the operation: > > get_key_after h key This would be a usefull function to add to the standard map library, IMHO. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Jesse G. <je...@wi...> - 2004-08-22 21:11:22
|
Jesse Guardiani wrote: > http://www.wingnet.net/~jesse/ocaml/dllist/index.html > > Read the page above for what I've changed. I can't wait > to hear everyone's thoughts. I've been thinking very hard about removing the list type and making dllist a purely circular, node based list. This has a number of advantages: 1.) Greatly simplifies interface. No need for list vs node append/prepend functions. Everything is a node and the list concept is gone. No need for of_node function, first, last, etc... 2.) Simplifies per-node data structure. This means the list will be more memory efficient and slightly faster. 3.) More, I'm sure... However, I'm going to need some help designing this beast. Maybe you can help me work out the following problems. This is the type I was planning to use for nodes: type 'a node_t = { mutable data : 'a option; mutable next : 'a node_t; mutable prev : 'a node_t; };; Note that I planned to use the 'a option data type to represent empty lists. And since it's there, I figure we might as well use it to represent invalidated (deleted) nodes. I'm trying to keep this node type as simple as possible. If it exceeds the complexity of the node + list type system, then it ceases to be useful. Here are the problems I've thought up with the above node type: 1.) Now that the list type is gone, how do we grab a VALID node if the only node we know about has been invalidated? 2.) With the node + list type system, it is possible to detach a node while retaining the data within the node, and then either reattach the node to the same list or attach it to another list of the same type. This doesn't work if we use data=None to indicate an invalidated node. #2 can be solved with this type: type 'a node_t = { mutable data : 'a option; mutable next : 'a option node_t; mutable prev : 'a option node_t; };; But not only is this type more complex than the original node + list type (thus exceeding it's usefulness), it still doesn't solve #1. If we can think up a good way to solve these problems then I will be happy to rewrite dllist as a node-only list. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-22 22:06:25
|
On Mon, 2004-08-23 at 07:11, Jesse Guardiani wrote: > This is the type I was planning to use for nodes: > > type 'a node_t = { > mutable data : 'a option; > mutable next : 'a node_t; > mutable prev : 'a node_t; > };; > > Note that I planned to use the 'a option data type > to represent empty lists. Nah. You're still thinking of a list as an object. What you need to think of is a pool of nodes connected in certain (restricted) ways (namely -- doubly linked). The central concept is actually 'double link iterator' which in a C++ implementation would just be a pointer. In Ocaml, everything is a pointer anyhow -- due to boxing. So the node itself stands in for the iterator. Basically it is the *relationships* between the nodes that is interesting -- and relationships are not objects. The picture for a singly linked list is a partial order: a forest of lists with shared tails. THAT is what an slist is -- its a collection of arrows organised in a certain way. So you might encode: let cut_between d1 d2 = if d1 = d2 then None else .. Here None isn't an empty list: a dlist is a *cursor* it has to point to an actual node. None just means "there ain't no such dlist" :) > And since it's there, I figure > we might as well use it to represent invalidated (deleted) > nodes. You can't delete nodes. This is an FPL with a GC, only the GC deletes things. At best you can just forget the node so it becomes unreachable. What that means is any 'pointer' to a node always points to a node -- in particular in Ocaml it *is* a node :) The only kind of 'invalid' node is one where node.prev.next <> node, i.e. the dlink invariant is broken. The question is: is there any reason not to enure it is always preserved? In which case a deleted node is still a valid (singleton) list? The answer is yes -- the cost of adjusting its pointers to itself when perhaps you'll forget about it anyhow is a reason to allow invalid nodes. > I'm trying to keep this node type as simple as > possible. If it exceeds the complexity of the node + list > type system, then it ceases to be useful. > > Here are the problems I've thought up with the above node > type: > > 1.) Now that the list type is gone, how do we grab a VALID > node if the only node we know about has been invalidated? Grab the node you need beforehand. EG: let next = node.next in ignore(delete node); .. do something with 'next' now ... > 2.) With the node + list type system, it is possible to > detach a node while retaining the data within the node, > and then either reattach the node to the same list or > attach it to another list of the same type. This doesn't > work if we use data=None to indicate an invalidated node. yeah, so don't. Basically, all you have to do is ensure postcondition: node.prev.next = node.next.prev = node for all nodes, assumiung this as a precondition. > If we can think up a good way to solve these problems then > I will be happy to rewrite dllist as a node-only list. I think you are seeing a problem where none exists. There is a valid question whether to use: type dlist = node_t option or type dlist = Empty | Node of node_t to represent a possibly empty dlist. The latter is more definite that you're talking about a dlist, but it's isomorphic to the first solution. But don't use any options in the actual node_t structure, otherwise the invariant gets difficult to state and therefore to maintain and work with: you'd need something like: match d with | None -> true | Some node -> (match node.prev with | Some n -> n.next = Some node | None -> true ) && (match node.next with | Some p -> p.prev = Some node | None -> true ) as the invariant -- and your manipulators would likely be similarly complicated. Occams Razor -- empty dlists are more complicated than non-empty ones -- so they don't exist :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Jesse G. <je...@wi...> - 2004-08-22 22:28:38
|
On Sunday 22 August 2004 18:06, skaller wrote: > On Mon, 2004-08-23 at 07:11, Jesse Guardiani wrote: > > This is the type I was planning to use for nodes: > > > > type 'a node_t = { > > mutable data : 'a option; > > mutable next : 'a node_t; > > mutable prev : 'a node_t; > > };; > > > > Note that I planned to use the 'a option data type > > to represent empty lists. > > Nah. You're still thinking of a list as an object. I think you're confused. *You're* still thinking of a list as an object, as the below snippet with the list type indicates. I'm trying to think of a way to implement it PURELY as a collection of nodes, which I'm starting to think is a concept that is impossible to implement safely. I think you need to take a good look at my current (20040821) dllist code. It implements everything we've talked about, except a N(1) length function, in a safe manner. You can treat the list as a linear list, or as a circular list. It can be empty or contain nodes. You can invalidate nodes (delete them from the list) and anyone still holding a pointer to the invalidated node will receive an exception if they try to traverse it. In the future, it would be easy to write functions that allow nodes to be safely removed from one list and added to another, or added back into the same list at a later time. It's type safe. And it would take very little work to make it thread safe. [...] > I think you are seeing a problem where none exists. > There is a valid question whether to use: > > type dlist = node_t option > > or > > type dlist = Empty | Node of node_t I'm trying to find a way to safely get rid of dlist type. Why do you even bring this up? -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-23 03:22:14
|
On Mon, 2004-08-23 at 08:28, Jesse Guardiani wrote: > I think you're confused. *You're* still thinking of a list as an > object, as the below snippet with the list type indicates. > > I think you are seeing a problem where none exists. > > There is a valid question whether to use: > > > > type dlist = node_t option > > > > or > > > > type dlist = Empty | Node of node_t > > I'm trying to find a way to safely get rid of dlist type. Why do > you even bring this up? Because it is desirable sometimes to write total functions, and, some functions may need a domain like that above, for example, what would: cut d d return (assume it returns the list which is cut out, leaving the rest behind)? Perhaps this is an argument that 'cut' isn't a dlist primitive, and I truly am thinking too much of lists as objects :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-08-23 04:01:44
|
On Mon, 2004-08-23 at 13:22, skaller wrote: > Because it is desirable sometimes to write total functions, and, > some functions may need a domain like that above, Argg .. of course I meant codomain :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Jesse G. <je...@wi...> - 2004-08-23 13:15:41
|
skaller wrote: > On Mon, 2004-08-23 at 08:28, Jesse Guardiani wrote: > >> I think you're confused. *You're* still thinking of a list as an >> object, as the below snippet with the list type indicates. > >> > I think you are seeing a problem where none exists. >> > There is a valid question whether to use: >> > >> > type dlist = node_t option >> > >> > or >> > >> > type dlist = Empty | Node of node_t >> >> I'm trying to find a way to safely get rid of dlist type. Why do >> you even bring this up? > > Because it is desirable sometimes to write total functions, and, > some functions may need a domain like that above, for example, > what would: > > cut d d > > return (assume it returns the list which is cut out, leaving > the rest behind)? Well, I think that might be a bad example. It would obviously return d, after doing d.next<-d and d.prev<-d. However, a tougher question is this: cut d g Now you've cut out 4 nodes (assuming each letter of the alphabet is a node), so what do we return? In a dllist with a list type, you can always return a new list type that is linked to the 4 nodes. > Perhaps this is an argument that 'cut' isn't a dlist primitive, > and I truly am thinking too much of lists as objects :) Seems useful enough to me. Instead, I think it's just a good argument for the list type. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: Jesse G. <je...@wi...> - 2004-08-22 22:36:47
|
skaller wrote: [...] >> 1.) Now that the list type is gone, how do we grab a VALID >> node if the only node we know about has been invalidated? > > Grab the node you need beforehand. EG: > > let next = node.next in ignore(delete node); > .. do something with 'next' now ... This wouldn't work in a threaded environment. You have no guarantee that next will still be a valid node by the time you try to use it. In the current list implementation this is solved by always having (or being able to retrieve, even from an invalidated node) a pointer to the list type, which in turn always points to a valid head node. This raises an interesting point for me. Are all standard OCaml data structures thread safe by default? -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-23 03:35:14
|
On Mon, 2004-08-23 at 08:36, Jesse Guardiani wrote: > skaller wrote: > > [...] > > >> 1.) Now that the list type is gone, how do we grab a VALID > >> node if the only node we know about has been invalidated? > > > > Grab the node you need beforehand. EG: > > > > let next = node.next in ignore(delete node); > > .. do something with 'next' now ... > > This wouldn't work in a threaded environment. You have no > guarantee that next will still be a valid node by the time > you try to use it. Yes you do, since all nodes are valid. With one caveat, that the relinking operations actually preserve the dlist invariant. So you would be making a good point that in a threaded environment you'd need to make sure the relinking code was atomic. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Jesse G. <je...@wi...> - 2004-08-23 13:10:32
|
skaller wrote: > On Mon, 2004-08-23 at 08:36, Jesse Guardiani wrote: >> skaller wrote: >> >> [...] >> >> >> 1.) Now that the list type is gone, how do we grab a VALID >> >> node if the only node we know about has been invalidated? >> > >> > Grab the node you need beforehand. EG: >> > >> > let next = node.next in ignore(delete node); >> > .. do something with 'next' now ... >> >> This wouldn't work in a threaded environment. You have no >> guarantee that next will still be a valid node by the time >> you try to use it. > > Yes you do, since all nodes are valid. No, they aren't. Nodes can be invalidated. If you invalidate a node, it is effectively unlinked from the list. What happens if your next node is invalidated between the time you grab a pointer to it and the time you try to use it again? You can no longer rely on node.next and node.prev because their behavior in an unlinked node is undefined. In a dllist with a list type (i.e. my 20040821 code), you can always grab a pointer to the head of the list if this happens, but with a node-only list, you've got no where to go. > With one caveat, > that the relinking operations actually preserve the > dlist invariant. So you would be making a good point > that in a threaded environment you'd need to make sure > the relinking code was atomic. This goes back to my other question: Are all standard ocaml modules (i.e. Hashtbl) thread safe? If so, then I'll need to think of a way to make the link operations atomic. Are locks appropriate in a library that might be used in a threaded environment and might not? -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-23 16:55:20
|
On Mon, 2004-08-23 at 23:10, Jesse Guardiani wrote: > skaller wrote: > > Yes you do, since all nodes are valid. > > No, they aren't. Nodes can be invalidated. We're talking at cross purposes. In your design they can be invalidated. In mine they aren't. Using this record type: type 'a node_t = { mutable data: 'a; mutable next: 'a node_t; mutable prev :'a node_t; } all operations can enure node.next.prev = node && node.prev.next = node for all nodes, and hence there is no such thing as an invalid node. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Dustin S. <du...@sp...> - 2004-08-23 16:57:16
|
On Aug 23, 2004, at 6:10, Jesse Guardiani wrote: > No, they aren't. Nodes can be invalidated. If you invalidate a > node, it is effectively unlinked from the list. What happens if > your next node is invalidated between the time you grab a pointer > to it and the time you try to use it again? You can no longer > rely on node.next and node.prev because their behavior in an > unlinked node is undefined. Does this mean that deleting a range is O(N) for the size of the range? -- Dustin Sallings |
From: Brian H. <bh...@sp...> - 2004-08-23 13:52:42
|
On Sun, 22 Aug 2004, Jesse Guardiani wrote: > skaller wrote: > > [...] > > >> 1.) Now that the list type is gone, how do we grab a VALID > >> node if the only node we know about has been invalidated? > > > > Grab the node you need beforehand. EG: > > > > let next = node.next in ignore(delete node); > > .. do something with 'next' now ... > > This wouldn't work in a threaded environment. You have no > guarantee that next will still be a valid node by the time > you try to use it. In the current list implementation this > is solved by always having (or being able to retrieve, even > from an invalidated node) a pointer to the list type, which > in turn always points to a valid head node. One of the many reasons why mutable data structures are a bad idea. > > This raises an interesting point for me. Are all standard > OCaml data structures thread safe by default? Not to my knowledge. I have some experience in writting multi-threaded code- and in my experience, there is no good way to make the fundamental data structures thread safe *and* usefull. All too often you need to combine multiple operations into a single atomic operation. Also, you often need to combine the synchonization of multiple different data structures- consider the linked list + hash table example. No, synchronization issues have to be dealt with outside the library. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Jesse G. <je...@wi...> - 2004-08-23 18:15:42
|
Dustin Sallings wrote: > > On Aug 23, 2004, at 6:10, Jesse Guardiani wrote: > >> No, they aren't. Nodes can be invalidated. If you invalidate a >> node, it is effectively unlinked from the list. What happens if >> your next node is invalidated between the time you grab a pointer >> to it and the time you try to use it again? You can no longer >> rely on node.next and node.prev because their behavior in an >> unlinked node is undefined. > > Does this mean that deleting a range is O(N) for the size of the range? Yes, "deleting" would be. "Cutting" wouldn't. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: Dustin S. <du...@sp...> - 2004-08-23 18:45:23
|
On Aug 23, 2004, at 11:16, Jesse Guardiani wrote: > Yes, "deleting" would be. "Cutting" wouldn't. What is the difference between deleting and cutting? Does this mean that it's possible for more than one list to have references to the same nodes? If a node exists in more than one list, how is it invalidated? The validation marking really seems like a job better suited to the garbage collector. For example, imagine a circular implementation with no ``head'' kind of object. Take the following list: [a; b; c; d; e; f; g] ``Deleting'' [c; d; e] could be something like this: b.next <- f f.prev <- b c.prev <- e e.next <- c You've essentially split the list into two valid lists in an O(1) operation such that anyone who just has a reference to ``a'' will now have a valid list [a; b; f; g] and if anyone did have a reference to one of the deleted nodes, it's now also a valid list [c;d;e] since it has been detached from its parent list. It also reduces the types of mutations that have to occur back to the bare minimum, which is just the linkage between nodes. -- Dustin Sallings |
From: Jesse G. <je...@wi...> - 2004-08-23 19:03:37
|
On Monday 23 August 2004 2:45 pm, you wrote: > > On Aug 23, 2004, at 11:16, Jesse Guardiani wrote: > > > Yes, "deleting" would be. "Cutting" wouldn't. > > What is the difference between deleting and cutting? Deleting invalidates all nodes. Cutting doesn't invalidate any nodes at all. Instead, it makes a new circular list out of the nodes you have cut out of the old list. Note, however, that the current code only provides functions to delete one node at a time and doesn't provide a cut function. This isn't a shortcoming of the node/list type structure. I just haven't written a cut function yet. > Does this mean > that it's possible for more than one list to have references to the > same nodes? No. This is what the option backreference in the node type to the list type is for. It insures that each node is associated with exactly one list. > If a node exists in more than one list, how is it invalidated? Can't happen. > The validation marking really seems like a job better suited to the > garbage collector. I'm growing weary of speculation. How about you guys actually look at the code? -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-23 20:04:24
|
On Tue, 2004-08-24 at 05:04, Jesse Guardiani wrote: > On Monday 23 August 2004 2:45 pm, you wrote: > > > > On Aug 23, 2004, at 11:16, Jesse Guardiani wrote: > > > > > Yes, "deleting" would be. "Cutting" wouldn't. > > > > What is the difference between deleting and cutting? > > Deleting invalidates all nodes. Cutting doesn't invalidate any nodes > at all. > No. This is what the option backreference in the node type to the > list type is for. It insures that each node is associated with exactly > one list. It is not necessary to have the head pointer to ensure that: see below. > I'm growing weary of speculation. How about you guys actually look at > the code? Having a head pointer makes some operations O(N) instead of O(1). Here's the flavour I had in mind: ---------------------------------------------------- (* dlink soup *) type 'a node_t = { mutable data: 'a; mutable next: 'a node_t; mutable prev: 'a node_t; } (* invariant *) let check x = assert (x.prev.next = x); assert (x.next.prev = x) (* construction *) let singleton a = let rec x = { data=a; next=x; prev=x } in x (* accessor *) let get x = x.data (* special *) let is_tail x = x.next = x let is_head x = x.prev = x let is_singleton x = is_head x && is_tail x (* navigation *) let succ x = x.next let pred x = x.prev (* private utilities [break invariants] *) let link x y = x.next <- y; y.prev <- x let entail x = x.next <- x let enhead x = x.prev <- x (* chop a list in half *) let cut x = entail (pred x); enhead x (* remove one element from a list *) let del x = link (pred x) (succ x); link x x (* join tail x to head y *) let join x y = let xn = succ x and yp = pred y in link x y; xn.prev <- xn; yp.next <- yp (* swap the two tails *) let swap_tails x y = let yp = pred y in link (pred x) y; link yp x (* chop sublist, inclusive *) let chop h t = let hp = pred h and nt = succ t in cut h; cut nt; link hp nt -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Jesse G. <je...@wi...> - 2004-08-23 20:43:35
|
skaller wrote: > On Tue, 2004-08-24 at 05:04, Jesse Guardiani wrote: >> On Monday 23 August 2004 2:45 pm, you wrote: >> > >> > On Aug 23, 2004, at 11:16, Jesse Guardiani wrote: >> > >> > > Yes, "deleting" would be. "Cutting" wouldn't. >> > >> > What is the difference between deleting and cutting? >> >> Deleting invalidates all nodes. Cutting doesn't invalidate any nodes >> at all. > > >> No. This is what the option backreference in the node type to the >> list type is for. It insures that each node is associated with exactly >> one list. > > It is not necessary to have the head pointer > to ensure that: see below. > >> I'm growing weary of speculation. How about you guys actually look at >> the code? > > Having a head pointer makes some operations O(N) instead of O(1). > > Here's the flavour I had in mind: > > ---------------------------------------------------- > (* dlink soup *) > type 'a node_t = { > mutable data: 'a; > mutable next: 'a node_t; > mutable prev: 'a node_t; > } > > (* invariant *) > let check x = > assert (x.prev.next = x); > assert (x.next.prev = x) > > (* construction *) > let singleton a = > let rec x = { data=a; next=x; prev=x } in x > > (* accessor *) > let get x = x.data > > (* special *) > let is_tail x = x.next = x > let is_head x = x.prev = x > let is_singleton x = is_head x && is_tail x > > (* navigation *) > let succ x = x.next > let pred x = x.prev > > (* private utilities [break invariants] *) > let link x y = x.next <- y; y.prev <- x > let entail x = x.next <- x > let enhead x = x.prev <- x > > (* chop a list in half *) > let cut x = entail (pred x); enhead x > > (* remove one element from a list *) > let del x = > link (pred x) (succ x); > link x x > > (* join tail x to head y *) > let join x y = > let xn = succ x and yp = pred y in > link x y; xn.prev <- xn; yp.next <- yp > > (* swap the two tails *) > let swap_tails x y = > let yp = pred y in > link (pred x) y; link yp x > > (* chop sublist, inclusive *) > let chop h t = > let hp = pred h and nt = succ t in > cut h; cut nt; link hp nt Great. Two questions: 1.) How do you represent an empty list? 2.) Given the following (slightly modified for context with the above) code from your previous example: let node = (* get some node *) in let next = succ node in ignore(del node); (* .. do something with 'next' now ... *) How do you guarantee that 'next' is a node that is still attached to the same list it used to be attached to by the time you get get around to actually doing something with it? Your multi-threaded application might have accidentally or purposefully deleted that node in the meantime. This leads to the possibility of introducing avoidable bugs. Maybe you don't think this is a problem, but I do. Even if 'next' has been simply deleted from the original list, you now have no way to get back to that original list because all of your pointers to it are gone. If you had a list type, and a back reference to your list type in every node (like my code), then you would know immediately if your node had been deleted from the list. You'd receive an exception, and you could immediately grab the head of the list and attempt your operation again. Of course, if my code allowed nodes to be extracted from one list and added to a different list then my code would be in the same boat. But it's doesn't allow that right now, so it's currently a non issue. Other than those two items, I like it. I'm starting to dislike doubly linked lists in general though. They don't seem very safe. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-23 21:45:05
|
On Tue, 2004-08-24 at 06:44, Jesse Guardiani wrote: > skaller wrote: > Great. Two questions: > > 1.) How do you represent an empty list? There is no such thing, so it doesn't need a representation :) > 2.) Given the following (slightly modified for > context with the above) code from your > previous example: > How do you guarantee that 'next' is a node > that is still attached to the same list it > used to be attached to by the time you get > get around to actually doing something with > it? Who cares? Its a meaningless requrement in general. If YOU have some concept of 'the list' and being 'attached to it' the YOU the programmer must guarrantee it. For example -- if you are doing what I am doing. I am rewriting some sequences of instructions to other sequences. I have a list I scan, to find the patterns, and I may replace some sequences, and even insert things at the head and tail (typically labels). It is up to ME to decide, after doing this, where to continue on: should I rescan the new instructions? The whole list? Surely if I delete a NOP instruction I want to continue before or after it and not with it (since it is no longer followed by the 'rest' of the instructions). It is not so hard to organise this -- it is much harder to actually decide what the right thing to do actually is -- but that's a design issue, not a problem with the doubly linked representation I might use (i'm actually using a singly linked Ocaml kind of list at the moment, which can be a pain). BTW: I'm NOT saying don't have a 'list as container' object. The concept here is the simplest possible thing we can have, precisely to make building more sophisticated data structures with stronger invariants relatively easy (reuse and all that stuff :) You can either just obey a protocol in your usage, and/or you can encapsulate it in an encoding if you deem it worthwhile. > Your multi-threaded application might Leave multi-threading out of this for the moment? > have accidentally or purposefully deleted that > node in the meantime. It isn't possible to 'delete' anything. There is no 'delete' operation in Ocaml. What you mean, more generally, is that something 'mutated the thing from under you' -- and I don't have any general answers: don't use a mutable data structure is one thing to consider. > Maybe you don't think this is a problem, but > I do. Even if 'next' has been simply deleted > from the original list, you now have no way to > get back to that original list because all of > your pointers to it are gone. You made the list. "All of the pointers" are only gone if you're stupid enough to forget them, and you also unlinked the node <of course I mean 'you' in the general sense not you personally> YOU unlinked the node, you can't cry if you did that and then you can't find the next node -- because YOU made it so that there wasn't one. > Other than those two items, I like it. I'm starting > to dislike doubly linked lists in general though. > They don't seem very safe. The problem you raise with respect to multi-threading is a valid concern -- and pretty much universal when it comes to mutable data structures: to make a coherent modification you need a notion of transaction/atomic operation and a mutual exclusion lock, and that isn't enough either -- how come two threads are fiddling the same list? That's a *design* problem. What happens if two threads fiddle the same hashtable? Same thing -- at best you can protect the integrity of he data structure itself -- but that isn't always enough to protect the *logical* invariants the data is an encoding for. If you want to enure 'cut' and 'splice' actually maintain dlist invariants in an MT environment you can do: atomize (fun () -> cut x) where: let atomize f = lock g; f (); unlock g However this uses a global lock which is bad. Another technique would require a specific container object with its own atomize function which has a per-container lock. Neither technique will prevent you messing up the correspondence between two data structures that are supposed to remain synchronised. I think Nicolas has to set some policy here. However, dlists of the kind I described are pretty easy to protect compared to other messier mutable data structures :) Take you comment: If you had a list type, and a back reference to your list type in every node (like my code), then you would know immediately if your node had been deleted from the list. and extrapolate -- perhaps it solves ONE tiny problem, but if you generalise you can see it doesn't actually do very much -- its expensive, it's only useful in some circumstances, and it doesn't solve other related issues. If you *really* want strong guarrantees -- write purely functional code AND do it in Haskell so you can be dead sure it is purely functional .. and then provide a proof of correctness and check it with a mechanical theorem prover .. and you can still mess up your design :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Jesse D. G. <je...@wi...> - 2004-08-24 07:20:31
|
skaller wrote: [...] > > >> Maybe you don't think this is a problem, but >> I do. Even if 'next' has been simply deleted >> from the original list, you now have no way to >> get back to that original list because all of >> your pointers to it are gone. >> >> > >You made the list. "All of the pointers" are only gone >if you're stupid enough to forget them, and you also >unlinked the node <of course I mean 'you' in the general >sense not you personally> > >YOU unlinked the node, you can't cry if you did that >and then you can't find the next node -- because >YOU made it so that there wasn't one. > > I'm primarily concerned with this happening by accident, in a way that isn't obvious, but the test code I've written so far seems to indicate that it's not as easy to slip up as I originally thought. >>Other than those two items, I like it. I'm starting >>to dislike doubly linked lists in general though. >>They don't seem very safe. >> >> > >The problem you raise with respect to multi-threading >is a valid concern -- and pretty much universal >when it comes to mutable data structures: to make >a coherent modification you need a notion of >transaction/atomic operation and a mutual exclusion >lock, and that isn't enough either -- how come >two threads are fiddling the same list? That's >a *design* problem. > > OK, fair enough. I've never actually written any threaded code. I've just read books about it so far (but I plan to fix that in the near future, which is why I brought it up in the first place). I'll take your word for it at this point. >What happens if two threads fiddle the same hashtable? >Same thing -- at best you can protect the integrity >of he data structure itself -- but that isn't always >enough to protect the *logical* invariants the >data is an encoding for. > >If you want to enure 'cut' and 'splice' actually maintain >dlist invariants in an MT environment you can do: > > atomize (fun () -> cut x) > >where: > let atomize f = lock g; f (); unlock g > >However this uses a global lock which is bad. >Another technique would require a specific container >object with its own atomize function which has >a per-container lock. Neither technique will prevent >you messing up the correspondence between two >data structures that are supposed to remain >synchronised. > >I think Nicolas has to set some policy here. >However, dlists of the kind I described >are pretty easy to protect compared to other >messier mutable data structures :) > > Perhaps it's my lack of actual multi-threading experience that is the problem. I think I'll rewrite my dllist code to use your node based design. At worst, it will clean up the code and provide a consistent interface to an unsafe mutable data structure. At best, it might actually prove to be safe, reliable, and useful. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |