From: Jesse G. <je...@wi...> - 2004-08-26 23:40:25
|
This time it's purely node based. I haven't implemented enums yet, as I don't know what they do. :) Most of the code has been tested, but there might be a few bugs left. Feature requests are welcome, as long as the feature can be implemented safely (i.e. forget about a multi-node O(1) cut function. There's no way to guarantee that the two nodes given as arguments are part of the same list) Here's what I've got so far: (* * Dllist- a mutable, circular, doubly linked list library * Copyright (C) 2004 Brian Hurt * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public * License as published by the Free Software Foundation; either * version 2.1 of the License, or (at your option) any later version, * with the special exception on linking described in file LICENSE. * * This library is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * Lesser General Public License for more details. * * You should have received a copy of the GNU Lesser General Public * License along with this library; if not, write to the Free Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA *) type 'a node_t = { mutable data : 'a; mutable next : 'a node_t; mutable prev : 'a node_t; };; exception Empty;; let create x = let rec nn = { data = x; next = nn; prev = nn} in nn;; let length node = let rec loop cnt n = if n = node then cnt else loop (cnt + 1) n.next in loop 1 node.next ;; let append node elem = let nn = { data = elem; next = node.next; prev = node } in node.next.prev <- nn; node.next <- nn; nn ;; let prepend node elem = let nn = { data = elem; next = node; prev = node.prev } in node.prev.next <- nn; node.prev <- nn; nn ;; let remove node = let next = node.next in node.prev.next <- next; next.prev <- node.prev; node.prev <- node; node.next <- node; ;; let drop node = let next = node.next in node.prev.next <- next; next.prev <- node.prev; node.prev <- node; node.next <- node; next ;; let rev_drop node = let prev = node.prev in prev.next <- node.next; node.next.prev <- prev; node.prev <- node; node.next <- node; prev ;; let merge node1 node2 = let next = node1.next in let prev = node2.prev in node1.next <- node2; node2.prev <- node1; next.prev <- prev; prev.next <- next; ;; let set node data = node.data <- data;; let get node = node.data;; let next node = node.next;; let prev node = node.prev;; let jump node idx = let m = if idx > 0 then -1 else 1 in let rec loop idx n = if idx == 0 then n else loop (idx + m) n.next in loop idx node ;; let iter f node = let () = f node.data in let rec loop n = if n <> node then let () = f n.data in loop n.next in loop node.next ;; let fold_left f init node = let rec loop accu n = if n = node then accu else loop (f accu n.data) n.next in loop (f init node.data) node.next ;; let fold_right f node init = let rec loop accu n = if n = node then f n.data accu else loop (f n.data accu) n.prev in loop init node.prev ;; let map f node = let first = create (f node.data) in let rec loop last n = if n = node then begin first.prev <- last; first end else begin let nn = { data = f n.data; next = first; prev = last } in last.next <- nn; loop nn n.next end in loop first node.next ;; let copy node = map (fun x -> x) node;; let to_list node = fold_right (fun d l -> d::l) node [];; let of_list lst = match lst with | [] -> raise Empty | h :: t -> let first = create h in let rec loop last = function | [] -> last.next <- first; first.prev <- last; first | h :: t -> let nn = { data = h; next = first; prev = last } in last.next <- nn; loop nn t in loop first t; ;; -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: Brian H. <bh...@sp...> - 2004-08-27 00:13:37
|
On Thu, 26 Aug 2004, Jesse Guardiani wrote: > (* > * Dllist- a mutable, circular, doubly linked list library > * Copyright (C) 2004 Brian Hurt > * Jesse, if you're stepping up to maintain this code, you should probably change this like to: * Copyright (C) 2004 Brian Hurt, Jesse Gaurdiani -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Jesse D. G. <je...@wi...> - 2004-08-27 03:39:40
|
Brian Hurt wrote: >On Thu, 26 Aug 2004, Jesse Guardiani wrote: > > > >>(* >> * Dllist- a mutable, circular, doubly linked list library >> * Copyright (C) 2004 Brian Hurt >> * >> >> > >Jesse, if you're stepping up to maintain this code, you should probably >change this like to: > * Copyright (C) 2004 Brian Hurt, Jesse Gaurdiani > > I didn't want to change it since it's GPL, but I certainly don't mind. Thanks. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-27 12:45:12
|
On Fri, 2004-08-27 at 13:39, Jesse D. Guardiani wrote: > Brian Hurt wrote: > > >On Thu, 26 Aug 2004, Jesse Guardiani wrote: > > > > > > > >>(* > >> * Dllist- a mutable, circular, doubly linked list library > >> * Copyright (C) 2004 Brian Hurt > >> * > >> > >> > > > >Jesse, if you're stepping up to maintain this code, you should probably > >change this like to: > > * Copyright (C) 2004 Brian Hurt, Jesse Gaurdiani > > > > > > I didn't want to change it since it's GPL, but I certainly don't mind. > Thanks. If its GPL then it cannot be used in Extlib. If you modify it you SHOULD add your authorship to the copyright notice, otherwise the notice would incorrectly seem to indicate that Brian is fully responsible -- which might be construed as fraud. (Specifically, 'passing off', which means misattributing the authorship. Usually that's passing off some work someone else did as your own, but the reverse can apply as well). This is true for almost all licences, even very liberal FFAU licences. The one thing an author does NOT want is for some code he *didn't* write and isn't responsible for to cause some problem, and have the user complain to him about it. If you modify some open source software -- you should take the first line of responsibility for the whole of it. It is customary to also give the whole credit to the upstream author. Most people say something like "All credit goes to Brian, and all the bugs are mine" or some such semi-humorous line. Anyhow I'm just trying to point out you aren't being generous in failing to attribute the authorship correctly. If your code is good, you deserve some of the credit and should take it. If there are bugs, you deserve some of the blame and should take it :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2004-08-27 12:52:09
|
On 27 Aug 2004, skaller wrote: > If its GPL then it cannot be used in Extlib. It's LGPL + standard exception, like the rest of Extlib and the Ocaml standard libraries. I copied the copyright notice from extList, IIRC. > This is true for almost all > licences, even very liberal FFAU licences. The one thing > an author does NOT want is for some code he *didn't* write > and isn't responsible for to cause some problem, and have > the user complain to him about it. I was mainly thinking "you're doing the work- you should get the credit". I actually considered simply replacing my name with his on the copyright line. While I wouldn't mind, I decided it was a bad idea to encourage people to do things that other people might mind. > > If you modify some open source software -- you should > take the first line of responsibility for the whole > of it. It is customary to also give the whole credit > to the upstream author. Most people say something like > > "All credit goes to Brian, and all the bugs are mine" > > or some such semi-humorous line. All hail the great and glorious Brian. Never mind that programmer behind the curtain. :-) Seriously- he's doing serious, material improvements to the code, and deserves co-author credits. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: Jesse G. <je...@wi...> - 2004-08-27 13:57:32
|
Brian Hurt wrote: > On 27 Aug 2004, skaller wrote: > >> If its GPL then it cannot be used in Extlib. > > It's LGPL + standard exception, like the rest of Extlib and the Ocaml > standard libraries. I copied the copyright notice from extList, IIRC. Yeah, I'd be greatly (and pleasantly) surprised if Nicolas wants dllist in ExtLib though. I'm mainly just posting here to get people's take on the code and announce availability. In that light, I hope the semi-off-topic nature of the dllist threads hasn't offended anyone. -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-27 14:38:55
|
On Fri, 2004-08-27 at 23:57, Jesse Guardiani wrote: > Yeah, I'd be greatly (and pleasantly) surprised if Nicolas wants dllist > in ExtLib though. Huh? Quite a few of us have said we'd like some doubly linked list facility in Extlib. Extlib was intended to augment the set of fundamental data structures (and functions on them) provided by the Ocaml standard library. I have no doubt some kind of doubly linked list belongs in Extlib. Exactly what is another issue :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Jesse G. <je...@wi...> - 2004-08-27 18:17:55
|
skaller wrote: > On Fri, 2004-08-27 at 23:57, Jesse Guardiani wrote: > >> Yeah, I'd be greatly (and pleasantly) surprised if Nicolas wants dllist >> in ExtLib though. > > Huh? Quite a few of us have said we'd like some doubly > linked list facility in Extlib. Extlib was intended > to augment the set of fundamental data structures > (and functions on them) provided by the Ocaml standard library. Sorry, I'm a pessimist. I prefer to expect the worst and be pleasantly surprised by good fortune than be disappointed constantly. I think that attitude fits fairly well with this forum, based on past experience. :) > I have no doubt some kind of doubly linked list > belongs in Extlib. Exactly what is another issue :) Well, since you seem to think there is hope for Dllist becoming a part of ExtLib (and I'm certainly willing to do any work necessary to make it happen), what does our self proclaimed benevolent dictator say? -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-27 18:45:10
|
On Sat, 2004-08-28 at 04:18, Jesse Guardiani wrote: > Sorry, I'm a pessimist. I prefer to expect the worst and > be pleasantly surprised by good fortune than be disappointed > constantly. I think that attitude fits fairly well with this > forum, based on past experience. :) Heh .. in this forum I'd be a lunatic optimist, compared to the ISO C++ committee where, for example, after patiently explaining the merits of closures and lexical scoping, and showing how to implement it .. half the group including Stroustrup didn't think it was important enough to warrant adding to the language. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-08-27 21:15:39
|
> > I have no doubt some kind of doubly linked list > > belongs in Extlib. Exactly what is another issue :) > > Well, since you seem to think there is hope for Dllist becoming > a part of ExtLib (and I'm certainly willing to do any work > necessary to make it happen), what does our self proclaimed > benevolent dictator say? Well I might be a dictator but I don't have alone CVS write access, so I might be fired if I perform badly in my dictator job ;) In general, when I new proposal arise, if I see directly how it can be used it will be put in ExtLib quickly, that's an administrative privilege to speedup things a little bit. However when I'm not conviced - for exemple dllist - then I keep asking questions to understand how it can answer the problems of OCaml programmers *in general* (and not one particular problem). But I can't go against the angry mob standing at the doors of ExtLib or I will no longer be an appropriate guardian :) Nicolas |
From: Jesse G. <je...@wi...> - 2004-08-27 21:40:42
|
Nicolas Cannasse wrote: >> > I have no doubt some kind of doubly linked list >> > belongs in Extlib. Exactly what is another issue :) >> >> Well, since you seem to think there is hope for Dllist becoming >> a part of ExtLib (and I'm certainly willing to do any work >> necessary to make it happen), what does our self proclaimed >> benevolent dictator say? > > Well I might be a dictator but I don't have alone CVS write access, so I > might be fired if I perform badly in my dictator job ;) In general, when I > new proposal arise, if I see directly how it can be used it will be put in > ExtLib quickly, that's an administrative privilege to speedup things a > little bit. However when I'm not conviced - for exemple dllist - then I > keep asking questions to understand how it can answer the problems of > OCaml programmers *in general* (and not one particular problem). But I > can't go against the angry mob standing at the doors of ExtLib or I will > no longer be an appropriate guardian :) For what it's worth, I think ExtLib is shaping up to be a really nice collection of software. I wish I could manage to submit something that actually gets committed, but until then I'm quite happy that you all let me discuss my ideas on this list so I can write better code. Thanks! -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: Richard J. <ri...@an...> - 2004-08-27 22:32:34
|
On Fri, Aug 27, 2004 at 11:16:27PM +0200, Nicolas Cannasse wrote: > > > I have no doubt some kind of doubly linked list > > > belongs in Extlib. Exactly what is another issue :) > > > > Well, since you seem to think there is hope for Dllist becoming > > a part of ExtLib (and I'm certainly willing to do any work > > necessary to make it happen), what does our self proclaimed > > benevolent dictator say? >=20 > Well I might be a dictator but I don't have alone CVS write access, so I > might be fired if I perform badly in my dictator job ;) In general, when I > new proposal arise, if I see directly how it can be used it will be put in > ExtLib quickly, that's an administrative privilege to speedup things a > little bit. However when I'm not conviced - for exemple dllist - then I k= eep > asking questions to understand how it can answer the problems of OCaml > programmers *in general* (and not one particular problem). But I can't go > against the angry mob standing at the doors of ExtLib or I will no longer= be > an appropriate guardian :) FWIW I'm also not convinced by dllist. (Although I am convinced by GregorianDate, which I'm using myself ... so take any opinion with a pinch of salt) Rich. --=20 Richard Jones. http://www.annexia.org/ http://www.j-london.com/ Merjis Ltd. http://www.merjis.com/ - improving website return on investment 'There is a joke about American engineers and French engineers. The American team brings a prototype to the French team. The French team's response is: "Well, it works fine in practice; but how will it hold up in theory?"' |
From: Brian H. <bh...@sp...> - 2004-08-27 23:02:19
|
On Fri, 27 Aug 2004, Richard Jones wrote: > FWIW I'm also not convinced by dllist. In defense of dllist, we currently don't have a way to do an ordered lookup table- either imperitively (hashtable + dllist) or functionally. I don't think we need to support the combined data structure, but I think we need to support the individual data structures. My preference would actually be to do both- provide the necessary substrate for both the functional and imperitive versions. The best functional solution I can come up with to do it functionally is a double map solution- a ('key, int * 'data) map to do key lookups, and an (int, 'key * 'data) map to do insertion order lookups. When the insert counter rolls over, you just rebuild the entire data structure to compact the insertion locations. As a side note, I'm grokking Okasaki's paper on random access lists. Note that it only gives O(1) access to elements near one or the other end of the list, and updates (inserts and deletes) and the rest of the finds still take O(log N). One huge advantage of using my data structure- inserting a funcarr of length M anywhere into a funcarr of length N should only take O(log (M+N)). -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian |
From: skaller <sk...@us...> - 2004-08-28 00:05:08
|
On Sat, 2004-08-28 at 09:10, Brian Hurt wrote: > On Fri, 27 Aug 2004, Richard Jones wrote: > In defense of dllist, we currently don't have a way to do an ordered > lookup table > The best functional solution I can come up with to do it functionally is a > double map solution- a ('key, int * 'data) map to do key lookups, and an > (int, 'key * 'data) map to do insertion order lookups. When the insert > counter rolls over, you just rebuild the entire data structure to compact > the insertion locations. This doesn't meet the requirements. In particular, the ordering is fixed to 'append at end' which is not enough. Consider rewriting a sequence of labelled assembler instructions.. (which is effectively what I'm doing). -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: skaller <sk...@us...> - 2004-08-27 22:56:46
|
On Sat, 2004-08-28 at 07:16, Nicolas Cannasse wrote: > Well I might be a dictator but I don't have alone CVS write access, so I > might be fired if I perform badly in my dictator job ;) Ah no -- I *do* have CVS write access. You can fire me because you're the sole administrator. I can't fire you :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-08-28 09:19:17
|
> > Well I might be a dictator but I don't have alone CVS write access, so I > > might be fired if I perform badly in my dictator job ;) > > Ah no -- I *do* have CVS write access. You can fire me > because you're the sole administrator. I can't fire you :) Aaaar That was my trump card, I've been uncovered No I have no other choice but to eliminate you John, sorry ;) Nicolas |
From: Nicolas C. <war...@fr...> - 2004-08-27 21:06:37
|
> > Yeah, I'd be greatly (and pleasantly) surprised if Nicolas wants dllist > > in ExtLib though. > > Huh? Quite a few of us have said we'd like some doubly > linked list facility in Extlib. Extlib was intended > to augment the set of fundamental data structures > (and functions on them) provided by the Ocaml standard library. > > I have no doubt some kind of doubly linked list > belongs in Extlib. Exactly what is another issue :) Well I have not yet been correctly answered to my question "Could someone resume shortly the advantages of using a mutable double linked list against other existing ocaml data structures ? ( I guess it has to be mutable since immutable dllist sounds efficientless)." Brian said : "O(1) removal of an element with a known location as opposed to O(log N)." That means exactly O(1) removal when we have an enumeration pointer. That's exactly the same complexity as a list using a filter on first matched element isn't it ? So I don't yet see the advantages. (BTW Brian it's ok to add IndexedTree to ExtLib if you want). Nicolas |
From: Jesse G. <je...@wi...> - 2004-08-27 21:37:03
|
Nicolas Cannasse wrote: >> > Yeah, I'd be greatly (and pleasantly) surprised if Nicolas wants dllist >> > in ExtLib though. >> >> Huh? Quite a few of us have said we'd like some doubly >> linked list facility in Extlib. Extlib was intended >> to augment the set of fundamental data structures >> (and functions on them) provided by the Ocaml standard library. >> >> I have no doubt some kind of doubly linked list >> belongs in Extlib. Exactly what is another issue :) > > Well I have not yet been correctly answered to my question > > "Could someone resume shortly the advantages of using a mutable double > linked > list against other existing ocaml data structures ? ( I guess it has to be > mutable since immutable dllist sounds efficientless)." > > Brian said : > > "O(1) removal of an element with a known location as opposed to O(log N)." > > That means exactly O(1) removal when we have an enumeration pointer. > That's exactly the same complexity as a list using a filter on first > matched element isn't it ? So I don't yet see the advantages. :) I don't have the energy or the desire to argue. Anyone who wants it can download Dllist from my site. I'll maintain it and develop it further unless a safer alternative is realized. In the meantime, if anyone really thinks it belongs in ExtLib, then please, by all means, argue with Nicolas. I've got other stuff to code. Have fun! -- Jesse Guardiani, Systems Administrator WingNET Internet Services, P.O. Box 2605 // Cleveland, TN 37320-2605 423-559-LINK (v) 423-559-5145 (f) http://www.wingnet.net |
From: skaller <sk...@us...> - 2004-08-27 23:50:39
|
On Sat, 2004-08-28 at 07:07, Nicolas Cannasse wrote: > "Could someone resume shortly the advantages of using a mutable double > linked > list against other existing ocaml data structures ? ( I guess it has to be > mutable since immutable dllist sounds efficientless)." > > Brian said : > > "O(1) removal of an element with a known location as opposed to O(log N)." > > That means exactly O(1) removal when we have an enumeration pointer. That's > exactly the same complexity as a list using a filter on first matched > element isn't it ? So I don't yet see the advantages. I did add some other things in reply to your query which you haven't quoted. Dlists have the following properties: (1) bidirectional iteration (2) keys stable under all mutations (3) keys are zero cost (4) keys are generated by the data structure to support any user defined ordering. No other data structure I know of has these properties. Maps and hashtbls are keyed and stable, but the keys are not zero cost and they have to be supplied by the user. A common trick is to use floating point numbers, since they emulate real numbers which support user defined ordering -- you can always insert between k1 and k2 by using the key (k1 + k2) / 2 which is in between. Of course that doesn't actually work with floats :) As to using lazy functional techniques -- well, if you do enough mutations you end up stacking up a lot of filters and folds and things and the whole list goes thru all of them each element -- the mutation is O(1) and once off. Furthermore you can't use a normal filter to do a deletion since the only way to identify a particular node is to match on physical node address -- a traditional filter, map, or fold does not present that information to the client function. Furthermore an 'Enum' style representation could never do that anyhow, since it relies on being able to change the sequencing infrastructure. Enums are forward iterators over the *values* in the list but dlist nodes are actually pairs where the first element is the node address (machine synthesised unique key where the client insertion defines the key ordering dynamically), and Enums don't support that. Or reverse iteration. That is -- dlinked lists have quite unique and useful properties. Of course, they aren't functional -- but then neither are Hashtbls :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-08-28 10:38:40
|
> I did add some other things in reply to your query which you > haven't quoted. > > Dlists have the following properties: > > (1) bidirectional iteration > (2) keys stable under all mutations > (3) keys are zero cost > (4) keys are generated by the data structure to support > any user defined ordering. I don't agree with your definition of a node being a key to the item. Several reasons for that : - a node does not describe any interesting property of the item, it is just a pointer to the item. You cannot by any way produce a node by your own means, and only a node returned by add is valid for a given dllist. - nodes are not ordered, comparing them will just blow the stack since they're recursive. You can check for equality with == and != operators but nothing else. - node adresses are not stable since they can be moved by the GC. Taken apart advantages of keys, you get only bidrectional iteration, which is IMHO not interesting enough to justify adding a new data structure to ExtLib. Nicolas |
From: skaller <sk...@us...> - 2004-08-28 14:56:18
|
On Sat, 2004-08-28 at 20:39, Nicolas Cannasse wrote: > > I did add some other things in reply to your query which you > > haven't quoted. > > > > Dlists have the following properties: > > > > (1) bidirectional iteration > > (2) keys stable under all mutations > > (3) keys are zero cost > > (4) keys are generated by the data structure to support > > any user defined ordering. > > I don't agree with your definition of a node being a key to the item. Its a definition, your agreement is not required :) > - nodes are not ordered, Yes they are. The ordering isn't total over node soup. However for two nodes in a single linear list ordering is total. For a circular list, you can say a node is between two others in a given direction. > - node adresses are not stable since they can be moved by the GC. Irrelevant -- by address I didn't mean machine address. That was just a way to distingush, in C++ terminology, the node itself and a pointer to it. Such a pointer is perfectly stable, even with GC compaction: the difficulty expressing this for Ocaml is that a node 'pointer' is represented in Ocaml by the Ocaml term for the node -- since Ocaml automatically boxes things. > Taken apart advantages of keys, you get only bidrectional iteration, which > is IMHO not interesting enough to justify adding a new data structure to > ExtLib. You are simply not understanding the way I expressed the property in terms of keys. I'm sorry its my fault not expressing this well, but the property I am trying to explain actually exists. If you have some set of values V, and you wish to *impose* an ordering on them, you can do that by storing them in an array, for example. You can then lookup the values and get the index, and compare the indexes. I'm sure you understand that. The location in the array is a key. It isn't represented *explicitly* in the array. In fact the location in the array can be represented by a pointer to the array plus an integer OR a pointer to the array plus a pointer to the element. As you know in C: a[i] = *(a + i) expresses this idea [a+i is a pointer to an element] However for arrays, you can't modify the ordering in a flexible way in O(1) -- and you can't do it at all, without modifying some pointesr/indexes into the array -- if you blit elements after position n down, to insert a new element, every single index j >=n needs to have 1 added to it to continue to refer to the same element. I'm sure you agree, all this is obvious and well known properties of arrays (even if I'm not expressing it well). The key thing about dlists is that you can do the insertions and deletions into the ordering relation WITHOUT modifying any external cursors -- nothing moves when you add or delete an element, they're just relinked. That's what I mean by stable. Iterators, pointers, cursors, or keys -- whatever you want to call them -- into dlists are stable under ordering modifications. I used the word key to create an analogy with keyed data structures like Map. Using the functions (which don't actually exist in Ocaml, but that's not important): get_next_key data_structure key get_previous_key data_structure key you have stable keys and bidirectional iteration. But there is a problem -- to do insertion, you need to create a key between two others -- and that isn't always possible. It depends on the keys. If you use an Ocaml Map with the above functions, you can always do it with a string, since you can just add an extra character to the end if you run out of space -- I'm assuming the usual lexicographical comparison. But this is very expensive. For dlists, you can't and don't need to create your own keys -- when you insert between adjacent nodes n1 and n2 you automatically get returned a node n3 in between them -- and the cost is O(1) (and also VERY fast). By the way -- using Ocaml Map with the above next/previous functions is SUPERIOR in many ways to using a dlist. That data structure allows persistence/functional behaviour -- insertions create a new map without changing the old one. However, the operations are O(log N) and you need a lot more storage since you have to represent the keys and also the technique requires that the user choose and manage the keys to define the desired ordering between values. Simply defining the ordering by linking adjacent elements is much faster and more efficient -- at the cost of losing functional/persistent behaviour. ---------------------------------------------------- In my application -- lists of instructions -- the simple singly linked lists are quite clumbsy and hard to manage sometimes. you simply cannot afford to scan and rebuild the list multiple times to make a modification, and then do it all again to do the next one. To avoid this I try to do multiple modifications in each pass. Unfortunately the modifications are coupled... anyhow, it is possible that dlist would be better in this kind of case, where you can make a change and backtrack a bit, rescan to ensure some invariant, possibly making another change .. then proceed: it may be tricky, but with a functional technique it could be almost impossible to do efficiently. I know I'm not describing this well, but you can look at the Felix optimiser source code if you want to see the kind of thing I'm doing. Here's an example. To inline a procedure I replace a "call f" instruction with the body of f. If f has arguments, I have to create a new variable, assign the argument to it, and replace every use of the parameter in the body with a reference to the variable. I also have to replace every 'return' instruction with a 'goto end_f' instruction, and then make sure to insert the label 'end_f' at the end. Having done all that, I have inlined the procedure BUT there is at least one possible inefficiency: a return at the end of the function will be replaced by a 'goto end_f' followed by the label 'end_f'. Ok, so I can delete the goto in that case. But we're not finished. If that was the only return statement in the function, then the end label isn't needed either, so we can delete that too. There is a similar issue with tail-rec optimisation. I add a label at the start of the function to jump to -- before I know if there are actually any tail-rec calls. [I have to do that to avoid yet another functional pass over the list] Its even worse: when I inline a function, that function might contain a tail call to its caller. This call is tail-rec, but I can't optimise it immediately because you can't 'goto' out of a function. But after inlining -- THEN I can do the optimisation. So I have to rescan the inlined code looking for possible tail-rec optimisations. I have to do that BEFORE the function I'm inlining into is itself inlined into another function, otherwise the call will just go to the original function -- once a function body is inlined into another its identity is lost. Oh yeah -- this is just the coupling between two optimistions (inlining + tail-rec elimination). Of course I assumed i could detect a tail call easily -- but it isn't that easy. My instructions allow nops and comments. This is a tail call: fred: return; call f a;// **** tail call // a comment goto fred; so even detecting a tail call is non-trivial: in fact I even take conditional branches into account. Now I would *much* prefer a couple of simple functional passes to do all the work. I might be able to do that IF I had a nice clean mathematically simple representation and rewrite rules, perhaps similar to lambda calculus style term rewriting.. but I don't have such a simple representation and I'm implementing 'ad hoc' optimisations. So -- do i need dlists for these lists of instructions? I don't know. Actually I need to find labels fast, and identify sequential blocks, and a few other things, so I probably could use something even more sophisticated. So the argument for dlists here is -- unless they're available I can't even try them out. Like you, I'm inclined to avoid mutable data structures. I just don't know what I should use -- but at the moment slists are the only viable choice. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-08-28 16:28:50
|
> > Taken apart advantages of keys, you get only bidrectional iteration, which > > is IMHO not interesting enough to justify adding a new data structure to > > ExtLib. [snip on arrays] I think you should look at Brian proposal of IndexedTree (functionnal array). It provides exactly that : a low cost insertion at any part of the array. But any insertion/deletion will of course invalid the keys of some parts of the array items. That's not exactly what you need here... [snip on felix compilation] I understand correctly your problem, because I got same when writing compilers. It's often that you need several passes in order to optimize some code or simply for example when writing a forward jump of N instructions into a forward jump to address XX. But I don't think dllist are the best solution here. As you said, you need some rewriting rules, so what about : instead of : type opcode type code = opcode array type key = int (* index in the array *) use : type opcode type code = | Op of opcode | Array of code array type key = array * int (* pointer in the tree *) you can then translate any part of your code into not only a single opcode but into an array of opcodes, and they will remain correctly ordered. You need a tree structure to perform rewriting rules correctly and efficiently , a dllist for this kind of problem is just an inconvenient way of handling flat trees in a mutable fashion. Nicolas |
From: skaller <sk...@us...> - 2004-08-28 17:45:53
|
On Sun, 2004-08-29 at 02:29, Nicolas Cannasse wrote: > type opcode > type code = opcode array > type key = int (* index in the array *) > > use : > > type opcode > type code = > | Op of opcode > | Array of code array > type key = array * int (* pointer in the tree *) Yeah I've thought of that, but: The problem is it makes pattern detection hard. For example to detect the sequence A B with a slist you just recursively seek A :: (B :: _ ) as tail -> .. If two instructions could be next to each other OR one is Op A and the second the first element of Array [| Op B; .. |] or perhaps Array [| Array [| .. so I'd need to flatten this to do that kind of check, and then I'd lose the location even if I did find the pattern.. However for some things like control flow this kind of representation might be useful (where you are looking for entry an exit points, and sequential code isn't interesting) .. except then I think you need a graph :) > you can then translate any part of your code into not only a single opcode > but into an array of opcodes, and they will remain correctly ordered. You > need a tree structure to perform rewriting rules correctly and efficiently , yes but the problem is that the tree you need for each rule is different :) In particular rewriting isn't always one-to-many. In fact the only description I can think of is that its spagetti to spagetti :) However what you suggest is potentially more stable than a singly linked list (eg replacing an instruction with many leaves labels in the same position). The actual instruction set is currently: and bexe_t = [ | `BEXE_label of range_srcref * string | `BEXE_comment of range_srcref * string | `BEXE_goto of range_srcref * string | `BEXE_ifgoto of range_srcref * tbexpr_t * string | `BEXE_ifnotgoto of range_srcref * tbexpr_t * string | `BEXE_call of range_srcref * tbexpr_t * tbexpr_t | `BEXE_call_direct of range_srcref * bid_t * btypecode_t list * tbexpr_t | `BEXE_call_stack of range_srcref * bid_t * btypecode_t list * tbexpr_t | `BEXE_call_prim of range_srcref * bid_t * btypecode_t list * tbexpr_t | `BEXE_jump of range_srcref * tbexpr_t * tbexpr_t | `BEXE_loop of range_srcref * int * tbexpr_t | `BEXE_read of range_srcref * bid_t | `BEXE_fun_return of range_srcref * tbexpr_t | `BEXE_proc_return of range_srcref | `BEXE_nop of range_srcref * string | `BEXE_code of range_srcref * string | `BEXE_assign of range_srcref * tbexpr_t * tbexpr_t | `BEXE_init of range_srcref * bid_t * tbexpr_t | `BEXE_regmatch of range_srcref * tbexpr_t * regular_args_t | `BEXE_reglex of range_srcref * tbexpr_t * tbexpr_t * regular_args_t ] range_srcref is a section of original source to print when there's an error. tbexpr is a typed, bound, expression term. bid_t is just an int, which is used to find a procedure in a hashtable. regular_args_t specifies a finite state machine. The representation of labels as strings requires lookup every time which sucks. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-08-28 18:27:16
|
> > type opcode > > type code = > > | Op of opcode > > | Array of code array > > type key = array * int (* pointer in the tree *) > > Yeah I've thought of that, but: > > The problem is it makes pattern detection hard. > For example to detect the sequence A B with a slist > you just recursively seek > > A :: (B :: _ ) as tail -> .. That's true. You might want to keep some parts of your geneate code flat but that's something you can anticipate. First apply as much as possible of rewriting rules, then flatten, then loop using you reach a fixpoint. You can also minimize the number of allocations by implementing a two-steps flatten algorithm : first recursivly calculate the size of the array needed then alloc it and recursively fill it. %Jesse: Back to the subject I have thought a little by myself and I think now that DLList could be added to ExtLib, if it replaces the RefList module. I think almost nobody use it, so why not replace the ref-list by a mutable list, and why not a double linked list, that is my thinking. Which of Jesse or Brian will take over the DLList maintenance in ExtLib ? If Brian then you can commit it (as well as IndexedTree as discussed before) , if Jesse then please send me your SF account so I can add you to the developers list. Nicolas |
From: Bardur A. <oca...@sc...> - 2004-08-28 18:32:13
|
On Sat, Aug 28, 2004 at 08:28:05PM +0200, Nicolas Cannasse wrote: > > > type opcode > > > type code = > > > | Op of opcode > > > | Array of code array > > > type key = array * int (* pointer in the tree *) > > > > Yeah I've thought of that, but: > > > > The problem is it makes pattern detection hard. > > For example to detect the sequence A B with a slist > > you just recursively seek > > > > A :: (B :: _ ) as tail -> .. > > That's true. You might want to keep some parts of your geneate code flat but > that's something you can anticipate. First apply as much as possible of > rewriting rules, then flatten, then loop using you reach a fixpoint. You can > also minimize the number of allocations by implementing a two-steps flatten > algorithm : first recursivly calculate the size of the array needed then > alloc it and recursively fill it. > > %Jesse: > Back to the subject I have thought a little by myself and I think now that > DLList could be added to ExtLib, if it replaces the RefList module. I think OptParse uses RefList. -- Bardur Arantsson <ba...@im...> <ba...@sc...> Sufficiently advanced magic is indistinguishable from technology. Terry Pratchett | The Science of Discworld |