FMSLogo / Feature Requests / #94 Faster property lists

David Costanzo - 2013-04-06

I would like some clarifation on this feature request. This is actually two requests, correct? One to speed up performance and one to generalize what can be a property name.

Property lists, as their name implies, are lists. And PLIST really does just return a copy of the list. Lists have O(n) lookup and O(1) insertion. The API certainly supports faster lookup, but doing this without changing the behavior of PLIST would be tricky. But I wonder if what you really would prefer is something like a dictionary type which could be stored in a variable. Or more generally an "object" the way JavaScript or Python think about objects. Of course, non-word property names on objects would be very strange, so if we were going to make Logo more object-oriented, two separate data types for the two feature requests might be better.

As for using lists as property names, what are your expectations? That this list be converted to a word before lookup, or that lists be valid property names? For example is [abc] the same as "abc? For this, I'd defer to how LISP defines property lists to ensure I don't extend FMSLogo in a way that is incongruent with Logo's design philosophy. LISP supports non-word property names, and property names are compared using EQL. This isn't quite the same as Logo's EQUALP, but I think it would project into Logo as EQUALP. Would that fit your expectations? That "abc and [abc] are different, but that [a b c] and [a b c] are the same?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- DanielAjoy - 2013-04-08
  
  I would like some clarifation on this feature request. This is actually two requests, correct? One to speed up performance and one to generalize what can be a property name.
  
  yes.
  
  Property lists, as their name implies, are lists. And PLIST really does just return a copy of the list. Lists have O(n) lookup and O(1) insertion. The API certainly supports faster lookup, but doing this without changing the behavior of PLIST would be tricky. But I wonder if what you really would prefer is something like a dictionary type which could be stored in a variable.
  
  I would like something that behaves like lists. That can be stored in variables, that can be used locally, that can be passed around as inputs to other procedures.
  
  Or more generally an "object" the way JavaScript or Python think about objects. Of course, non-word property names on objects would be very strange, so if we were going to make Logo more object-oriented, two separate data types for the two feature requests might be better.
  
  I'm thinking what I'm looking for could be implemented using lists in this way:
  
  http://groups.yahoo.com/group/LogoForum/message/16820
  
  As for using lists as property names, what are your expectations? That this list be converted to a word before lookup, or that lists be valid property names? For example is [abc] the same as "abc?
  
  no, that there wouldn't be a conversion.
  
  For this, I'd defer to how LISP defines property lists to ensure I don't extend FMSLogo in a way that is incongruent with Logo's design philosophy. LISP supports non-word property names, and property names are compared using EQL. This isn't quite the same as Logo's EQUALP, but I think it would project into Logo as EQUALP. Would that fit your expectations?
  
  Yes. I want to be able to do things like this:
  
  make "values.of.cells [ [[a 1] 10] [[b 2] 20] [[c 3] 30] [[c 1] 31] ] show equipara: 0 :values.of.cells [a 1] 10 show equipara: 0 :values.of.cells [a 2] 0 show equipara: 0 :values.of.cells [b 2] 20 show equipara: 0 :values.of.cells [b 3] 0 foreach [a b c] [[col] foreach [1 2 3] [[row] type equipara: 0 :values.of.cells list :col :row type char 32] (print) ] 10 0 0 0 20 0 31 0 30
  
  That "abc and [abc] are different, but that [a b c] and [a b c] are the same?
  
  yes.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

DanielAjoy - 2013-04-08

https://en.wikipedia.org/wiki/AA_tree has two C language implementations of that kind of trees. I'm sure an internal implementation would be faster than an implementation made as a Logo library. The AA tree algorithm does require a way to give order to keys, but lists can be ordered by the order of their elements:

compare the first items of the lists.

if they are the same compare the rest of the list.

if one of the lists is empty and the other one is not empty then the non-empty lists is greater than the empty one.

if the first items of the lists are lists themselves apply this procedure recursively.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

DanielAjoy - 2013-04-09

It is only tangentially relevant that traversing such a tree would give us the associative array sorted by key:

http://en.wikipedia.org/wiki/Tree_sort

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-04-11

I think using non-word keys is straight-foward enough that it doesn't need to be discussed anymore. The strategy you describe for lists will work fine.

As for the performance boost, from the information you've provided, I'm still not sure which is the best approach:
(1) Enhancing PLIST
(2) Creating a new data type
(3) Extending GPROP to accept a list for the propery name input

To elaborate on (3) a little more, basically, if you gave GPROP a list with an even number of elements for its first input, it would treat it as a PLIST and return the answer. The first time it was invoked for any given list, it internally build an index for that list so future lookups on the exact same list could be made in O(Log N)--possibly faster. In this case, the list would still be a list, so it could be used as such.

This would be a weird, hidden feature of GPROP; you wouldn't be able to substitute the plist name with REMPROP or PPROP. Alternatively, if this lack of orthogonality is disturbing, I could add a new primitive called DICTIONARYGETITEM, instead of overloading GPROP. This essentially creates a new data type, without having to invent a new syntax for the type or figure out how all of the old procedures should act when giving the new data type.

I would like something that behaves like lists...

Based on the example you gave, I'm not sure this is necessary for you. For example, SENTENCE works on lists, but doesn't make much sense on a PLIST. The choices in a FOREACH work on lists, but, again, don't make much sense for a PLIST. In Logo, lists are immutable, which doesn't make sense if you want to add or remove items. If behaving like a list is a hard requirement, then only (3) is a valid solution.

...that can be stored in variables, that can be used locally, that can be passed around as inputs to other procedures.

Putting the name of a PLIST in a variable does almost all of this. The only think it can't do is automatically erase the PLIST when it goes out of scope, like a LOCAL variable can. If you want dynamic scoping on these associative arrays, then (1) won't work, but (2) or (3) would.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

DanielAjoy - 2013-04-11

I think using non-word keys is straight-foward enough that it doesn't need to be discussed anymore. The strategy you describe for lists will work fine.

As for the performance boost, from the information you've provided, I'm still not sure which is the best approach:
(1) Enhancing PLIST
(2) Creating a new data type
(3) Extending GPROP to accept a list for the propery name input

To elaborate on (3) a little more, basically, if you gave GPROP a list with an even number of elements for its first input, it would treat it as a PLIST and return the answer. The first time it was invoked for any given list, it internally build an index for that list so future lookups on the exact same list could be made in O(Log N)--possibly faster. In this case, the list would still be a list, so it could be used as such.

This would be a weird, hidden feature of GPROP; you wouldn't be able to substitute the plist name with REMPROP or PPROP. Alternatively, if this lack of orthogonality is disturbing, I could add a new primitive called DICTIONARYGETITEM, instead of overloading GPROP.

I think, being able to add and remove items once the PLIST has been constructed is important: for fast testing if an element is still in the dictionary. Often dictionaries are used as sets which change member items frequently, in games for example: where enemies come and go, or where walls are added or are destroyed through the game.

I would like something that behaves like lists...

Oh, there what I meant was that I wanted the dictionaries to be first class things like lists are. I wasn't talking about LIST-immutability or making dictionaries work with primitives that work on lists. I'm happy with dictionary being understood only by dictionary primitives, and one or two primitives that convert dictionaries to lists and vise-versa.

Based on the example you gave, I'm not sure this is necessary for you. For example, SENTENCE works on lists, but doesn't make much sense on a PLIST. The choices in a FOREACH work on lists, but, again, don't make much sense for a PLIST. In Logo, lists are immutable, which doesn't make sense if you want to add or remove items. If behaving like a list is a hard requirement, then only (3) is a valid solution.

so I'm not worried about the points mentioned in this paragraph above.

...that can be stored in variables, that can be used locally, that can be passed around as inputs to other procedures.

Putting the name of a PLIST in a variable does almost all of this.

I realize this, and I think I can accept that.

The only think it can't do is automatically erase the PLIST when it goes out of scope, like a LOCAL variable can.

I think I can come to grips with that. I imagine that I could still create a wrapper around it that does something like:

with.plist "mydict [ pprop "mydict [a 1] 10 pprop "mydict [b 2] 20 pprop "mydict [c 3] 30 pprop "mydict [c 1] 31 show plist "mydict ; would show ; [[a 1] 10 [b 2] 20 [c 3] 30 [c 1] 31] ] show plist "mydict ; would show ; [] because with.plist would call this line at the end. erase [[] [] [mydict]]

If you want dynamic scoping on these associative arrays, then (1) won't work, but (2) or (3) would.

I find this a strong argument in favor of 2.

A big feature of LogoFE is that computations can be made without naming data structures; that they are just passed around as inputs and outputs.

With (1) I would have to make an exception to this, because I would have to name the PLISTS. Though I could probably use GENSYM to avoid having to name them manually...

I also understand that with (1) I would still have dynamic scoping of the PLIST names for example:

to work.with.a.plist localmake "plistname gensym pprop :plistname "a 1 print [in work.with.a.plist] show plist :plistname work.with.another.plist print [back in work.with.a.plist] show plist :plistname end to work.with.another.plist ;using the same variable "plistname" localmake "plistname gensym pprop :plistname "a 1000 print [in work.with.another.plist] show plist :plistname end work.with.a.plist in work.with.a.plist [a 1] in work.with.another.plist [a 1000] back in work.with.a.plist [a 1]

a. I like (2) because I could actually pass the structure around, as inputs and outputs.

b. I like (2) because I could potentially control which NOT.FOUND value the structure would return (This would be a feature I would like, did you notice that EQUIPARA asks for a NOT.FOUND value)

c. I like (2) because as you said it wouldn't cause backwards compatibility problems, like (1) might.

d. I don't like (2) because I think you are saying that your idea implementation of (2) doesn't provide equivalents to PPROP and REMOVEP.

e. I don't like (2) because the functionality of PLISTS and (2) seems to be overlapping in places. Instead of fixing PLISTS we would be adding something else as a patch. This is a strong argument for me. I know it is a aesthetic one; but, in my own programming, I'm willing to sacrifice things like (c) in favor of having just one thing.

Last edit: DanielAjoy 2013-04-11
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-04-13

d. I don't like (2) because I think you are saying that your idea implementation of (2) doesn't provide equivalents to PPROP and REMOVEP.

I didn't explain this clearly. This restriction applies to (3)--the clever way of making PLIST faster in limited circumstances. A new datatype would have those procedures for getting and settings members.

e. I don't like (2) because the functionality of PLISTS and (2) seems to be overlapping in places

That is annoying, but LISP has the same non-orthogonality (assoc lists, plists, objects, and hash tables). That doesn't make it good, but it does mean that whomever extended Logo's predecesor over the years decided that dictionaries were important enough to break othogonality. We shouldn't let an existing bad design prevent us from making improvements. The thing I am concerned about is implementing a "dictionary" type and then later adding an "object" type, which does essentially the same thing and more. I haven't looked how UCBLogo added objects, but I'd want to do that before adding a new data structure.

Overally, I'm leaning toward (2), but it's a lot of work and very disruptive. It requires updating (or at least testing) all existing procedures to make sure they behave correctly when given the new data type for each input, updating the manual for these producures (or at least reviewing) to say what does or doesn't happen if given a dictionary, defining a new syntax for literals, updating all of the workspace management routines, and redefining a "contents list" to have four parts.

That said, I don't want to lose the small part of this request (generalizing "property name" to include lists) because of the large part (adding a new data type). Therefore, I'm going to do that first and release it. It might be that simply using a primitive for lookup will be fast enough, even if it is linear. Algorithmically, this shouldn't make things faster, but Logo has a way of hiding additional computational complexity, so a linear lookup in a primitive might be faster that what looks like a linear lookup in a library routine.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

DanielAjoy - 2013-04-13

That said, I don't want to lose the small part of this request (generalizing "property name" to include lists)

Isn't this the part that could potentially cause backwards compatibility problems?

because of the large part

To me, the large part is, primary, the algorithmic speed of retrieval of elements of associative arrays in FMSLogo.

Have you found a roadblock in replacing the internal representation of PLISTS with something like

https://en.wikipedia.org/wiki/AA_tree

?

(I'm trying to get a hint as to why you think this is not a good solution.)

Maybe I'm too naive but wouldn't it be fairly easy to replace the internal representation of PLISTS as lists to being stored in a balanced tree?

The interface of property lists would stay the same:

PPROP would add an item to the balanced tree.

GPROP would get an item from the balanced tree.

REMPROP would delete a node from the balanced tree.

PLIST would traverse the balanced tree and return a list.

The only change would be that GPROP and REMPROP would be faster for moderate size data.

Last edit: DanielAjoy 2013-04-13

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-04-13

That said, I don't want to lose the small part of this request (generalizing "property name" to include lists)

Isn't this the part that could potentially cause backwards compatibility problems?

No, I think this is safe because it's changing something that was previously an error to something that would now be legal. I think it's unlikely that there are any Logo programs which depends on PPROP throwing a "doesn't like" error when given a non-word propery name.

(I'm trying to get a hint as to why you think this is not a good solution.)

It's because you wrote that you wanted to pass these dictionaries around in variables and you can't do that with PLIST. If this is not a requirement, then PLIST could use a self-balancing tree or hash table. There is a small compatibility risk, because as GPROP and REMPROP get faster in general, PPROP will get slower and both GPROP and REMPROP will be slower in specific cases--when you get/remove the most recently added items. Likewise, unless I do a lot more work, PLIST will return the items in a different order than before. Of course, nothing about the order or time complexity was specified, but that doesn't mean that there aren't MSWLogo programs that depend on this stuff.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

DanielAjoy - 2013-04-13

It's because you wrote that you wanted to pass these dictionaries around in variables and you can't do that with PLIST.

It would be a nice thing to have. But considering that Property lists are already part of Logo and that improving them seems the least disruptive alternative to having faster associative arrays in Logo, I'm willing to work around not having the ability to pass dictionaries around (working around this problem by using things like with.plist mentioned above and GENSYM).

Do you think mentioning this conversation to the people from the LogoForum could help bring more clarity to what the needs of others using FMSLogo are, in regards to property lists?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-04-13

Do you think mentioning this conversation to the people from the LogoForum could help bring more clarity to what the needs of others using FMSLogo are, in regards to property lists?

Yeah, if you want to mediate a wider discussion, that'd be great. Personally, I trust your insticts, both as a Logo programmer and as an ambassador for FMSLogo. I won't get around to implementing this soon, so there's plenty of time to refine a plan-of-action based on feedback.

updating all of the workspace management routines, and redefining a "contents list" to have four parts.

...just to mention, my thinking here was unfounded. A new data type would, of course, be included in the workspace contents list as variables, since that would be the main motivation for using them.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-05-05

I have checked in support for list and array property names. This is the small part of this request.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- DanielAjoy - 2013-05-06
  
  I tested this
  
  pprop "a [1 2] [one] gprop [1 2] not enough inputs to gprop gprop "a [1 2] You don't say what to do with [one] pprop "a [1 [2]] [one] gprop "a [1 2] You don't say what to do with [one] pprop "a [1 [2]] [two] gprop "a [1 [2]] You don't say what to do with [two] plist "a You don't say what to do with [[1 [2]] [two] [1 2] [one]] POPLS Pprop "a [1 [2]] [two] Pprop "a [1 2] [one] popl "a Pprop "a [1 [2]] [two] Pprop "a [1 2] [one]
  
  and everything worked as expected.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-11-29

I have re-implemented property lists as AVL trees. Unfortunately, due to the coupling between property lists and FMSLogo's memory manager, I could not use someone else's fully-debugged library. I'm not completely content with the way I implemented it, but I don't expect to fiddle with it anymore.

The order of the list which PLIST outputs is different than before, but that's the only breaking change that I'm aware of.

This will be available in FMSLogo 6.32.0, which has no expected release date. In the meantime, you can test a developer release, which I have attached to this ticket (it's the Spanish wxwidget-based version).

fmslogo-1034.exe

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2013-11-29

status: open --> pending

assigned_to: David Costanzo
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2015-02-07

Changing resolution to CLOSED. Both requests have been implemented.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

David Costanzo - 2015-02-07

status: pending --> closed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Faster property lists

A Logo programming environment for Microsoft Windows

Group

Searches

Help

#94 Faster property lists

Discussion