[Pyparsing] pyparsing, AST and named results


Hi all,

I sent the following letters to Paul 20 days ago, but unfortunately
didn't receive a reply. :( So I'm forwarding them to this list; maybe
someone else will be interested in the patch below...

---------- Forwarded message ----------
From: Alexey Borzenkov <sn...@gm...>
Date: Tue, Aug 4, 2009 at 8:43 PM
Subject: pyparsing, AST and named results
To: Paul McGuire <pt...@us...>

Hi Paul,

I've been using pyparsing to generate a simple AST, and among other
things I had an AST.Body class for a sequence of statements. The problem
was that as soon as I implemented __len__, __iter__ and __getitem__ on
AST.Body I started having weird problems (all statements but the first
one disappearing from compiled code), which I traced to this patch:

[...older patch snipped...]

Basically, because I implemented __getitem__, and because I was using
named parameters, only the first element (the first statement) was
getting assigned under that name. This is a very quick (and possibly
incomplete) fix, but it worked in my case.
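
To illustrate the failure mode outside pyparsing, here is a minimal
sketch. It assumes the name-assignment code indexes the token with [0]
and falls back to the whole object only when indexing fails, as the
try/except in the diff further down does. Body, Opaque and store_named
are hypothetical stand-ins, not pyparsing code.

```python
class Body(object):
    """Hypothetical AST node holding a sequence of statements."""

    def __init__(self, statements):
        self.statements = list(statements)

    def __len__(self):
        return len(self.statements)

    def __iter__(self):
        return iter(self.statements)

    def __getitem__(self, index):
        return self.statements[index]


class Opaque(object):
    """Hypothetical AST node without any sequence methods."""


def store_named(toklist):
    # Assumed pattern: try to take the first element, and keep the
    # whole object only if indexing raises.
    try:
        return toklist[0]
    except (KeyError, TypeError, IndexError):
        return toklist


body = Body(["stmt1", "stmt2", "stmt3"])
opaque = Opaque()

stored_body = store_named(body)      # indexing succeeds: only "stmt1" is kept
stored_opaque = store_named(opaque)  # indexing raises TypeError: object kept whole

print(repr(stored_body))
print(stored_opaque is opaque)
```

Once the node supports indexing, the fallback is never reached, which
would explain why only the first statement survives under the name.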

[...other irrelevant info snipped...]

Thanks,
Alexey.

---------- Forwarded message ----------
From: Alexey Borzenkov <sn...@gm...>
Date: Wed, Aug 5, 2009 at 2:31 PM
Subject: Re: pyparsing, AST and named results
To: Paul McGuire <pt...@us...>


Hi Paul,

It's me again. I've been looking at ParseResults some more, and I wonder
whether the intention under "if name:" is to not assign empty results
under a name. As I see it, the code doesn't check for empty
ParseResults, and I'm wondering if that's intentional or not. If empty
ParseResults don't have any more special meaning than empty lists,
then perhaps it could be patched this way:

diff --git a/src/pyparsing.py b/src/pyparsing.py
index 57e938a..8dbdec1 100644
--- a/src/pyparsing.py
+++ b/src/pyparsing.py
@@ -277,14 +277,15 @@ class ParseResults(object):
    # constructor as small and fast as possible
    def __init__( self, toklist, name=None, asList=True, modal=True ):
        if self.__doinit:
+            if isinstance(toklist, list):
+                toklist = toklist[:]
+            else:
+                toklist = [toklist]
            self.__doinit = False
            self.__name = None
            self.__parent = None
            self.__accumNames = {}
-            if isinstance(toklist, list):
-                self.__toklist = toklist[:]
-            else:
-                self.__toklist = [toklist]
+            self.__toklist = toklist
            self.__tokdict = dict()

        if name:
@@ -293,9 +294,7 @@ class ParseResults(object):
            if isinstance(name,int):
                name = _ustr(name) # will always return a str, but use _ustr for consistency
            self.__name = name
-            if not toklist in (None,'',[]):
-                if isinstance(toklist,basestring):
-                    toklist = [ toklist ]
+            if toklist and toklist[0] != '':
                if asList:
                    if isinstance(toklist,ParseResults):
                        self[name] = _ParseResultsWithOffset(toklist.copy(),0)
@@ -303,10 +302,7 @@ class ParseResults(object):
                        self[name] = _ParseResultsWithOffset(ParseResults(toklist[0]),0)
                    self[name].__name = name
                else:
-                    try:
-                        self[name] = toklist[0]
-                    except (KeyError,TypeError,IndexError):
-                        self[name] = toklist
+                    self[name] = toklist[0]

    def __getitem__( self, i ):
        if isinstance( i, (int,slice) ):

The way I see it, it might even improve performance a little: toklist
is now always a ParseResults or a list (just like in __toklist), and I
checked that toklist should never be None. That leaves only one
comparison (with '') instead of three, and no unnecessary indexing or
try/except. The question is, will it break anything?
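
A standalone sketch of the two emptiness checks may help when comparing
them on plain values. It does not involve ParseResults at all, and the
function names old_should_store and new_should_store are hypothetical;
only the conditions themselves are taken from the diff above.

```python
def old_should_store(toklist):
    # Check removed by the patch: the raw argument is compared
    # against None, '' and [].
    return not toklist in (None, '', [])


def new_should_store(toklist):
    # Patched flow: normalize to a list first, then one truthiness
    # test plus one comparison with ''.
    if isinstance(toklist, list):
        toklist = toklist[:]
    else:
        toklist = [toklist]
    return bool(toklist and toklist[0] != '')


samples = ['', [], 'a', ['a'], None, ['']]
results = [(old_should_store(s), new_should_store(s)) for s in samples]

for sample, (old, new) in zip(samples, results):
    print("%r: old=%s new=%s" % (sample, old, new))
```

On these plain inputs the two checks agree for '', [], 'a' and ['a'].
They differ for None (only a concern if toklist can be None after all)
and for [''] (a list holding a single empty string, which the old check
stored and the new one skips).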

I can't run all of the unit tests (the examples are missing in svn, and
not everything is in the 1.5.2 release), but my changes don't make any
of the passing ones fail.

Also, regarding empty ParseResults stored under a name, the only case I
could come up with is something like this:

from pyparsing import *

s = "()"
g = (Suppress('(') + ZeroOrMore('.') + Suppress(')'))('dots') + StringEnd()
print(repr(g.parseString(s).dots))

With my changes dots will now disappear, but that seems more consistent
to me: ZeroOrMore('.')('dots') would not appear under the name either,
so why should a bunch of suppressed tokens make a difference?

Thanks,
Alexey.