Re: [Pyparsing] Order problem in binary operation with operationPrecedence()


Hello, Paul!

First of all, thank you very much for your help. :)

Paul said:
> Gustavo -
>
> The first issue you have is your definition of identifier0, which you
> define using the Unicode regex "\w", but which cannot start with a numeric
> digit. Unfortunately, there are a lot more numeric digits in Unicode than
> just "0" through "9" - over 300 of them!  Here is how I replaced your
> identifier0 expression, and got your code to at least start working:
>
>     unidigit = Regex(u"[" + "".join(unichr(c) for c in xrange(0x10000)
>                      if unichr(c).isdigit()) + "]", re.UNICODE)
>     identifier0 = Combine(~unidigit + Regex("[\w]+", re.UNICODE))

Thanks, I've just replaced my solution (which used a parse action) with this.
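
For anyone following along in the archive, the same rule (word characters only, no leading Unicode digit) can be sketched with just the standard library. This is a rough illustration in Python 3 syntax, not what my code does; the names `starts_with_digit`, `is_identifier0` and `bmp_digits` are made up for this example.

```python
import re

# Hypothetical helper names, for illustration only (Python 3 syntax).
# In Python 3, str patterns are Unicode-aware by default, so \w matches
# letters and digits from any script, plus the underscore.
WORD = re.compile(r"\w+")


def starts_with_digit(text):
    """True if the first character is any Unicode digit, not just 0-9."""
    return bool(text) and text[0].isdigit()


def is_identifier0(text):
    """True if text is all word characters and does not start with a digit."""
    return WORD.fullmatch(text) is not None and not starts_with_digit(text)


# Count the digit characters in the Basic Multilingual Plane, the same
# range the quoted xrange(0x10000) loop covers.
bmp_digits = sum(1 for c in range(0x10000) if chr(c).isdigit())
```

Printing `bmp_digits` shows how many characters a plain "0" to "9" check would miss on a given Python build.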

> Some other comments:
>
> 1.
>
>     eq = CaselessLiteral("==")
>     ne = CaselessLiteral("!=")
>     lt = CaselessLiteral("<")
>     gt = CaselessLiteral(">")
>     le = CaselessLiteral("<=")
>     ge = CaselessLiteral(">=")
>
> Why are these not just plain Literal's?  Making them caseless just makes
> pyparsing do extra work trying to match upper or lower case "==".

Sorry, I forgot to explain that: In the actual code, I don't use those 
literals; I use variables because those tokens can be overridden by the user 
(e.g., the equality token could be "is" instead of "=="), so I want those 
tokens to be case-insensitive if applicable.

> 2.
>     relationals = eq ^ ne ^ lt ^ gt ^ le ^ ge
>
> '^' can be relatively expensive for pyparsing if there are many
> alternatives, since all alternatives will be checked.  This version is
> equivalent:
>
>     relationals = eq | ne | le | ge | lt | gt

Thanks for the hint. I used pipes initially, but for some reason some tests
didn't pass... I've just restored the pipes and all the tests pass now,
though :-S

> (I *did* have to be careful about the order, having to check for "<="
> before "<", and ">=" before ">".)

Could you please elaborate on that? Is it for performance reasons?

> 3.
>     not_ = Suppress("~")
>     and_ = Suppress("&")
>     in_or = Suppress("|")
>     ex_or = Suppress("^")
>
> Do you really want to suppress these operators?  Without them, you will
> have a dickens of a time evaluating the parsed expression.  These should
> probably be Literal's, too.

I think so: In the actual code, I have my own parse nodes (for operations and 
operands), and I use the parse actions to convert Pyparsing's parse trees into 
my own parse tree.

For example, the parse action for and_ looks like this:
    def make_and(tokens):
        left_operand = tokens[0][0]
        right_operand = tokens[0][1]
        return BooleanoAnd(left_operand, right_operand)

So the operator isn't necessary to make the new parse tree.
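
Here is a self-contained sketch of that conversion, with no Pyparsing involved. The nested list stands in for what the parse action receives once the operator is suppressed (I'm assuming a single group holding just the two operands), and the `BooleanoAnd` class below is a simplified stand-in for the real node class.

```python
class BooleanoAnd(object):
    """Simplified stand-in for the real conjunction node (illustration only)."""

    def __init__(self, left_operand, right_operand):
        self.left_operand = left_operand
        self.right_operand = right_operand


def make_and(tokens):
    # tokens[0] is the group matched for the operation; because "&" is
    # suppressed, the group is assumed to hold only the two operands.
    left_operand = tokens[0][0]
    right_operand = tokens[0][1]
    return BooleanoAnd(left_operand, right_operand)


# Assumed shape of the tokens for "a & b" once "&" has been suppressed.
node = make_and([["a", "b"]])
```

The resulting `node` carries both operands, and the node type itself records which operation was parsed.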

> 4. The parsed results show an empty [] before every identifier, I think
> this is to represent the empty leading namespace for each.  You might want
> to try this for identifier:
>
>     identifier = Combine(namespace.setResultsName("namespace_parts") +
>                   identifier0.setResultsName("identifier"))
>
> Combine will return the match as a single string, but you can still access
> the individual parts of the identifier by their results names if you need
> to.  (Combine will also ensure that you match only contiguous characters as
> an identifier.)

Nice, I've just updated my code accordingly.

> Looks interesting, keep us posted.

Sure! I'm getting closer to the first alpha release, by the way (for the 
impatient: https://launchpad.net/booleano) ;-)

Cheers!
-- 
Gustavo Narea <xri://=Gustavo>.
| Tech blog: =Gustavo/(+blog)/tech  ~  About me: =Gustavo/about |