Re: [Pyparsing] How to distinguish a variable from a integer
From: Gustavo N. <me...@gu...> - 2009-05-14 19:54:00
Paul said:
> > Hello, everybody.
> >
> > First of all, I wanted to thank you for this awesome package. I'm having
> > fun with it. :)
>
> Well, well, my friend, so we meet again! I'm pleased to see you have been
> bitten by the pyparsing bug. :)

Hello, Paul! Good to see you here :)

> > How can I fix this?
>
> In general, I think this is why variable names in most computing languages
> I know do *not* permit the name to begin with a number. But you are the
> language designer, so I will show you how to do this in pyparsing.
>
> Two suggestions, not sure if I have a preference:
> 1. use "operand = number ^ variable" instead of "operand = number |
> variable". '|' returns MatchFirst expressions, which return, well, the
> first matching expression. '^' returns Or expressions, which return
> *longest* match of all the alternative expressions. Think of the '^' as a
> little set of dividers, measuring the returned values of all the
> expressions, and picking the longest. '^' is not a cure-all, though, and
> can cause infinite run-time recursion in self-referencing grammars (those
> that include operatorPrecedence or Forward expressions).

I use both operatorPrecedence and Forward :/

> 2. As you say, invert operand to "operand = variable | number", and then
> attach a parse action to variable that first tries to evaluate the result
> as a number. In your current parser, you may eventually attach a parse
> action to number, something like this:
>
>     number.setParseAction(lambda tokens: float(tokens[0]))
>
> so that at post-parse time, the returned string has already been converted
> to a float. So instead, attaching something like this to variable
> (untested):
>
>     def numOrVar(tokens):
>         try:
>             return float(tokens[0])
>         except ValueError:
>             pass
>     variable.setParseAction(numOrVar)
>
> Now you don't even need the alternation, since as you observed, variable
> will also match "22", so just define "operand = variable".
>
> You could also try this for defining variable:
>
>     variable = Word(unicode(alphanums+'_'))
>
> or
>
>     variable = Word(unicode(alphanums+alphas8bit+'_'))
>
> or to absolutely cover all bases (for 2-byte Unicode, anyway):
>
>     allUnicodeAlphas = u''.join(c for c in map(unichr,range(65536)) if
>                                 c.isalpha())
>     allUnicodeNums = u''.join(c for c in map(unichr,range(65536)) if
>                               c.isdigit())
>     variable = Word(allUnicodeAlphas + allUnicodeNums + u'_')
>
> (It's surprising how many Unicode digits there are besides '0'-'9'.)

I said that an operand could be a variable or a number to simplify things,
given that the problem was between numbers and variables. But it's actually
more complex than that: it could also be a quoted string or a set (in the
form "{element1, element2, ...}", where each element can be a number,
variable, quoted string or even another set):

    operand = number | string | variable | set

Therefore setting a parse action for the whole operand wouldn't be
desirable. I'd rather set it on the types individually, especially to be
able to test them separately too. Sorry for not pointing this out.

> BTW, this definition of decimals:
>
>     decimals = Optional(decimal_sep + OneOrMore(Word(nums)))
>
> includes some unnecessary repetition. It should be sufficient to write:
>
>     decimals = Optional(decimal_sep + Word(nums))
>
> Unless I misunderstood your intent here.

Thank you so much! I thought I had to set the quantifier explicitly.

> So, will we see some pyparsing sneak into a repoze package one of these
> days, perhaps some sort of authorization rights syntax, hmmm?

You guessed right! :) I'm working on a package called PyACL which, as the
name implies, implements Access Control Lists in Python (and repoze.what 2
will use it a lot).
But one of the things I was missing was a way to let system administrators
filter the access rules easily, so I started working on this generic
Pyparsing-based library, which I'll announce here as soon as it's usable:
https://launchpad.net/booleano

> Good luck, and nice to meet you!

Likewise! ;-) Thank you! =)

-- 
Gustavo Narea <xri://=Gustavo>.
| Tech blog: =Gustavo/(+blog)/tech  ~  About me: =Gustavo/about |
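
For reference, here is a minimal sketch of the number-or-variable parse action discussed in this message, written as a plain function so that it runs without pyparsing installed. The name `num_or_var` and the sample tokens are illustrative and not from the thread. It relies on the pyparsing convention that a parse action returning None leaves the matched tokens unchanged. One thing worth checking before relying on it: `float()` accepts more than digit strings, so tokens such as "1e5", "inf" or "nan" would be converted to numbers rather than kept as names.

```python
import math


def num_or_var(tokens):
    """Return the first token as a float if it parses as a number.

    Returns None otherwise. pyparsing treats a None result from a parse
    action as "keep the original tokens", so a name stays a string.
    """
    try:
        return float(tokens[0])
    except ValueError:
        return None


# Plain lists stand in for the tokens pyparsing would pass in
# (an assumption made so the sketch runs on its own).
print(num_or_var(["22"]))        # 22.0
print(num_or_var(["22abc"]))     # None
print(num_or_var(["user_id"]))   # None

# Caveat: float() also accepts exponent notation and special values.
print(num_or_var(["1e5"]))              # 100000.0
print(math.isinf(num_or_var(["inf"])))  # True
print(math.isnan(num_or_var(["nan"])))  # True
```

With pyparsing, the function would be attached as in the quoted suggestion: `variable.setParseAction(num_or_var)`. If names like "inf" or "nan" must remain variables, a stricter check on the token (for example, requiring it to start with a digit) would be needed before calling `float()`.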