Re: [Pyparsing] Lists and Groups
From: Paul M. <pt...@au...> - 2010-11-14 03:25:48
I don't think you've gone overboard here. Your BNF *will* be somewhat informal, but don't give up on it. Your basic form is:

    (get) (stuff) (at) (location)

Given that you have some entries where you just get stuff, or specify a location just based on a leading '@' symbol, this gets a little more complex (using []'s for optional parts):

    (get) (stuff) [[(at)] (location)]

(stuff) and (location) are going to be pretty unstructured, but fortunately, you're defining some specific forms for (get) and (at).

Your (stuff) can contain a list of items separated by commas, (and), or (, and), so I think you can leave the definition of (item) pretty open, and then define a delimited list for the list of items. You'll need to specify the lookahead (as you already picked up in your posted pyparsing code) to avoid parsing "at" or "and" or a store label with a leading '@' as an item word. And using Keywords for your delimiting words is a good choice, to guard against accidentally reading the leading 'at' in 'athletic socks' as 'at', and the remaining 'hletic socks' as some kind of store.

So I'd informally write this as:

    shopping-list ::= get stuff [[at] location]
    get ::= "get" | "buy" | "pick up"
    at ::= "at" | "from"
    item ::= (~(and | at | '@') Word(alphas+'-'))...
    stuff ::= item [ (', and' | ',' | 'and') item ]...
    location ::= ['@'] Word(alphas)...

Rendered into pyparsing, it ends up very similar to your posted code:

    from pyparsing import (CaselessKeyword, Combine, Literal, OneOrMore,
                           Optional, Word, alphas, delimitedList)

    COMMA, AT = map(Literal, ',@')
    KW = CaselessKeyword
    get = KW('get') | KW('buy') | KW('pick') + KW('up') | KW('need')
    at = KW('at') | KW('from')
    location = Combine(Optional(AT) + OneOrMore(Word(alphas)), ' ', adjacent=False)
    and_ = KW('and')
    itemdelim = COMMA + and_ | COMMA | and_
    # the negative lookahead keeps delimiters and '@' labels out of item words
    item = Combine(OneOrMore(~(at | itemdelim | '@') + Word(alphas)), ' ', adjacent=False)
    stuff = delimitedList(item, itemdelim)
    shoppingList = get + stuff("items") + Optional(Optional(at) + location("location"))

And this seems to work ok for your posted tests.
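To make the 'athletic socks' point concrete, here is a minimal sketch comparing a plain Literal with a Keyword on that input. The sample string and variable names are just for illustration; it uses the case-sensitive Keyword class for simplicity, and CaselessKeyword should behave the same way with respect to word boundaries.

```python
from pyparsing import Keyword, Literal, ParseException

text = "athletic socks"

# A plain Literal matches any text that starts with 'at', so it
# consumes the first two letters of 'athletic'.
literal_tokens = Literal("at").parseString(text).asList()

# A Keyword also requires that the match is not immediately followed by
# another identifier character, so the 'at' in 'athletic' is rejected.
try:
    Keyword("at").parseString(text)
    keyword_matched = True
except ParseException:
    keyword_matched = False

print(literal_tokens)   # ['at']
print(keyword_matched)  # False
```

With the Literal version, the rest of the grammar would be left to make sense of 'hletic socks', which is exactly the misparse the Keywords protect against.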
Please try to avoid defining Literals with embedded spaces, and *especially* not with leading spaces (as you do with " and ") - pyparsing's default whitespace skipping will almost surely make your leading-whitespace literal unmatchable. Note how I defined "pick up" as an option for (get) as two joined keywords - this immunizes us against cases with extra whitespace between the two words, at very little cost.

Thanks for writing, and welcome to the World of Pyparsing! :)

-- Paul
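P.S. A small sketch of the two whitespace points above, in case it helps. The `matches` helper and the sample strings are made up for this illustration and are not taken from your posted code or tests.

```python
from pyparsing import CaselessKeyword, Literal, ParseException, Word, alphas


def matches(expr, text):
    """Return True if expr parses all of text, False otherwise."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


word = Word(alphas)
KW = CaselessKeyword

# Whitespace is skipped before each element is tried, so by the time
# Literal(" and ") runs, the leading space it wants is already consumed.
spaced_and = word + Literal(" and ") + word
spaced_and_ok = matches(spaced_and, "milk and eggs")

# Two joined keywords tolerate any amount of whitespace between the words.
pick_up_keywords = KW("pick") + KW("up") + word
keywords_ok = matches(pick_up_keywords, "pick   up milk")

# A single Literal with an embedded space only matches exactly one space.
pick_up_literal = Literal("pick up") + word
literal_ok = matches(pick_up_literal, "pick   up milk")

print(spaced_and_ok)  # False
print(keywords_ok)    # True
print(literal_ok)     # False
```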