Re: [Pyparsing] Lists and Groups
From: Michael F. <nob...@gm...> - 2010-11-15 16:39:48
Thanks! I've confirmed it works, even after suppressing 'get' and 'AT', and
grouping the shopping list.

--Michael

On Sat, Nov 13, 2010 at 8:25 PM, Paul McGuire <pt...@au...> wrote:

> I don't think you've gone overboard here. Your BNF *will* be somewhat
> informal, but don't give up on it.
>
> Your basic form is:
>
>   (get) (stuff) (at) (location)
>
> Given that you have some entries where you just get stuff, or specify a
> location just based on a leading '@' symbol, this gets a little more
> complex (using []'s for optional parts):
>
>   (get) (stuff) [[(at)] (location)]
>
> (stuff) and (location) are going to be pretty unstructured, but
> fortunately, you're defining some specific forms for (get) and (at).
>
> Your (stuff) can contain a list of items separated by commas, (and), or
> (, and), so I think you can define it as pretty open for (item), and then
> define a delimited list for the list of items. You'll need to specify the
> lookahead (as you already picked up in your posted pyparsing code) to avoid
> parsing "at" or "and" or a store label with a leading '@' as an item word.
> And using Keywords for your delimiting words is a good choice, to guard
> against accidentally reading the leading 'at' in 'athletic socks' as 'at',
> and the remaining 'hletic socks' as some kind of store.
>
> So I'd informally write this as:
>
>   shopping-list ::= get stuff [[at] location]
>   get ::= "get" | "buy" | "pick up"
>   at ::= "at" | "from"
>   item ::= (~(and | at | '@') Word(alphas+'-'))...
>   stuff ::= item [ (', and' | ',' | 'and') item ]...
>   location ::= ['@'] Word(alphas)...
>
> Rendered into pyparsing, it ends up very similar to your posted code:
>
>   COMMA,AT = map(Literal, ',@')
>   KW = CaselessKeyword
>   get = KW('get') | KW('buy') | KW('pick') + KW('up') | KW('need')
>   at = KW('at') | KW('from')
>   location = Combine(Optional(AT) + OneOrMore(Word(alphas)), ' ',
>                      adjacent=False)
>   and_ = KW('and')
>   itemdelim = COMMA + and_ | COMMA | and_
>   item = Combine(OneOrMore(~(at | itemdelim | '@') + Word(alphas)), ' ',
>                  adjacent=False)
>   stuff = delimitedList(item, itemdelim)
>   shoppingList = get + stuff("items") + Optional(Optional(at) +
>                                                  location("location"))
>
> And this seems to work ok for your posted tests.
>
> Please try to avoid defining Literals with embedded spaces, and
> *especially* not with leading spaces (as you do with " and ") - pyparsing's
> default whitespace skipping will almost surely make your
> leading-whitespace literal unmatchable. Note how I defined "pick up" as an
> option for (get) as two joined keywords - this immunizes us against cases
> with extra whitespace between the two words, at very little cost.
>
> Thanks for writing, and welcome to the World of Pyparsing! - :)
>
> -- Paul
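
The grammar quoted above can be collected into one small script, which may help when trying it out. This is only a sketch: it assumes pyparsing is installed and keeps the camelCase names used in this thread (delimitedList, parseString). The helper parse_shopping_line and the sample sentences are made up for illustration and do not come from the original posts.

```python
from pyparsing import (
    CaselessKeyword, Combine, Literal, OneOrMore, Optional, Word,
    alphas, delimitedList,
)

COMMA, AT = map(Literal, ",@")
KW = CaselessKeyword

# "pick up" is two keywords joined with +, so extra whitespace between
# the two words does not matter.
get = KW("get") | KW("buy") | KW("pick") + KW("up") | KW("need")
at = KW("at") | KW("from")
and_ = KW("and")

# Items may be separated by ", and", "," or "and".
itemdelim = COMMA + and_ | COMMA | and_

# An optional leading '@' followed by one or more words, joined with spaces.
location = Combine(Optional(AT) + OneOrMore(Word(alphas)), " ", adjacent=False)

# The negative lookahead keeps delimiters, "at"/"from" and '@' from being
# read as item words; Keyword matching leaves "athletic" alone.
item = Combine(OneOrMore(~(at | itemdelim | "@") + Word(alphas)), " ",
               adjacent=False)

stuff = delimitedList(item, itemdelim)
shopping_list = get + stuff("items") + Optional(Optional(at) + location("location"))


def parse_shopping_line(text):
    """Hypothetical helper: return (items, location or None) for one line."""
    result = shopping_list.parseString(text)
    return list(result["items"]), result.get("location")


if __name__ == "__main__":
    # Made-up sample sentences, not the tests from the original post.
    for line in ["get milk, eggs and bread at Safeway",
                 "buy athletic socks",
                 "pick   up nails from Home Depot"]:
        print(parse_shopping_line(line))
```

The second sample exercises the Keyword point from the quoted message: "athletic socks" stays a single item because 'at' only matches as a whole word. The third has extra spaces between "pick" and "up", which the two joined keywords tolerate.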