Thread: [Pyparsing] Unordered set of expressions
From: Russell D. <Rus...@as...> - 2011-09-29 18:25:23
I have to parse something with an unordered set of expressions. In the simplest case, some expressions are required and some are optional. For this, parsing is easy:

    language = expr1 & expr2 & Optional(expr3)

However, there are cases where an expression can occur zero or more times, one or more times, or two or more times. Since they can still occur in any order, I'm not sure how to handle it. If I try:

    language = expr1 & expr2 & Optional(expr3) & ZeroOrMore(expr4) & ZeroOrMore(expr5)

then parsing fails if all the expr4's are not adjacent. My current idea is to do the following:

    multiples = ZeroOrMore(expr4 | expr5)
    language = (multiples + expr1) & (multiples + expr2) & Optional(multiples + expr3) & multiples

While it's ugly, I can't think of any other way to do it. Additionally, it doesn't get me any closer to implementing OneOrMore in an unordered set of expressions.
From: Paul M. <pt...@au...> - 2011-09-29 23:09:22
Russell -

I'm not seeing this behavior:

    from pyparsing import *

    """
    language = expr1 & expr2 & Optional(expr3) & ZeroOrMore(expr4) & ZeroOrMore(expr5)
    """
    A = Literal("A")
    B = Literal("B")
    C = Literal("C")
    D = Literal("D")
    E = Literal("E")

    # Each (&) matches its operands in any order
    lang = A & B & Optional(C) & ZeroOrMore(D) & ZeroOrMore(E)

    tests = """\
    ABCDE
    ABDE
    AB
    BA
    EDCBA
    EAEB
    BEDCADEEDEDDEDD""".splitlines()

    for t in tests:
        print(t)
        print(lang.parseString(t, parseAll=True))
        print()

prints:

    ABCDE
    ['A', 'B', 'C', 'D', 'E']

    ABDE
    ['A', 'B', 'D', 'E']

    AB
    ['A', 'B']

    BA
    ['B', 'A']

    EDCBA
    ['E', 'D', 'C', 'B', 'A']

    EAEB
    ['E', 'A', 'E', 'B']

    BEDCADEEDEDDEDD
    ['B', 'E', 'D', 'C', 'A', 'D', 'E', 'E', 'D', 'E', 'D', 'D', 'E', 'D', 'D']

All of my test cases parse successfully. This is using the latest pyparsing. You can have as many D's and E's as you like, anywhere, and at most 1 C anywhere, but you must have 1 A and 1 B, somewhere.

-- Paul
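The rule Paul states (any number of D's and E's, at most one C, exactly one A and one B, in any order) can also be checked without pyparsing by counting symbols. The following is only a rough sketch to make the rule concrete; `LIMITS` and `check_unordered` are hypothetical names made up for this illustration and are not part of pyparsing, nor a description of how Each is implemented.

```python
from collections import Counter

# (min, max) occurrences allowed per symbol; None means no upper bound.
# Mirrors: A & B & Optional(C) & ZeroOrMore(D) & ZeroOrMore(E)
LIMITS = {
    "A": (1, 1),
    "B": (1, 1),
    "C": (0, 1),
    "D": (0, None),
    "E": (0, None),
}


def check_unordered(text, limits=LIMITS):
    """Return True if every symbol count in text is within its limits.

    Order is ignored entirely; only the counts matter.
    """
    counts = Counter(text)
    # Reject any symbol the grammar does not know about.
    if any(symbol not in limits for symbol in counts):
        return False
    for symbol, (lowest, highest) in limits.items():
        n = counts[symbol]
        if n < lowest or (highest is not None and n > highest):
            return False
    return True


for t in ["ABCDE", "EAEB", "BEDCADEEDEDDEDD", "AAB", "DE"]:
    print(t, check_unordered(t))
```

Counting only works here because every expression is a single character; for real grammars the pyparsing expression above is the practical route.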
From: Russell D. <Rus...@as...> - 2011-09-29 23:43:58
On Thu, Sep 29, 2011 at 4:09 PM, Paul McGuire <pt...@au...> wrote:
> Russell -
>
> I'm not seeing this behavior:
>
> EAEB
> ['E', 'A', 'E', 'B']
>
> You can have as many D's and E's as you like, anywhere, and at most 1 C
> anywhere, but you must have 1 A and 1 B, somewhere.

Odd, at some point I must have convinced myself of something about Each that wasn't true.

OK, one minor issue though: using Group seems to cause this to stop working. Since the order doesn't matter, it'd be nice to Group them together. Naming them seems to be a workaround, but it produces strange parse output.

    ZeroOrMore(Literal("E")("Es")))
From: Paul M. <pt...@au...> - 2011-09-30 02:44:35
Russell -

Still not seeing the problem.

    from pyparsing import *

    """
    language = expr1 & expr2 & Optional(expr3) & ZeroOrMore(expr4) & ZeroOrMore(expr5)
    """
    # Each expression gets a results name so matches can be looked up by name
    A = oneOf("A a")("As")
    B = oneOf("B b")("Bs")
    C = oneOf("C c")("Cs")
    D = oneOf("D d")("Ds")
    E = oneOf("E e")("Es")

    lang = A & B & Optional(C) & ZeroOrMore(D) & ZeroOrMore(E)

    tests = """\
    ABCDE
    ABDE
    AB
    BA
    EDCBA
    EaEB
    BeDCADEeDEDDeDD""".splitlines()

    for t in tests:
        print(t.strip())
        print(lang.parseString(t, parseAll=True).dump())
        print()

Prints:

    ABCDE
    ['A', 'B', 'C', 'D', 'E']
    - As: A
    - Bs: B
    - Cs: C
    - Ds: D
    - Es: E

    ABDE
    ['A', 'B', 'D', 'E']
    - As: A
    - Bs: B
    - Ds: D
    - Es: E

    AB
    ['A', 'B']
    - As: A
    - Bs: B

    BA
    ['B', 'A']
    - As: A
    - Bs: B

    EDCBA
    ['E', 'D', 'C', 'B', 'A']
    - As: A
    - Bs: B
    - Cs: C
    - Ds: D
    - Es: E

    EaEB
    ['E', 'a', 'E', 'B']
    - As: a
    - Bs: B
    - Es: ['E', 'E']

    BeDCADEeDEDDeDD
    ['B', 'e', 'D', 'C', 'A', 'D', 'E', 'e', 'D', 'E', 'D', 'D', 'e', 'D', 'D']
    - As: A
    - Bs: B
    - Cs: C
    - Ds: ['D', 'D', 'D', 'D', 'D', 'D', 'D']
    - Es: ['e', 'E', 'e', 'E', 'e']

No Group required, and the parsed results look okay.

-- Paul
From: Russell D. <Rus...@as...> - 2011-09-30 03:05:31
> No Group required, and the parsed results look okay.

Group() actually breaks it, so the naming is required instead. As for the results, I was looking at the output without dump(), my bad.

    >>> lang = Word("A") & ZeroOrMore(Word("B")("Bs"))
    >>> lang.parseString("B A B", parseAll=True)
    (['B', 'A', 'B'], {'Bs': [('B', 0), ('B', 2), ((['B', 'B'], {}), 0)]})

It's correct.
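The original question about OneOrMore inside an unordered set is not answered explicitly in the thread. The sketch below assumes that OneOrMore can be combined with & in the same way as the ZeroOrMore examples above, which is worth confirming against the pyparsing version in use. The grammar and the `accepts` helper are hypothetical, made up for this illustration.

```python
from pyparsing import Literal, OneOrMore, ParseException, ZeroOrMore

A = Literal("A")
B = Literal("B")
D = Literal("D")

# Exactly one A, at least one B, any number of D's, in any order.
lang = A & OneOrMore(B) & ZeroOrMore(D)


def accepts(text):
    """Return True if the whole of text matches lang, False otherwise."""
    try:
        lang.parseString(text, parseAll=True)
    except ParseException:
        return False
    return True


# The B's do not need to be adjacent to each other.
print(list(lang.parseString("BDABD", parseAll=True)))

# No B at all: the required OneOrMore(B) is missing.
print(accepts("ADD"))
```

The "two or more times" case from the first message is not covered by this sketch.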