pyparsing-users Mailing List for Python parsing module (Page 19)
Brought to you by:
ptmcg
From: Michael D. <md...@st...> - 2008-06-04 13:44:41

I'm porting matplotlib (which uses pyparsing) to the maemo platform, a Linux-based platform for the Nokia Internet Tablets (N770, N800, N880). One thing you often see on these smaller platforms is that the Python standard library has been modularized to save space. In particular on this platform, the Python xml package is distributed separately. It would be nice not to have to depend on the entire set of Python XML libraries just to use pyparsing.

pyparsing uses xml.sax.saxutils.escape, which is actually a very straightforward and self-contained function. I've attached a simple patch to include this function in pyparsing.py itself when xml.sax.saxutils can't be imported.

I realize this is a fairly uncommon use case, so I'll let you make the judgment call of whether it's worth including in the pyparsing trunk, but I thought it was worth bringing to your attention something that would improve the "portability" of pyparsing.

Cheers,
Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

From: happybrowndog <hap...@ho...> - 2008-06-03 18:28:59

Thank you for that. In fact, I already implemented it in very much the same way as you suggested - with levels of operations of the same precedence Or'd, and that I did see how higher ordered operations preside over lower ordered ones. However, I did stick with operatorPrecedence rather than the "old-fashioned" way as the former is much more readable. It's good that you posted what you did, as it now confirms that my understanding is (near) correct.

> You can then validate this expression with some asserts, using the new '=='
> operator defined in 1.4.11:
>
>     assert "(3+6)/5" == calcexpr
>     assert "((3+6)/5)" == calcexpr

This is a very useful technique. Thanks for pointing that out.

By the way, your library is a God-send. Thank you and very very much appreciated. There's a helluva lot of problems we can solve very neatly with this at our company.

From: Paul M. <pt...@au...> - 2008-06-03 15:15:59

To begin with, just for readability, I would convert your operatorPrecedence call to:

    calcexpr = operatorPrecedence( w_operand,[
        (op_sign, 1, opAssoc.RIGHT),
        (op_mult, 2, opAssoc.LEFT),
        (op_div, 2, opAssoc.LEFT),
        (op_plus, 2, opAssoc.LEFT),
        (op_minus, 2, opAssoc.LEFT),
        ])

Then, since one of the main purposes of opPrec is to manage levels of precedence of operations, I would combine operators of the same precedence to the same level:

    calcexpr = operatorPrecedence( w_operand,[
        (op_sign, 1, opAssoc.RIGHT),
        (op_mult | op_div, 2, opAssoc.LEFT),
        (op_plus | op_minus, 2, opAssoc.LEFT),
        ])

(There is also value in minimizing the number of levels in a call to opPrec - performance can get pretty bad if the number of levels gets too deep.)

You can then validate this expression with some asserts, using the new '==' operator defined in 1.4.11:

    assert "(3+6)/5" == calcexpr
    assert "((3+6)/5)" == calcexpr

boolexpr is really just another set of operators with precedence:

    op_comparison = oneOf("= > < <= >= !=")
    comparison = calcexpr + op_comparison + calcexpr
    boolexpr = operatorPrecedence( comparison, [
        ("NOT", 1, opAssoc.RIGHT),
        ("AND", 2, opAssoc.LEFT),
        ("OR", 2, opAssoc.LEFT),
        ])

But for this simple a set of operators, you can also do things the old-fashioned, by-hand way:

    boolexpr = Forward()
    boolTerm = Optional("NOT") + (comparison | Group("(" + boolexpr + ")"))
    boolAnd = boolTerm + ZeroOrMore("AND" + boolTerm)
    boolOr = boolAnd + ZeroOrMore("OR" + boolAnd)
    boolexpr << ( boolOr )

This also bypasses some of the overhead cruft in opPrec, and so will perform better.

Check it out:

    assert "(3+6)/5 > 1 AND 2 < 9" == boolexpr
    assert "((3+6)/5 > 1) AND (2 < 9)" == boolexpr
    assert "((3+6)/5 > 1) AND ( (2 < 9) OR ( (2+5/12 < 1) AND ( 3 < 4) ) )" == boolexpr

You can see how the precedence of operations automatically groups the operands, so that higher-order operations get evaluated ahead of lower order, that is "A AND B OR C" correctly evaluates to "(A AND B) OR C":

    def test(s):
        from pprint import pprint
        print(s)
        try:
            pprint(boolexpr.parseString(s).asList())
        except ParseException as pe:
            print("Exception:", pe.msg)
        print()

    test("(3+6)/5 > 1 AND 2 < 9")
    test("((3+6)/5 > 1) AND (2 < 9)")
    test("((3+6)/5 > 1) AND ( (2 < 9) OR ( (2+5/12 < 1) AND ( 3 < 4) ) )")

prints:

    (3+6)/5 > 1 AND 2 < 9
    [[['3', '+', '6'], '/', '5'], '>', '1', 'AND', '2', '<', '9']

    ((3+6)/5 > 1) AND (2 < 9)
    [['(', [['3', '+', '6'], '/', '5'], '>', '1', ')'],
     'AND',
     ['(', '2', '<', '9', ')']]

    ((3+6)/5 > 1) AND ( (2 < 9) OR ( (2+5/12 < 1) AND ( 3 < 4) ) )
    [['(', [['3', '+', '6'], '/', '5'], '>', '1', ')'],
     'AND',
     ['(',
      ['(', '2', '<', '9', ')'],
      'OR',
      ['(',
       ['(', ['2', '+', ['5', '/', '12']], '<', '1', ')'],
       'AND',
       ['(', '3', '<', '4', ')'],
       ')'],
      ')']]

Lastly, you can do the evaluation of these various terms in parse actions attached to individual expressions in the grammar:

    calcexpr.setParseAction(evaluateCalc)
    comparison.setParseAction(evaluateComparison)
    boolexpr.setParseAction(evaluateBool)

(I'll leave the implementation of these parse actions as an exercise - plus I'm late for work!)

-- Paul

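The precedence-level approach from this message can be put together as a small self-contained sketch. Several things here are assumptions rather than part of the thread: `w_operand` is taken to be a plain unsigned integer (its definition is never shown), the operator expressions are written inline with `oneOf`, and the helper is called through `infixNotation`, which as far as I know is the name later pyparsing releases use for `operatorPrecedence`. Packrat caching is switched on because nested parentheses can otherwise make this style of grammar slow.

```python
from pyparsing import (Group, Keyword, ParserElement, Word, infixNotation,
                       nums, oneOf, opAssoc)

# Memoize intermediate results; helps with nested parentheses.
ParserElement.enablePackrat()

# Assumption: operands are plain unsigned integers.
w_operand = Word(nums)

# Levels are listed from highest to lowest precedence.
calcexpr = infixNotation(w_operand, [
    (oneOf("+ -"), 1, opAssoc.RIGHT),   # unary sign
    (oneOf("* /"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
])

# Group each comparison so it shows up as one unit in the boolean tree.
comparison = Group(calcexpr + oneOf("= > < <= >= !=") + calcexpr)

boolexpr = infixNotation(comparison, [
    (Keyword("NOT"), 1, opAssoc.RIGHT),
    (Keyword("AND"), 2, opAssoc.LEFT),
    (Keyword("OR"), 2, opAssoc.LEFT),
])

samples = [
    "(3+6)/5 > 1 AND 2 < 9",
    "((3+6)/5 > 1) AND (2 < 9)",
    "((3+6)/5 > 1) AND ( (2 < 9) OR ( (2+5/12 < 1) AND ( 3 < 4) ) )",
]
for text in samples:
    # parseAll=True makes the parse fail unless the whole string is consumed.
    print(boolexpr.parseString(text, parseAll=True).asList())
```

Because AND is listed before OR, an input such as "1 < 2 AND 2 < 3 OR 3 < 4" is grouped with the AND pair nested inside the OR.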
From: Eike W. <ei...@us...> - 2008-06-03 14:12:14

On Tuesday 03 June 2008 09:34, happybrowndog wrote:
> Hello...
> first of all, thanks for your help in my last post on "Function
> parameters that may have brackets" - I solved the problem by going
> through the code samples as you stated and the problem is solved
> now.
>
> Now I am having problems with parsing logical operations.
>
> I have an expression:
> calcexpr = operatorPrecedence( w_operand,[(op_sign, 1,
> opAssoc.RIGHT),(op_mult, 2, opAssoc.LEFT),(op_div, 2,
> opAssoc.LEFT),(op_plus, 2, opAssoc.LEFT),(op_minus, 2,
> opAssoc.LEFT),])
>
> which parses the following correctly: "(3+6)/5".
>
> then I have:
> boolexpr = Forward()
> boolexpr << calcexpr + ("=" ^ ">" ^ "<") + calcexpr +
> Optional(("AND" ^ "OR") + boolexpr)
>
> boolexpr then parses the following correctly: "(3+6)/5 > 1 AND 2 <
> 9"
>
> but returns an error on the following: "((3+6)/5 > 1) AND (2 < 9)"
>
> ultimately, I would like to be able to parse a statement such as:
> "((3+6)/5 > 1) AND ( (2 < 9) OR ( (2+5/12 < 1) AND ( 3 < 4) ) )"
>
> but of course this would also fail.
>
> To try to get to my goal, I try adding Optional("(") and
> Optional(")") to boolexpr, such as:
>
> boolexpr << Optional("(") + calcexpr + ("=" ^ ">" ^ "<") +
> calcexpr + Optional(")") + Optional(("AND" ^ "OR") + boolexpr)
>
> but this does not resolve the problem. The brackets are screwing
> things up. I think the cause may be related to the same reason I
> was having problems previously, but I can't seem to link the two.
>
> Can you please point me in the right direction?

You should add the additional operators to operatorPrecedence. First define:

    op_and = Keyword('AND')
    op_or = Keyword('OR')
    op_eq = Literal('=')
    op_lt = Literal('<')
    op_gt = Literal('>')

And then add the following to the call to operatorPrecedence:

    (op_and, 2, opAssoc.LEFT),
    (op_or, 2, opAssoc.LEFT),
    (op_eq, 2, opAssoc.LEFT),
    (op_lt, 2, opAssoc.LEFT),
    (op_gt, 2, opAssoc.LEFT),

"AND" belongs on the same level as multiplication, "OR" belongs with '+', and the comparison operators '= < >' belong at the end (IMHO).

(Everything completely untested.)

HTH,
Eike.

From: happybrowndog <hap...@ho...> - 2008-06-03 07:34:33

Hello...
first of all, thanks for your help in my last post on "Function parameters that may have brackets" - I solved the problem by going through the code samples as you stated and the problem is solved now.

Now I am having problems with parsing logical operations.

I have an expression:

    calcexpr = operatorPrecedence( w_operand,[(op_sign, 1, opAssoc.RIGHT),(op_mult, 2, opAssoc.LEFT),(op_div, 2, opAssoc.LEFT),(op_plus, 2, opAssoc.LEFT),(op_minus, 2, opAssoc.LEFT),])

which parses the following correctly: "(3+6)/5".

then I have:

    boolexpr = Forward()
    boolexpr << calcexpr + ("=" ^ ">" ^ "<") + calcexpr + Optional(("AND" ^ "OR") + boolexpr)

boolexpr then parses the following correctly: "(3+6)/5 > 1 AND 2 < 9"

but returns an error on the following: "((3+6)/5 > 1) AND (2 < 9)"

ultimately, I would like to be able to parse a statement such as:

    "((3+6)/5 > 1) AND ( (2 < 9) OR ( (2+5/12 < 1) AND ( 3 < 4) ) )"

but of course this would also fail.

To try to get to my goal, I try adding Optional("(") and Optional(")") to boolexpr, such as:

    boolexpr << Optional("(") + calcexpr + ("=" ^ ">" ^ "<") + calcexpr + Optional(")") + Optional(("AND" ^ "OR") + boolexpr)

but this does not resolve the problem. The brackets are screwing things up. I think the cause may be related to the same reason I was having problems previously, but I can't seem to link the two.

Can you please point me in the right direction?

From: Paul M. <pt...@au...> - 2008-05-18 07:18:18

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of happybrowndog
Sent: Saturday, May 17, 2008 3:25 PM
To: pyp...@li...
Subject: [Pyparsing] Function parameters that may have brackets

Hi,

I have been trying to solve this, but just can't figure it out. I have a function with parameters that I want to parse. The function is of this form:

    NUMBERTOSTR(param)

where param is any expression that evaluates to an integer or floating point number, for example:

    3 + 2
    (3 + 2) / 4
    ((3 + 2) / 4) + 7

param thus can have any number of brackets.

>>>>>>>>>>>>>>>>>>>>>>>>>>

Check out fourFn.py or simpleArith.py for examples on how to parse arithmetic expressions. You are on the right track in that Forward() is required, but you aren't really using it correctly. If you model your grammar after simpleArith, then your code will use a call to operatorPrecedence, but the implementation of operatorPrecedence uses Forward to define the nested expressions. The grammar in fourFn.py will show you how Forward should be used.

If you don't have these in your pyparsing examples directory, you can get them directly from the pyparsing wiki Examples page: http://pyparsing.wikispaces.com/Examples.

Or write back for more help.

-- Paul

From: happybrowndog <hap...@ho...> - 2008-05-18 06:44:57

Hi,

I have been trying to solve this, but just can't figure it out. I have a function with parameters that I want to parse. The function is of this form:

    NUMBERTOSTR(param)

where param is any expression that evaluates to an integer or floating point number, for example:

    3 + 2
    (3 + 2) / 4
    ((3 + 2) / 4) + 7

param thus can have any number of brackets. So I coded param using the following pyparsing expression:

    param = Forward()
    param << ZeroOrMore("(") + Operand + Operator + Operand + ZeroOrMore(")") + ZeroOrMore("(") + Optional(param) + ZeroOrMore(")")

and NUMBERTOSTR is coded as follows:

    NUMBERTOSTR = Literal("NUMBERTOSTR(") + param + Literal(")")

Here is the problem:

    NUMBERTOSTR.parseString("NUMBERTOSTR(( ( (3+2)/4) + 7 ) )")

returns an exception of not being able to find a ")" character. The reason appears to be that ZeroOrMore(")") consumes all the right brackets so that Literal(")") in NUMBERTOSTR is left wanting.

How do I solve this?

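One way to avoid the greedy bracket matching described in this question is to let a Forward expression refer to itself, so that each "(" is paired with its own ")". The sketch below is written in the general style of a hand-built arithmetic grammar and is not code taken from fourFn.py; the names `number`, `atom`, `term`, `expr` and `number_to_str` are illustrative, and the number format is deliberately simplistic.

```python
from pyparsing import (Combine, Forward, Group, Optional, Suppress, Word,
                       ZeroOrMore, nums, oneOf)

LPAR, RPAR = Suppress("("), Suppress(")")

# Simplistic integer or decimal literal, e.g. 3 or 2.5
number = Combine(Word(nums) + Optional("." + Word(nums)))

expr = Forward()
# A parenthesised sub-expression is an atom; the recursion goes through expr.
atom = number | Group(LPAR + expr + RPAR)
term = atom + ZeroOrMore(oneOf("* /") + atom)
expr <<= term + ZeroOrMore(oneOf("+ -") + term)

number_to_str = Suppress("NUMBERTOSTR") + LPAR + expr + RPAR

result = number_to_str.parseString("NUMBERTOSTR(( ( (3+2)/4) + 7 ) )",
                                   parseAll=True)
print(result.asList())
```

Because every closing bracket is consumed by the same rule that consumed its opening bracket, the final ")" of the function call is still available when the outer expression needs it.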
From: Paul M. <pt...@au...> - 2008-05-12 08:41:54

Eike, et al. -

I'm happy to report that I've successfully added ErrorStop-like behavior to pyparsing. In the last 6 weeks or so, there has been a flurry of interest and comment on this feature, and between the various proposals, and some offline parser work (in which I was converting an EBNF to pyparsing), I finally got my thoughts to gel on how to add this important feature to pyparsing.

I'll excerpt comments from a posting I made to the wiki a few hours ago (in response to a pyparsing user who needed to raise a syntax error from an expression wrapped in an Optional, and so proposed a mod to Optional to correct the problem):

>>>>>>>>
It turns out that this issue affects many parts of pyparsing, not just Optional. The root problem actually occurs in the And class, in that if a succession of expressions does not parse completely, then a routine ParseException is raised. For example, in your grammar, you found the need to modify Optional because you did not get the desired error location from:

    port_clause = "(" + ...body of port definition... + ")"
    entity = Literal("entity") + "(" + \
             Optional( Keyword("port") + port_clause ) + \
             ")"

ParseException is "routine" because it is a way for any expression to indicate that no match occurred, and other alternatives should be tried. However, in this case, we want non-routine behavior. If the parser reads "port" and it is not followed by "(" and the other interesting port items, then the parser should stop immediately. This is a different flavor of And - when "port" is read, you know that the next items in the string should be the port data, and if it isn't then this is a syntax error.

Since normal And sequencing is defined using '+' signs, I'm trying to insert the syntax error trapping using another operator. The logical choice for this operator would be '-'; it is equal to '+' in precedence, and it is visually intuitive as a sequence connector.

The distinction will be that, if a parser error occurs after passing the '-' operator, then this error will be flagged immediately as a syntax error. (I am adding the exception class ParseSyntaxException, derived from ParseFatalException.) In your case, your code would become:

    port_clause = "(" + ...body of port definition... + ")"
    entity = Literal("entity") + "(" + \
             Optional( Keyword("port") - port_clause ) + \
             ")"

The syntax would be the same if Optional were replaced with ZeroOrMore, OneOrMore, or any of the other repetition classes.

It is possible now to have a lot of control over just where syntax errors get signaled. You could define an expression as:

    expr = A + B + C - D + E + F

and any parsing mismatch after having matched A, B, and C would be raised as a syntax error, and parsing would stop immediately.
<<<<<<<<<<<

So that's it. To implement ErrorStop, I've just added the '-' operator, so that "A - B + C" becomes "A + errorStop + B + C". ErrorStop itself is implemented as a private, internal class to And, and I modified And's parseImpl method to do the right thing when detecting errorStop.

It should be noted that you shouldn't just blindly go replacing all of your '+' operators with '-'s; backtracking *is* an important feature for most grammars. The general rule for using '-' is to insert it after an element in your grammar that unambiguously determines a particular path in the grammar, so that backtracking would not find any better match.

If you want to experiment with this new feature, you can download it from the pyparsing SVN repository on SourceForge. (You'll note that I've bumped the version to 1.5.0 with this update - the number of new features is really moving us to another level of the package, so I'm probably a little overdue in calling this 1.5.0 instead of 1.4.*.)

Since early in the life of pyparsing, I have been writing apologetic e-mails about pyparsing's inability to report helpful syntax error locations. I hope this new feature will help address this deficiency.

Thanks to all!

-- Paul

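The difference between '+' and '-' described in this announcement can be seen with a small sketch. The `port_clause` below is a hypothetical stand-in (a single parenthesised name) for the elided port definition, and `first_error` is an illustrative helper, not part of pyparsing.

```python
from pyparsing import (Keyword, Optional, ParseBaseException, ParseException,
                       ParseSyntaxException, Word, alphas)

# Hypothetical stand-in for the real port definition: "(" name ")"
port_clause = "(" + Word(alphas) + ")"

# '+' version: a failure after "port" is routine, so Optional simply backs out.
with_plus = Keyword("entity") + "(" + Optional(Keyword("port") + port_clause) + ")"

# '-' version: once "port" has matched, a later mismatch is a syntax error.
with_minus = Keyword("entity") + "(" + Optional(Keyword("port") - port_clause) + ")"


def first_error(grammar, text):
    """Return the exception raised while parsing text, or None on success."""
    try:
        grammar.parseString(text, parseAll=True)
    except ParseBaseException as err:
        return err
    return None


bad = "entity ( port clk )"   # "port" is not followed by "("
plus_error = first_error(with_plus, bad)
minus_error = first_error(with_minus, bad)
print(type(plus_error).__name__, "at column", plus_error.col)
print(type(minus_error).__name__, "at column", minus_error.col)
```

Both grammars accept well-formed input such as "entity ( port ( clk ) )"; they differ only in the kind of exception, and the position it reports, when the text after "port" is malformed.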
From: Gre7g L. <haf...@ya...> - 2008-05-06 16:22:12

--- Kjell Magne Fauske <kj...@gm...> wrote:

> I stumbled upon the same problem some time ago.
> According to Paul,
> there is a bug in QuotedString. You can find a
> discussion of the
> problem, and a workaround here:
> http://pyparsing.wikispaces.com/message/view/home/3778969
>
> Here is what I ended up with:
<snipped>

Ah, excellent! Many thanks.

I ended up using the more verbose parser that I gave the link to in my second post. Both that and your very-cool-but-funky-regex worked, but the larger parser version made my Python code more easy-to-read and that always wins. :)

Thanks again, this community rocks.

Gre7g

From: Kjell M. F. <kj...@gm...> - 2008-05-06 06:03:23

On Tue, May 6, 2008 at 12:42 AM, Gre7g Luterman <haf...@ya...> wrote:
> Hey guys! My parser has been working great, but I've
> run into a small snag in parsing C-style escapes. For
> example:
>
> >>> import pyparsing as PP
> >>> P = PP.QuotedString(quoteChar='"', escChar="\\")
> >>> P.parseString('"This is a quote: \\" This is a CR: \\n"')
> (['This is a quote: " This is a CR: n'], {})
>
> As you can see, specifying escChar="\\" worked well at
> first, as the parser recognized that \" is a character
> in the string and not an end of quote. HOWEVER, when
> the \n didn't match the \" pattern it was looking for,
> instead of leaving it alone, it completely dropped the
> \.
>
> I need it to leave that \ intact so that I can find
> other string constants such as \n, \7, \xFF, etc. I
> do this by re-parsing the strings with my code.
> Perhaps that is not best? What should I do?
>

I stumbled upon the same problem some time ago. According to Paul, there is a bug in QuotedString. You can find a discussion of the problem, and a workaround here:
http://pyparsing.wikispaces.com/message/view/home/3778969

Here is what I ended up with:

-----
from pyparsing import Regex
import re

s = '"This is a quote: \\" This is a CR: \\n"'
qstring2 = Regex(r'\"(?:\\\"|\\\\|[^"])*\"', re.MULTILINE)
print(qstring2.parseString(s))

Output:
['"This is a quote: \\" This is a CR: \\n"']
----

Hope this helps!

- Kjell Magne Fauske

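The regular expression in this reply can be tried on its own with the standard-library `re` module, together with the second decoding pass that the original question mentions. In the sketch below the pattern is the one from the message; `SIMPLE_ESCAPES` and `decode_escapes` are hypothetical names, and the escape table is intentionally incomplete (unknown escapes such as \7 or \xFF are left untouched for later handling).

```python
import re

# Same pattern as in the message: a quote, then escaped quotes, escaped
# backslashes or any non-quote character, then the closing quote.
QUOTED = re.compile(r'\"(?:\\\"|\\\\|[^"])*\"')

# Hypothetical, deliberately small table of escapes to decode.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def decode_escapes(body):
    """Replace known backslash escapes; leave unknown ones as written."""
    def replace(match):
        return SIMPLE_ESCAPES.get(match.group(1), match.group(0))
    return re.sub(r"\\(.)", replace, body)


s = '"This is a quote: \\" This is a CR: \\n"'
raw = QUOTED.match(s).group(0)      # still contains every backslash
decoded = decode_escapes(raw[1:-1])  # strip the quotes, then decode
print(repr(raw))
print(repr(decoded))
```

Splitting the work this way keeps the matching step lossless, so the decoding rules can be changed without touching the grammar.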
From: Gre7g L. <haf...@ya...> - 2008-05-05 22:42:22

Hey guys! My parser has been working great, but I've run into a small snag in parsing C-style escapes. For example:

    >>> import pyparsing as PP
    >>> P = PP.QuotedString(quoteChar='"', escChar="\\")
    >>> P.parseString('"This is a quote: \\" This is a CR: \\n"')
    (['This is a quote: " This is a CR: n'], {})

As you can see, specifying escChar="\\" worked well at first, as the parser recognized that \" is a character in the string and not an end of quote. HOWEVER, when the \n didn't match the \" pattern it was looking for, instead of leaving it alone, it completely dropped the \.

I need it to leave that \ intact so that I can find other string constants such as \n, \7, \xFF, etc. I do this by re-parsing the strings with my code. Perhaps that is not best? What should I do?

TIA,
Gre7g

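One possible way to keep the backslashes within pyparsing itself, assuming the installed version of QuotedString accepts the `unquoteResults` argument, is to ask for the matched text unchanged and do the escape handling in a later pass. This is only a sketch of that idea, not the workaround discussed on the wiki.

```python
import pyparsing as PP

# unquoteResults=False returns the matched text as-is: the surrounding
# quotes and every backslash are kept for later re-parsing.
P = PP.QuotedString(quoteChar='"', escChar="\\", unquoteResults=False)

sample = '"This is a quote: \\" This is a CR: \\n"'
tokens = P.parseString(sample)
print(repr(tokens[0]))
```

The escaped quote is still recognised as part of the string, so the match runs to the real closing quote.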
From: Stefaan H. <ste...@gm...> - 2008-05-02 07:55:40

Hello Paul,

> Yes, it is a bug in nestedExpr. You can get
> a patched version of pyparsing from the wiki home page

I got it, and it seems to work! Thanks a lot for your kind help.

Best regards,
Stefaan.

> Write back if this does not fix the problem.
>

Oops... it fixed the problem, and I still wrote back ;)

From: Paul M. <pt...@au...> - 2008-05-02 07:24:53

Stefaan -

Ouch, good catch on this one. Yes, it is a bug in nestedExpr. You can get a patched version of pyparsing from the wiki home page (*still* locked out of my SF SVN account!), which contains a fix to this bug.

Write back if this does not fix the problem. Thanks for the note, Stefaan.

-- Paul

From: stefaan.himpe <ste...@gm...> - 2008-05-01 16:40:38

> setResultsName creates a copy of the object. The original is left
> unchanged.

Hello Eike,

Thanks a lot! Now I completely understand.

Best regards,
Stefaan.

From: Stefaan H. <ste...@gm...> - 2008-05-01 16:27:22

Hello Ralph,

> I would guess the intention is you can do
>
>     address = ...
>     command = address.setResultsName('from') + \
>               address.setResultsName('to')
>
> So your `version 1' is having little effect.

Thank you! I hadn't looked at it this way. What to me seemed like totally inconsistent behaviour makes kind of sense now. As I understand it now, giving a resultsname only has an effect when using a parser as opposed to when defining one.

Thanks!
Stefaan.

From: Eike W. <eik...@gm...> - 2008-05-01 16:18:29

On Thursday 01 May 2008 17:54, stefaan.himpe wrote:
> Hello,
>
> I have difficulties in understanding how to use setResultsName
> correctly. The problem is illustrated in the code below:
> In the first version, r.BODY is empty. In the second version, it
> contains what I expected it to contain. But I am not sure what the
> difference is between these two versions?

setResultsName creates a copy of the object. The original is left unchanged.

>
> Thanks for any insights.
>
> Best regards,
> Stefaan.
>
> ---
>
> import pyparsing as p
>
> s = """
> SECTION
> {
> // {
> if (1) { dosomething; } else { if (0) then { a; } else { b } }
> printf ( "{" ); /* {
> {
> {
> */
> { nog; iets; }
> }
> """
>
> # VERSION 1
> section_body = p.Combine(p.nestedExpr("{","}",
> ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment))
> section_body.setParseAction(p.keepOriginalText)

The following statement has no effect. The parser with the name attached is forgotten.

> section_body.setResultsName("BODY")
> grammar = p.CaselessKeyword("SECTION").suppress() + section_body
> r = grammar.parseString(s)
> print r.BODY
>
>
> # VERSION 2
> section_body = p.Combine(p.nestedExpr("{","}",
> ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment))
> section_body.setParseAction(p.keepOriginalText)
> grammar = p.CaselessKeyword("SECTION").suppress() + \
> section_body.setResultsName("BODY")
> r = grammar.parseString(s)
> print r.BODY
>

Kind regards,
Eike.

From: Ralph C. <ra...@in...> - 2008-05-01 16:09:53

Hi Stefaan,

> I have difficulties in understanding how to use setResultsName
> correctly. The problem is illustrated in the code below: In the first
> version, r.BODY is empty. In the second version, it contains what I
> expected it to contain. But I am not sure what the difference is
> between these two versions?
>
> # VERSION 1
> section_body = p.Combine(p.nestedExpr("{","}",
> ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment))
> section_body.setParseAction(p.keepOriginalText)
> section_body.setResultsName("BODY")
> grammar = p.CaselessKeyword("SECTION").suppress() + section_body
> r = grammar.parseString(s)
> print r.BODY
>
> # VERSION 2
> section_body = p.Combine(p.nestedExpr("{","}",
> ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment))
> section_body.setParseAction(p.keepOriginalText)
> grammar = p.CaselessKeyword("SECTION").suppress() + \
> section_body.setResultsName("BODY")
> r = grammar.parseString(s)
> print r.BODY

I would guess the intention is you can do

    address = ...
    command = address.setResultsName('from') + \
              address.setResultsName('to')

So your `version 1' is having little effect.

Cheers,
Ralph.

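The copy behaviour of setResultsName, and the reuse pattern suggested in this reply, can be checked with a short sketch. The expression names and the "src"/"dst" result names below are illustrative only.

```python
from pyparsing import Word, alphas, nums

word = Word(alphas)

# The named copy is created and immediately thrown away; `word` is unchanged.
word.setResultsName("first")
lost = (word + Word(nums)).parseString("abc 123")

# Keeping the returned copy (or using it inline) is what attaches the name.
named = word.setResultsName("first")
kept = (named + Word(nums)).parseString("abc 123")

# Because each call returns a copy, one expression can appear under two names.
pair = (word.setResultsName("src") + word.setResultsName("dst")).parseString("abc def")

print("first" in lost, "first" in kept)
print(pair["src"], pair["dst"])
```

This is why only the second version in the question produced a BODY entry: the name has to be on the expression that actually takes part in the grammar.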
From: stefaan.himpe <ste...@gm...> - 2008-05-01 15:55:27

Hello,

I have difficulties in understanding how to use setResultsName correctly. The problem is illustrated in the code below: In the first version, r.BODY is empty. In the second version, it contains what I expected it to contain. But I am not sure what the difference is between these two versions?

Thanks for any insights.

Best regards,
Stefaan.

---

import pyparsing as p

s = """
SECTION
{
// {
if (1) { dosomething; } else { if (0) then { a; } else { b } }
printf ( "{" ); /* {
{
{
*/
{ nog; iets; }
}
"""

# VERSION 1
section_body = p.Combine(p.nestedExpr("{","}",
    ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment))
section_body.setParseAction(p.keepOriginalText)
section_body.setResultsName("BODY")
grammar = p.CaselessKeyword("SECTION").suppress() + section_body
r = grammar.parseString(s)
print r.BODY

# VERSION 2
section_body = p.Combine(p.nestedExpr("{","}",
    ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment))
section_body.setParseAction(p.keepOriginalText)
grammar = p.CaselessKeyword("SECTION").suppress() + \
    section_body.setResultsName("BODY")
r = grammar.parseString(s)
print r.BODY

From: Stefaan H. <ste...@gm...> - 2008-05-01 14:53:34

Hello Paul,

Thank you for the prompt and helpful response! And thank you for conceiving and implementing such a wonderful parsing package! Writing parsers never was so addictive...

Best regards,
Stefaan.

From: Paul M. <pt...@au...> - 2008-05-01 14:05:59

To preserve the original text that matches an expression in your parser, try attaching the parse action keepOriginalText.

    nC = p.Combine(nE).setParseAction(p.keepOriginalText)

-- Paul

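For readers on a more recent pyparsing, a comparable effect can be sketched with the `originalTextFor` helper, which as far as I know took over the role of the keepOriginalText parse action in later releases. The sample text and the "BODY" results name below are illustrative, and cppStyleComment is assumed to cover both // and /* */ comments.

```python
from pyparsing import (CaselessKeyword, Suppress, cppStyleComment, nestedExpr,
                       originalTextFor, quotedString)

source = '''SECTION
{
  if (1) { a; } else { b; }
  printf("{");  /* { */
}
'''

# Braces inside quoted strings and comments must not count as nesting.
braces = nestedExpr("{", "}", ignoreExpr=quotedString | cppStyleComment)

# originalTextFor returns the matched slice of the input, whitespace included.
section = Suppress(CaselessKeyword("SECTION")) + originalTextFor(braces)("BODY")

result = section.parseString(source)
print(result["BODY"])
```

The extracted text runs from the opening brace to its matching closing brace, with the original line breaks and spacing intact.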
From: stefaan.himpe <ste...@gm...> - 2008-05-01 11:23:10
|
Hello,

I am trying to parse the following:

SECTION { ... }

where the ... stands for C++ code. I would like to extract the raw C++ code from this section { }. The problem, of course, is that the C++ code can also contain { and }. I have tried using nestedExpr, but I cannot seem to get raw C++ code with its original whitespace (or even with whitespace that keeps the code compilable).

The code below illustrates my latest attempt (I left out the "SECTION" as it is not relevant for the question):

import pyparsing as p

s = """
{
 // {
 if (1) { dosomething; }
 else
 {
  if (0) then { a; }
  else { b }
 }
 printf ( "{" );
 /* {
 {
 {
 */
 { nog; iets; }
}
"""
nE = p.nestedExpr("{","}",ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment)
nC = p.Combine(nE)
r = nC.parseString(s)

The "Combine" more or less returns the contents between the outer { and }, but everything is glued together without whitespace, and some { and } have disappeared:

['// {if(1)dosomething;elseif(0)thena;elsebprintf("{");/* { \n {\n {\n */nog;iets;']

Am I going about this the right way? Would it be possible to add something like a SkipToMatching function instead? I imagine I could use it as follows:

import pyparsing as p
grammar = p.Literal("{").suppress() + \
    p.SkipToMatching("{","}",ignoreExpr=p.quotedString|p.cppStyleComment|p.cStyleComment).setResultsName("RawCode")

Best regards,
Stefaan. |
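As a point of comparison for the SkipToMatching idea, here is a minimal plain-Python sketch (no pyparsing involved) that scans for the matching closing brace and returns the raw text in between, whitespace included. The function name extract_braced is hypothetical. The sketch assumes that the first "{" at or after start is the opening brace and that only // comments, /* */ comments and simple quoted literals need to be skipped. It is not how nestedExpr or any other pyparsing feature is implemented.

```python
def extract_braced(text, start=0):
    """Return the raw text between the first '{' at or after `start`
    and its matching '}', leaving whitespace untouched.

    Hypothetical helper for illustration only; braces inside // comments,
    /* */ comments and quoted literals are not counted.
    """
    open_pos = text.index("{", start)
    depth = 0
    i = open_pos
    n = len(text)
    while i < n:
        ch = text[i]
        if text.startswith("//", i):
            # Line comment: jump to the end of the line.
            end = text.find("\n", i)
            i = n if end == -1 else end
            continue
        if text.startswith("/*", i):
            # Block comment: jump past the closing */.
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
            continue
        if ch in "\"'":
            # Quoted literal: jump past the closing quote,
            # stepping over backslash escapes.
            j = i + 1
            while j < n and text[j] != ch:
                j += 2 if text[j] == "\\" else 1
            if j >= n:
                raise ValueError("unterminated quoted literal")
            i = j + 1
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Slice the original string, so nothing is re-joined.
                return text[open_pos + 1:i]
        i += 1
    raise ValueError("no matching '}' found")


sample = """
{
 // {
 if (1) { dosomething; }
 printf ( "{" );
 /* {
 {
 */
 { nog; iets; }
}
"""
raw = extract_braced(sample)
print(raw)
```

The point of the sketch is the last step: the result is a slice of the original string between two recorded positions, so whitespace and inner braces survive, whereas re-joining parsed tokens (as Combine does) loses them.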
From: stefaan.himpe <ste...@gm...> - 2008-05-01 11:05:00
|
Hello list,

The following works:

import pyparsing as p
t = """ { ( "{") } """
nE = p.nestedExpr("{","}",ignoreExpr=p.quotedString)
r = nE.parseString(t)

But after removing one space between the ( and the ", it crashes:

import pyparsing as p
t = """ { ("{") } """
nE = p.nestedExpr("{","}",ignoreExpr=p.quotedString)
r = nE.parseString(t)

This looks like a bug to me?

Best regards,
Stefaan. |
From: Paul M. <pt...@au...> - 2008-04-22 23:42:42
|
Yes, I was too quick to read that part of the RFC. I was working just off the content of json.org, which is a bit squishier on the format than the RFC is. If what we will get is JSON-text, then let's just call it jsonText, and define it as:

jsonText = jsonObject | jsonArray
jsonText.ignore(jsonComment)
print jsonText.parseString(testdata)

And we should be confident now that jsonText matches the JSON-text from the RFC.

-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of thomas_h
Sent: Tuesday, April 22, 2008 4:47 PM
To: Paul McGuire
Cc: pyp...@li...
Subject: Re: [Pyparsing] jsonParser.py patch

Paul,

> Hrm, well, I can see that this works, but I'm not sold. A jsonObject really
> *isn't* a choice between a jsonDict and a jsonArray. It really is a '{'
> members '}', as per the BNF. I think the real problem is elsewhere.

Yes, I wasn't very happy with my naming but I'm not very good at naming anyway and I thought it would do. 'jsonObject' should probably be reserved for what I named 'jsonDict', and what I named 'jsonObject' should be called something else.

> The RFC link you gave was very helpful. It states that the content of the
> JSON string can be any JSON value, which you tripped over because you
> encountered a JSON array when we were parsing for a JSON object.

I read this too, but wasn't sure I could trust my understanding. Especially after I ran my current "reference implementation" (demjson) on a file containing just "this is a value", which was dismissed as illegal. The RFC defines (if I get it right) 'JSON-text' as the start symbol and says "A JSON text is a serialized object or array. JSON-text = object / array". That might speak in favor of my approach. I think the BNF (I presume you are referring to the sidebar of json.org) supports your understanding as well since there is no dedicated start symbol. (Actually, I can't find that text again where it says JSON is just a JSON value ...).

> The solution is not to change the definition of jsonObject. The solution is
> that we should have been invoking this:
>
> jsonValue.parseString(testdata)

Fair enough. It might be overgenerating, though.

> jsonValue already encompasses checking for all the different JSON value
> types - not just object and array, but also number, true, false, etc. Try
> reverting to the original jsonParser.py, but change this statement instead
> and see if you get successful parsing.

I'm sure this will work, I'll give it a try.

Keep up,
Thomas |
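To make the "JSON-text = object / array" rule from RFC 4627 concrete, here is a small sketch using Python's standard json module rather than the pyparsing jsonParser.py example discussed in this thread. The function name parse_json_text is hypothetical. The sketch relies on the fact that json.loads on its own accepts any JSON value at the top level, which corresponds to the "overgenerating" concern about parsing with jsonValue.

```python
import json


def parse_json_text(text):
    """Parse `text` and accept it only if the top-level value is an
    object or an array (the JSON-text rule quoted from RFC 4627).

    Hypothetical helper for illustration; uses the stdlib json module,
    not the pyparsing grammar from jsonParser.py.
    """
    value = json.loads(text)  # accepts any JSON value, including scalars
    if not isinstance(value, (dict, list)):
        raise ValueError("top-level JSON value must be an object or an array")
    return value


print(parse_json_text('[1, 2, {"a": null}]'))  # top-level array is accepted
print(parse_json_text('{"items": [true, false]}'))  # top-level object is accepted

try:
    parse_json_text('"this is a value"')  # a bare string is a value, not a JSON-text
except ValueError as exc:
    print("rejected:", exc)
```

The two-step shape (parse any value, then restrict the top level) mirrors the choice discussed above between invoking the parser on jsonValue and defining a separate jsonText start symbol.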
From: thomas_h <th...@go...> - 2008-04-22 21:47:09
|
Paul,

> Hrm, well, I can see that this works, but I'm not sold. A jsonObject really
> *isn't* a choice between a jsonDict and a jsonArray. It really is a '{'
> members '}', as per the BNF. I think the real problem is elsewhere.

Yes, I wasn't very happy with my naming but I'm not very good at naming anyway and I thought it would do. 'jsonObject' should probably be reserved for what I named 'jsonDict', and what I named 'jsonObject' should be called something else.

> The RFC link you gave was very helpful. It states that the content of the
> JSON string can be any JSON value, which you tripped over because you
> encountered a JSON array when we were parsing for a JSON object.

I read this too, but wasn't sure I could trust my understanding. Especially after I ran my current "reference implementation" (demjson) on a file containing just "this is a value", which was dismissed as illegal. The RFC defines (if I get it right) 'JSON-text' as the start symbol and says "A JSON text is a serialized object or array. JSON-text = object / array". That might speak in favor of my approach. I think the BNF (I presume you are referring to the sidebar of json.org) supports your understanding as well since there is no dedicated start symbol. (Actually, I can't find that text again where it says JSON is just a JSON value ...).

> The solution is not to change the definition of jsonObject. The solution is
> that we should have been invoking this:
>
> jsonValue.parseString(testdata)

Fair enough. It might be overgenerating, though.

> jsonValue already encompasses checking for all the different JSON value
> types - not just object and array, but also number, true, false, etc. Try
> reverting to the original jsonParser.py, but change this statement instead
> and see if you get successful parsing.

I'm sure this will work, I'll give it a try.

Keep up,
Thomas |
From: Paul M. <pt...@au...> - 2008-04-22 21:13:57
|
Thomas -

Hrm, well, I can see that this works, but I'm not sold. A jsonObject really *isn't* a choice between a jsonDict and a jsonArray. It really is a '{' members '}', as per the BNF. I think the real problem is elsewhere.

The RFC link you gave was very helpful. It states that the content of the JSON string can be any JSON value, which you tripped over because you encountered a JSON array when we were parsing for a JSON object.

The solution is not to change the definition of jsonObject. The solution is that we should have been invoking this:

jsonValue.parseString(testdata)

not this:

jsonObject.parseString(testdata)

which was my mistake in the original code. Not thinking, I assumed that JSON object was the all-encompassing type, but JSON value is what we are really being given.

jsonValue already encompasses checking for all the different JSON value types - not just object and array, but also number, true, false, etc. Try reverting to the original jsonParser.py, but change this statement instead and see if you get successful parsing. (You may find that the results are nested an extra level deep; if so, you'll have to pick out element [0] of the data returned from parseString.)

I'll fix this in the next release (coming soon, I hope)!

Cheers,
-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of thomas_h
Sent: Tuesday, April 22, 2008 2:43 PM
To: pyp...@li...
Subject: Re: [Pyparsing] jsonParser.py patch

Done. It's at http://pyparsing.pastebin.com/m2c33d08d, if that is of help. Thanks for the hints.

Thomas

On Tue, Apr 22, 2008 at 6:40 PM, Paul McGuire <pt...@au...> wrote:
> Your attachment was stripped somewhere along the way. Could you
> please paste it to a http://pyparsing.pastebin.com/?
>
> Thanks,
> -- Paul
>
> -----Original Message-----
> From: pyp...@li...
> [mailto:pyp...@li...] On Behalf Of thomas_h
> Sent: Tuesday, April 22, 2008 10:35 AM
> To: pyp...@li...
> Subject: [Pyparsing] jsonParser.py patch
>
> Hi all,
>
> I've changed the jsonParser.py example from the distro to accommodate
> top-level arrays (as of RFC4627, http://tools.ietf.org/html/rfc4627).
> Please find the patch attached.
>
> Cheers,
> Thomas |