pyparsing-users Mailing List for Python parsing module (Page 29)
Brought to you by:
ptmcg
From: Paul M. <pa...@al...> - 2006-08-16 14:00:15
|
Chris -

Thanks for your extensive and thoughtful post. Let me try to respond to some of your comments:

1. I'm pretty sure pyparsing *does* qualify for "recursive descent"-ness. I know you are not just referring to the recursive nature of some grammars, as distinguished by the presence of Forward() expressions containing references to themselves, but to the actual implementation of pyparsing. I received an e-mail shortly after my article got posted at O'Reilly ONLamp, suggesting that pyparsing was not "recursive descent" because there were no examples of recursive grammars. Here is an excerpt from my reply (in which we also discussed the list-parsing example), marked with '>>>>>'s - especially note the paragraph beginning with "But even if":

>>>>>
Pyparsing does support recursive grammars. At least one is included in the examples directory that comes with pyparsing. Here is a simple grammar from my upcoming presentation at PyCon. It seems that about once a month, someone on comp.lang.python asks how to convert from a string representation of a list back to the original list, without using eval. Here is a non-recursive version, which parses lists that are composed only of simple quoted strings, integers, and floats:

--------------
<snip - description of development of recursive list-parsing grammar, culminating in...>

    listStr = Forward()
    listItem = real | integer | quotedString.setParseAction(removeQuotes) | Group(listStr)
    listStr << ( lbrack + Optional(delimitedList(listItem)) + rbrack )

The << operator indicates that we are not binding a new expression to listStr, but "injecting" an expression into the placeholder previously defined as a Forward. With this minor change, we can now parse nested lists, and get back a nested result:

    test = "['a', 100, [], 3.14, [ +2.718, 'xyzzy', -1.414] ]"
    print listStr.parseString(test)

--------------
prints:

    ['a', 100, [], 3.1400000000000001, [2.718, 'xyzzy', -1.4139999999999999]]

I think I need to submit another article to ONLamp, with some more advanced grammar examples.

But even if the example grammars in the article are not recursive, does this necessarily mean pyparsing's parsers are not "recursive descent"? The grammars themselves are composed of ParserElement objects organized into a hierarchy representing the grammar expressions and their respective composition from other, more basic grammar expressions. In this case, the top level expression, listStr, is made up of 3 expressions, which must all match in their turn. The first and third expressions are just literals, and they call their respective parse methods to see if there is a left and right bracket at the start and end of the string. But the second expression is a composite, made up of an Optional element, containing a delimited list of listItems. delimitedList(expr) is a helper function that returns expr + ZeroOrMore(delim + expr), where the default delim is a literal comma. So from the top level listStr, we call the Optional expression's parse method, which calls the parse method of the And object created by delimitedList - ___recursively descending___ through the grammar until a basic token (such as a Literal or Word) is found, or no match is found and an exception raised.

It is because we have used recursion to traverse the grammar structure that we can use exceptions and exception handlers to take care of backtracking through the grammar, with no separate maintenance of stacks or backtrack pointers. On the other hand, this is the feature that makes pyparsing vulnerable to left recursion, which I have only partially succeeded in addressing with the validate() method.
>>>>>

It is also this use of the stack for backtracking that makes real parse error detection difficult.

2. From your message:

> At each stage we know exactly what tokens we need in order to continue
> parsing, otherwise we have an error. I think that it should
> be possible to give pyparsing this behaviour, but my attempts at
> analysing the source code have left me a bit lost at this point.

Well, in one sense, pyparsing already has this behavior - each parse expression contains its own idea of "what should come next", and they are in turn composed into more complex expressions. Do you mean that pyparsing should explode its grammar down to the raw character level, to become more of a character-based state machine rather than the "recursive traverser of parsing expression objects" that it currently is? Certainly, we could take the basic types such as Literal("abc") and convert to a state machine that at some point contains logic such as:

    ch = readnext()
    if ch != 'a': return (errorFlag,currentLocation)
    ch = readnext()
    if ch != 'b': return (errorFlag,currentLocation)
    ch = readnext()
    if ch != 'c': return (errorFlag,currentLocation)
    return 'abc'

Or an expression to detect a proper noun - Word(alphas.upper(), alphas.lower()) - could be converted to:

    ch = readnext()
    if ch not in alphas.upper(): return (errorFlag, currentLocation)
    wd = ch
    while 1:
        ch = readnext()
        if ch not in alphas.lower():
            unreadlastchar()
            break
        wd += ch
    return wd

This is not too different from what pyparsing already does. So then let's look at merging these into a more complex expression:

    phrase = Literal("abc") | Word(alphas.upper(),alphas.lower())

Disjunctions are not really difficult for state machines, but the Python code gets messy. To keep things clean, let's pretend Python supports 'goto', and we'll limit ourselves to only downward jumps:

    ch = readnext()
    if ch != 'a': goto <<readProperNoun>>
    ch = readnext()
    if ch != 'b': return (errorFlag,currentLocation)
    ch = readnext()
    if ch != 'c': return (errorFlag,currentLocation)
    return 'abc'
    <<readProperNoun>>
    if ch not in alphas.upper(): return (errorFlag, currentLocation)
    wd = ch
    while 1:
        ch = readnext()
        if ch not in alphas.lower():
            unreadlastchar()
            break
        wd += ch
    return wd

Still not terrible. What gets messy is when we start to combine disjunction with repetition. Let's just expand phrase slightly now:

    phrase = OneOrMore( Literal("abc") | Word(alphas.upper(),alphas.lower()) )

and let's parse the text "Honest abe". We fail on the second word, but what should be the reason? Because "abe" wasn't capitalized, or because it didn't end with a 'c'? In fact, our OneOrMore is satisfied - we *did* find one matching word, "Honest" - so phrase succeeds, and we continue matching at the 'a' in "abe". If the overall grammar contains:

    phrase + Literal("abe")

we're doing okay. But if the grammar is:

    phrase + stringEnd()

then we get an "expected end of string" error. So now we add a third candidate to our list of possible failures at the 'a' in "abe". Here's a fourth: what if the input text was really supposed to read "Honest Gabe"?

It's possible that more knowledge of the parsing process could be gained by rearchitecting pyparsing to use richer return values and codes (such as the tuple you propose), rather than the current ParseResults for successful parsing and exceptions for failed parsing. Then a higher-level routine could look for the furthest successful parse before failing, and assume that that was where the actual error occurred. I've taken a stab at such logic, but it has fallen short due to the limited scope for the individual parse expression objects.

Here's another idea on how to tackle this. Rewrite pyparsing to pass around a text source rather than just the original string. For one thing, this would allow the text source to be a stream instead of a string, which is something I've wanted to add to pyparsing for a while. The text source could also be an object that kept track of "furthest parse location", and possibly also "exception raised at furthest parse location". When parseString ultimately fails, consult the text source for the furthest progress, and report that as the error location.

Anyway, I need to get to work, so I've got to sign off just now. If this gives you more ideas, keep 'em coming.

-- Paul |
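The "text source that remembers the furthest parse location" idea in the message above can be tried out without touching pyparsing. What follows is only a rough sketch under stated assumptions, not pyparsing code: `Source`, `ParseFail`, `literal`, `capitalized_word`, `first_of`, `one_or_more`, `end_of_text` and `parse_all` are all names invented for this illustration. It keeps exceptions as the backtracking signal, but every failure is first reported to the shared source object, so the final error can name the furthest position reached and everything that was expected there.

```python
class ParseFail(Exception):
    """Raised by a matcher to signal 'no match here'; drives backtracking."""


class Source:
    """Hypothetical text source that remembers the furthest failure seen."""

    def __init__(self, text):
        self.text = text
        self.furthest = -1
        self.expected = []

    def fail(self, loc, what):
        # Keep only the failures recorded at the furthest location so far.
        if loc > self.furthest:
            self.furthest, self.expected = loc, [what]
        elif loc == self.furthest and what not in self.expected:
            self.expected.append(what)
        raise ParseFail(what)


def skip_ws(src, loc):
    while loc < len(src.text) and src.text[loc].isspace():
        loc += 1
    return loc


def literal(s):
    def parse(src, loc):
        loc = skip_ws(src, loc)
        if not src.text.startswith(s, loc):
            src.fail(loc, repr(s))
        return loc + len(s), s
    return parse


def capitalized_word():
    def parse(src, loc):
        loc = skip_ws(src, loc)
        stop = loc
        if stop < len(src.text) and src.text[stop].isupper():
            stop += 1
            while stop < len(src.text) and src.text[stop].islower():
                stop += 1
            return stop, src.text[loc:stop]
        src.fail(loc, "capitalized word")
    return parse


def first_of(*parsers):
    def parse(src, loc):
        for p in parsers:
            try:
                return p(src, loc)
            except ParseFail:
                pass  # backtrack: try the next alternative at the same loc
        raise ParseFail("no alternative matched")
    return parse


def one_or_more(p):
    def parse(src, loc):
        loc, first = p(src, loc)
        items = [first]
        while True:
            try:
                loc, item = p(src, loc)
            except ParseFail:
                return loc, items  # repetition ends quietly, as in the thread
            items.append(item)
    return parse


def end_of_text():
    def parse(src, loc):
        loc = skip_ws(src, loc)
        if loc != len(src.text):
            src.fail(loc, "end of text")
        return loc, None
    return parse


def parse_all(text):
    src = Source(text)
    phrase = one_or_more(first_of(literal("abc"), capitalized_word()))
    try:
        loc, words = phrase(src, 0)
        end_of_text()(src, loc)
        return words
    except ParseFail:
        raise ValueError("parse error at char %d: expected one of %s"
                         % (src.furthest, ", ".join(src.expected)))
```

With this sketch, `parse_all("Honest Abe")` returns both words, while `parse_all("Honest abe")` raises an error pointing at character 7 and listing all three candidates discussed above ('abc', a capitalized word, end of text). It does not decide which of them the user meant; it only stops the error from being reported at a misleading earlier position.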
From: Chris L. <ch...@ka...> - 2006-08-15 22:07:35
|
Hi,

I am not sure how I would classify a pyparsing parser, but I am not sure that I would call it a recursive descent parser. Let's define a recursive descent parser for parsing nested lists of the form:

    "[1,2,3,4,[1,2,3]]"

Traditionally there is a separation between the lexer (tokenizer) and the parser. For our purposes let's assume that we have a lex object available that has a next_token method that returns the next token and a pop method that returns and consumes the next token. Each returns a (type, value, line) tuple.

    class ListParser(object):
        def __init__(self, lexer):
            self.lex = lexer

        def match(self, token_type):
            tokt, tokv, tokl = self.lex.pop()
            if tokt != token_type:
                raise ParseError("Parse error at line %d: Expected a %s, got a %s(%s)"
                                 % (tokl, token_type, tokt, str(tokv)))

        def parse(self):
            """ this will start the parsing process"""
            return self.list()

        def list(self):
            """ list ::= '[' <list_item> {',' <list_item>} ']' """
            self.match('lbrace')
            val = []
            while True:
                li = self.list_item()
                val.append(li)
                tokt, tokv, tokl = self.lex.next_token()
                if tokt == 'rbrace':
                    break
                self.match('comma')
            self.match('rbrace')
            return val

        def list_item(self):
            """ list_item ::= int | string | float | list """
            tokt, tokv, tokl = self.lex.next_token()
            if tokt in ('int', 'float', 'string'):
                self.lex.pop()
                return tokv
            if tokt == 'lbrace':
                return self.list()
            raise ParseError("Parse error at line %d: Unexpected %s(%s)."
                             % (tokl, tokt, tokv))

At each stage we know exactly what tokens we need in order to continue parsing, otherwise we have an error. I think that it should be possible to give pyparsing this behaviour, but my attempts at analysing the source code have left me a bit lost at this point.

Once the ParserElement tree has been generated by the user it should be possible to precompute the possible recursions that lead to a particular tuple. For instance, for the above we might write the following with pyparsing:

    integer = Regex(r'[+-]?(0|[1-9][0-9]*)')
    float = Regex(r'[+-]?[0-9]+\.[0-9]+([eE][+-]?[0-9]+)?')
    string = QuotedString('"', '\\', multiline=True)
    lbrace = Literal('[')
    rbrace = Literal(']')
    mylist = Forward()
    mylist << lbrace + delimitedList(integer | float | string | mylist ) + rbrace
    mysyntax = mylist.compile()

There are 3 main non-terminal types represented above. First we have the anonymous Or, second, delimitedList which is really a ZeroOrMore, third we have the mylist which is an And object. Let's look at what happens for compile in each of these instances, starting with the anonymous Or.

In compile it would iterate through each of its sub-elements asking for terminals and create a dictionary from a particular terminal to the next ParserElement in the chain:

    self.path_cache = dict((x.get_terminal(),x) for x in self.exprs)

In a parse, instead of trying a parse on each element, try a parse on each token:

    for terminal in self.path_cache:
        if terminal.matches():
            self.path_cache[terminal].parse()
            break # we take the first

mylist, which is the And object: compile proceeds as above, except you might want to make the dictionary key on the non-terminal and point to the terminal type for that object. The other notable difference in this case is that delimitedList gives a bunch of terminals that could match (float, string, integer). Parsing proceeds as normal, but we know whether we fail or not based on our cached terminal token matcher.

In the ZeroOrMore case, we know whether we should continue parsing based on our cached token matcher.

Ok, so this is a long post and I suspect it gets less coherent towards the end. If you are open for discussion on making this technique work let me know, I am happy to help and would give it a go myself if I fully understood what was happening in the parse functions as they are.
-Chris |
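To make the predictive approach from the message above concrete, here is a self-contained variant that can actually be run. It is a sketch under assumptions rather than the code from the message: the `Lexer` class, the `TOKEN_RE` pattern, the extra "end" token and the support for empty lists are additions made for this illustration, and the token names simply follow the ones used above. The point it shows is that a parser with one token of lookahead reports the error at the token where parsing could not continue.

```python
import re

# Hypothetical token grammar for this sketch; "bad" catches anything else.
TOKEN_RE = re.compile(r"""
    (?P<float>[+-]?\d+\.\d+(?:[eE][+-]?\d+)?)
  | (?P<int>[+-]?\d+)
  | (?P<string>'[^']*'|"[^"]*")
  | (?P<lbrace>\[)
  | (?P<rbrace>\])
  | (?P<comma>,)
  | (?P<ws>\s+)
  | (?P<bad>.)
""", re.VERBOSE)


class ParseError(Exception):
    pass


class Lexer:
    """Produces (type, value, line) tuples with one token of lookahead."""

    def __init__(self, text):
        self.tokens = []
        line = 1
        for m in TOKEN_RE.finditer(text):
            kind, value = m.lastgroup, m.group()
            if kind == "ws":
                line += value.count("\n")
                continue
            if kind == "bad":
                raise ParseError("line %d: unexpected character %r" % (line, value))
            self.tokens.append((kind, value, line))
        self.tokens.append(("end", "", line))  # sentinel, never consumed
        self.pos = 0

    def next_token(self):
        return self.tokens[self.pos]

    def pop(self):
        tok = self.tokens[self.pos]
        if tok[0] != "end":
            self.pos += 1
        return tok


class ListParser:
    def __init__(self, lexer):
        self.lex = lexer

    def match(self, token_type):
        kind, value, line = self.lex.pop()
        if kind != token_type:
            raise ParseError("line %d: expected %s, got %s(%s)"
                             % (line, token_type, kind, value))
        return value

    def parse(self):
        result = self.list()
        self.match("end")
        return result

    def list(self):
        """list ::= '[' [ list_item {',' list_item} ] ']'"""
        self.match("lbrace")
        items = []
        if self.lex.next_token()[0] != "rbrace":
            items.append(self.list_item())
            while self.lex.next_token()[0] == "comma":
                self.match("comma")
                items.append(self.list_item())
        self.match("rbrace")
        return items

    def list_item(self):
        """list_item ::= int | float | string | list"""
        kind, value, line = self.lex.next_token()
        if kind == "int":
            self.lex.pop()
            return int(value)
        if kind == "float":
            self.lex.pop()
            return float(value)
        if kind == "string":
            self.lex.pop()
            return value[1:-1]  # drop the surrounding quotes
        if kind == "lbrace":
            return self.list()
        raise ParseError("line %d: unexpected %s(%s)" % (line, kind, value))
```

For example, `ListParser(Lexer("[1, 2.5, 'a', [3, []]]")).parse()` yields a nested Python list, while an input with a missing comma on its second line fails with a message that names line 2 and the token that was expected there.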
From: Paul M. <pa...@al...> - 2006-08-14 22:56:00
|
Chris - I have struggled with this behavior since version 0.1 of pyparsing. I have made a faint stab at addressing this using the ParseFatalException: when a parse action detects some semantic incorrectness in input it can raise ParseFatalException which will stop all parsing and report the exact location. But since pyparsing uses exceptions as signals to proceed to the next grammar expression, it's hard to know in advance which exceptions are really, well, exceptional. If anyone can dig up some research on how other recursive-descent parsers address this error-handling issue, I may be able to fold it into a future release. But I can't delve into this kind of research on my own at the moment (pyparsing doesn't pay like my day job does, and isn't likely to anytime soon). -- Paul |
From: Chris L. <ch...@ka...> - 2006-08-11 21:58:30
|
Hi,

I am using pyparsing to parse CIM mof files. It is a rather lengthy grammar with lots of instances of OneOrMore. The top level structure is like this:

    bnf = OneOrMore(rest_of_grammar) + StringEnd()

This raises an exception when there is a parse error, but the error is not flagged at the point where the error occurs. Since everyone here should be familiar with Python grammar, I will provide an example of what would happen if we had a pyparsing grammar for python and attempted to parse something with a syntax error:

    class blah(object):
        pass

    class error_says_here(object):
        def __init__(self):
            self.a = 1
            self.b = 2

        def error_really_here(self:
            pass

In this case the error would flag as having occurred at error_says_here rather than error_really_here.

Is there a way to work around this behavior? I have not yet come across an example of a grammar that looks like it does a really good job of error handling. Does the architecture of pyparsing make it easy to write a parser for data that you know is syntactically correct, but difficult or impossible to write a parser that can give useful error reports to the user in the face of syntax errors?

Traditional parsers would know all possible tokens that could be next, and if they encountered something that wasn't one of those tokens, they would flag an error at that location. pyparsing in contrast says: will this parse? no, ok, will this parse? no, ok, will this parse? The result is that it can throw away large portions of valid syntax to eventually end immediately after parsing the last full rest_of_grammar from our OneOrMore(rest_of_grammar) + StringEnd() idiom.

It may be possible for pyparsing to precompute what the possible next tokens are, which may have the side effect of making pyparsing more efficient since it won't have so much trial and error.

-Chris |
From: Paul M. <pa...@al...> - 2006-07-06 00:51:27
|
Nice work, Michel -- thanks for sharing! -- Paul |
From: Michel P. <mi...@di...> - 2006-07-05 23:33:14
|
Hey folks, just thought I'd show off my latest pyparsing language, xaql a "Xapian Query Language" for querying the Xapian search engine: http://www.xapian.org/ You can use it under a BSD license. pyparsing makes this stuff easy! Enjoy, -Michel |
From: Paul M. <pa...@al...> - 2006-06-26 15:06:28
|
> -----Original Message-----
> From: pyp...@li... [mailto:pyp...@li...] On Behalf Of Jean-Paul Calderone
> Sent: Sunday, June 25, 2006 8:04 PM
> To: pyp...@li...
> Subject: [Pyparsing] PyParsing and unicode
>
> Hey,
>
> I'm wondering how to match any sequence of whitespace-separated characters,
> including non-ascii. For ASCII, I've just been using
> pyparsing.Word(alphanums) but this approach doesn't seem to work for unicode.

Well, there *are* quite a few other printable characters besides just letters and numbers. Pyparsing defines the constant pyparsing.printables as all non-whitespace 7-bit ASCII characters, that is

    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

(Contrast this with string.printable, which includes whitespace characters... just different interpretations of what "printable" means, I guess. Note also that the string module defines printable, but the str class does not.)

> Also, while trying to figure this out, I tried this:
>
>     pyparsing.OneOrMore(pyparsing.NotAny(pyparsing.White())).parseString("hello")
>
> Running this goes into an infinite loop consuming all CPU resources. Not
> sure if this is a bug worth fixing in PyParsing but I thought I'd point it
> out.
>
> Jean-Paul

NotAny is merely a negative lookahead, the opposite of FollowedBy. It does *not* advance the parse position, so OneOrMore(NotAny(whatever)) will just loop forever. I think what you are looking for is the opposite of Word, the pyparsing class CharsNotIn. Here's your example, entered at the Python prompt:

    >>> print pyparsing.OneOrMore(pyparsing.CharsNotIn(" \t\n\r\f")).parseString("hello")
    ['hello']

I've not tested this with unicode characters though.

-- Paul |
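The reason a repetition around a pure lookahead never finishes can be shown with a few lines of plain Python. This is a toy model under assumptions, not how pyparsing is implemented: `not_any`, `chars_not_in` and `one_or_more` are hypothetical helpers written for this note, they do not skip whitespace between matches the way pyparsing does, and the zero-width guard that raises an error is a choice made here so the example terminates.

```python
def not_any(chars):
    """Negative lookahead: succeeds unless the next char is in chars.

    It never consumes input, so the returned location equals the input one.
    """
    def parse(text, loc):
        if loc < len(text) and text[loc] in chars:
            return None           # lookahead failed
        return loc, None          # success, but no progress was made
    return parse


def chars_not_in(chars):
    """Consumes a run of one or more characters that are not in chars."""
    def parse(text, loc):
        stop = loc
        while stop < len(text) and text[stop] not in chars:
            stop += 1
        if stop == loc:
            return None
        return stop, text[loc:stop]
    return parse


def one_or_more(parser):
    def parse(text, loc):
        results = []
        while True:
            outcome = parser(text, loc)
            if outcome is None:
                break
            new_loc, value = outcome
            if new_loc == loc:
                # Without this guard the loop would repeat the same
                # zero-width match at the same position forever.
                raise RuntimeError("zero-width match at %d: repetition "
                                   "would never advance" % loc)
            results.append(value)
            loc = new_loc
        return (loc, results) if results else None
    return parse


WHITESPACE = " \t\n\r\f"
```

Here `one_or_more(chars_not_in(WHITESPACE))("hello", 0)` consumes the whole word, whereas `one_or_more(not_any(WHITESPACE))("hello", 0)` trips the guard at position 0, which is the situation that spins forever when no such guard exists.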
From: Jean-Paul C. <ex...@di...> - 2006-06-26 01:03:40
|
Hey, I'm wondering how to match any sequence of whitespace-separated characters, including non-ascii. For ASCII, I've just been using pyparsing.Word(alphanums) but this approach doesn't seem to work for unicode. Also, while trying to figure this out, I tried this: pyparsing.OneOrMore(pyparsing.NotAny(pyparsing.White())).parseString("hello") Running this goes into an infinite loop consuming all CPU resources. Not sure if this is a bug worth fixing in PyParsing but I thought I'd point it out. Jean-Paul |
From: Paul M. <pa...@al...> - 2006-05-13 19:19:18
|
> -----Original Message-----
> From: Jean-Paul Calderone [mailto:ex...@di...]
> Sent: Saturday, May 13, 2006 1:37 PM
> To: Paul McGuire
> Cc: pyp...@li...
> Subject: Re: [Pyparsing] RE: Pyparsing unicode
>
> <snip>
>
> I took your lead and produced the attached stand-alone test
> based on a couple actions in Imaginary. It includes a
> slightly different version of targetString (one that uses
> removeQuotes) and shows two different failures. Any insights
> you have would be appreciated.
>
> Jean-Paul

Great test case!

Ok, here is the problem: in 1.4.2, I expanded the concept of parse actions to support a list of parse actions, to be executed in a chain, instead of just one. Every call to setParseAction *adds* the current action to the expression's list of parse actions (yes, I see that this is *not* intuitive, and I will fix this! Instead of expanding the behavior of setParseAction, I should have added a new method, something like addParseAction, or something that clearly implies that we are adding the current parse action to whatever other actions have already been defined).

Because you are calling setParseAction on the global quotedString, the results get pared back again and again. I might also adjust removeQuotes to only remove the first and last characters if they are in fact quote characters - I'm not sure about this yet, though.

Here are 2 possible workarounds for you:

1. Use a copy of quotedString in your definition of qstr inside targetString. Change:

       qstr = pyparsing.quotedString.setParseAction(pyparsing.removeQuotes)

   To:

       qstr = pyparsing.quotedString.copy().setParseAction(pyparsing.removeQuotes)

2. Take care to only attach removeQuotes to quotedString *once*, perhaps right after calling import.

       import pyparsing
       pyparsing.quotedString.setParseAction(pyparsing.removeQuotes)

   Then inside targetString, just assign qstr as:

       qstr = pyparsing.quotedString

Sorry for the confusion! Let me know what you settle on doing.

-- Paul |
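The accumulation effect described above is easy to reproduce with a toy stand-in. The sketch below is an assumption-laden model, not pyparsing itself: `Expr`, `add_action`, `remove_quotes`, `target_string_shared` and `target_string_copied` are hypothetical names, and `Expr` only mimics the one behaviour under discussion (each call appends another action to a shared, module-level object).

```python
class Expr:
    """Toy stand-in for a parser element matching a whole quoted string."""

    def __init__(self):
        self.actions = []

    def add_action(self, fn):
        # Models the 1.4.2 behaviour described above: every call appends.
        self.actions.append(fn)
        return self

    def copy(self):
        dup = Expr()
        dup.actions = list(self.actions)  # independent action list
        return dup

    def parse(self, text):
        toks = [text]
        for fn in self.actions:
            toks = fn(toks)
        return toks


def remove_quotes(toks):
    # Unconditionally drops the first and last character.
    return [toks[0][1:-1]]


quoted = Expr()  # module-level and shared, like pyparsing.quotedString


def target_string_shared():
    # Each call piles one more action onto the shared object.
    return quoted.add_action(remove_quotes)


def target_string_copied():
    # Workaround 1 from the message: attach the action to a private copy.
    return quoted.copy().add_action(remove_quotes)
```

Calling `target_string_copied()` any number of times keeps stripping exactly one pair of quotes, whereas calling `target_string_shared()` twice leaves two stripping actions on the shared object, so a short quoted value such as `'ab'` comes back empty. That matches the symptom reported in the thread, where quoted names arrived as empty strings.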
From: Jean-Paul C. <ex...@di...> - 2006-05-13 18:37:04
|
Hey Paul, Thanks for the quick response. First, regarding the mutation of the toks argument, I believe stripper is the only function in Imaginary which does this. Since you pointed out pyparsing.removeQuotes, I'll just use that instead, and in the future I'll keep in mind that mutating that object might not be a good idea. I took your lead and produced the attached stand-alone test based on a couple actions in Imaginary. It includes a slightly different version of targetString (one that uses removeQuotes) and shows two different failures. Any insights you have would be appreciated. Jean-Paul |
From: Paul M. <pa...@al...> - 2006-05-13 17:35:09
|
> -----Original Message-----
> From: Jean-Paul Calderone [mailto:ex...@di...]
> Sent: Saturday, May 13, 2006 11:41 AM
> To: Paul McGuire
> Subject: Re: Pyparsing unicode
>
> On Fri, 5 May 2006 15:49:10 -0500, Paul McGuire <pa...@al...> wrote:
> >J-P,
> >
> >How are things for you with this issue? Is upgrading to 1.4.2 an acceptable
> >option for you? I'm seeing through Google that this is giving you a fair
> >amount of heartburn - I'm sorry for this, let me know if things are still a
> >problem for you.
>
> Hey Paul,
>
> I finally got a chance to try out 1.4.2 with Imaginary. The
> unicode difficulties seem to be resolved (at least, my test
> suite no longer complains about getting str instead of
> unicode), but the previously broken tests still don't
> actually pass, and actually five new tests are now failing.
>
> I haven't looked at the failures closely yet, but it looks
> like they are mostly in areas which rely on quoted strings
> and are failing because instead of the content of the string
> being handed back an empty string is received instead.
>
> Maybe my "targetString" expression isn't being defined
> correctly? Here's the definition again:
>
>     def stripper(s, loc, toks):
>         toks = toks.asList()
>         toks[0] = toks[0][1:-1]
>         return toks
>
>     def targetString(name):
>         qstr = pyparsing.quotedString.setParseAction(stripper)
>         return (
>             pyparsing.Word(pyparsing.alphanums) ^
>             qstr).setResultsName(name)
>
> Failures from the test suite mostly come out looking like this:
>
>     twisted.trial.unittest.FailTest:
>     ["> create pants 'pair of daisy dukes'", ' created.']
>     did not match expected
>     [u"> create pants 'pair of daisy dukes'", 'Pair of daisy dukes created.']
>     (Line 1)
>
> and:
>
>     twisted.trial.unittest.FailTest:
>     ["> create 'vending machine' vendy", "Can't find ."]
>     did not match expected
>     [u"> create 'vending machine' vendy", 'Vendy created.']
>     (Line 1)
>
> I'll probably investigate this a bit further myself today,
> but if you have any hints, they'd be much appreciated.
>
> Feel free to CC the pyparsing list on your response, if you'd
> like to take this discussion back there.
>
> Thanks,
>
> Jean-Paul

Jean-Paul,

I extracted your sample, and made a small test case. On the face of things, the problem does not seem to be with quotedString, or with your stripper routine (although you might convert over to using removeQuotes instead of defining your own function).

Also, the results from FailTest aren't unicode, but you are expecting unicode. I think the problem may be in the expression that targetString() is embedded in.

Here's my extracted test.

(As a side question, do your parse actions often assign into the toks argument, like in the marked line below? I won't go so far as to say this isn't supported, but I don't think I test for this case very rigorously. I expected ParseResults to be read, not written. In general, ParseResults get built up as parsing occurs, and *some* updates (such as del of a slice) are explicitly implemented and tested. But assigning a specific item back into an existing ParseResults is not something I had planned for.)

-- Paul

--------------
    import pyparsing

    def stripper(s, loc, toks):
        toks = toks.asList()
        toks[0] = toks[0][1:-1]   # <--- assigning to toks
        return toks

    def targetString(name):
        qstr = pyparsing.quotedString.setParseAction(stripper)
        # stripper could be replaced with the removeQuotes parse action
        # qstr = pyparsing.quotedString.setParseAction(pyparsing.removeQuotes)
        return (
            pyparsing.Word(pyparsing.alphanums) ^
            qstr).setResultsName(name)

    input = u""" this is a Unicode string with a 'quoted string' in it """

    for toks,s,e in targetString("word").scanString(input):
        print toks.asList()
--------------

Gives me the result:

    [u'this']
    [u'is']
    [u'a']
    [u'Unicode']
    [u'string']
    [u'with']
    [u'a']
    [u'quoted string']
    [u'in']
    [u'it'] |
From: Paul M. <pa...@al...> - 2006-05-04 02:01:17
|
Oh, here are the results under 1.3.3:

1.3.3
["'foo'"]
['foo']

And here are the results under 1.4.2 (the current released version):

1.4.2
[u"'foo'"]
[u'foo']

So I definitely see this problem exists under 1.3.3, but I'd rather not go through a release on this old version track if I can help it. Is it a problem for you to upgrade to 1.4.2?

-- Paul
From: Paul M. <pa...@al...> - 2006-05-04 01:51:22
|
Jean-Paul,

Here are my tests with the latest version:

from pyparsing import quotedString,__version__
print __version__

def stripper(s, loc, toks):
    toks = toks.asList()
    toks[0] = toks[0][1:-1]
    return toks

input = u"'foo'"
print quotedString.parseString(input)
quotedString.setParseAction(stripper)
print quotedString.parseString(input)

Prints:

1.4.3
[u"'foo'"]
[u'foo']

Can you send me an example of how the behavior changes depending on usage? So far, it looks like the current release does the right thing.

-- Paul
From: Paul M. <pa...@al...> - 2006-05-03 05:23:29
|
Jean-Paul - My first thought is that this is a bug in pyparsing. I'll look into what changed around the 1.3 time frame to see what may have caused this. There is also a more recent version of pyparsing than 1.3.3, you might download from SF and give it a try. I don't expect it to be different in this respect tho. -- Paul |
From: Jean-Paul C. <ex...@di...> - 2006-05-02 14:26:47
|
Hey All,

I've been using PyParsing to handle commands in Imaginary (formerly Pottery). So far it's done most of the things I've asked of it, and I think I have some ideas to work around the rest, but the behavior with respect to unicode is a bit confusing.

In 1.2 (Ubuntu Breezy packaged version), I could parse a unicode string and get back a unicode string:

exarkun@boson:~$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyparsing
>>> pyparsing.__version__
'1.2'
>>> pyparsing.quotedString.parseString(u"'foo'")
([u"'foo'"], {})
>>>
exarkun@boson:~$

However, on upgrading to 1.3 (Ubuntu Dapper packaged version), this no longer appears to be the case:

exarkun@kunai:~$ python
Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyparsing
>>> pyparsing.__version__
'1.3.3'
>>> pyparsing.quotedString.parseString(u"'foo'")
(["'foo'"], {})
>>>
exarkun@kunai:~$

More confusing, this behavior seems to depend on the exact expression you use to parse a string: sometimes the result will come out as unicode, sometimes not. The exact expression I am using (created by the targetString function here <http://divmod.org/trac/browser/trunk/Imaginary/imaginary/commands.py#L19>) allows either quoted or unquoted strings and, frustratingly, if the quotes are supplied the result is a str, but if they are omitted the result is unicode.

I have considered wrapping my usage of PyParsing in an extra layer that does type-checking and decodes when appropriate, but this seems like a hackish work-around for a mis-feature of PyParsing, rather than the correct solution.

Is this a bug, am I mis-using PyParsing, or does PyParsing really just not differentiate between these two types?

Thanks in advance,

Jean-Paul
From: Paul M. <pa...@al...> - 2006-04-27 19:25:09
|
Try using SkipTo, as in:

    StringStart() + Word(nums,exact=2) + SkipTo(Word(nums,exact=2)) + Word(nums,exact=2) + StringEnd()

Or in the spirit of DRY:

    twoDigits = Word(nums,exact=2)
    expr = StringStart() + twoDigits + SkipTo(twoDigits) + twoDigits + StringEnd()

-- Paul

> -----Original Message-----
> From: pyp...@li...
> [mailto:pyp...@li...] On
> Behalf Of Carl Shimer
> Sent: Thursday, April 27, 2006 12:05 PM
> To: pyp...@li...
> Subject: [Pyparsing] help with catchall style token
>
> Hi,
>
> I would like to do something like this with pyparsing:
>
> Sample regex:
>
> r'^\d{2}.*\d{2}$'
>
> this matches on two numbers, followed by whatever is in the
> middle, and ends with two numbers.
>
> I have tried the following with pyparsing but I can't find a
> workable solution:
>
> StringStart() + Word(num,exact=2) + Regex('.*') +
> Word(num,exact=2) + StringEnd()
>
> this doesn't work as the Regex eats everything.
>
> another thing I tried was
>
> StringStart() + Word(num,exact=2) + Word(alphas+alphas8bit) +
> Word(num,exact=2) + StringEnd()
>
> this sort of works but doesn't work for utf-8 encoded
> characters that may be in the range below alphas.
>
> Is there a solution here?
>
From: Carl S. <car...@gm...> - 2006-04-27 17:11:24
|
Hi,

I would like to do something like this with pyparsing:

Sample regex:

r'^\d{2}.*\d{2}$'

this matches on two numbers, followed by whatever is in the middle, and ends with two numbers.

I have tried the following with pyparsing but I can't find a workable solution:

StringStart() + Word(num,exact=2) + Regex('.*') + Word(num,exact=2) + StringEnd()

this doesn't work as the Regex eats everything.

another thing I tried was

StringStart() + Word(num,exact=2) + Word(alphas+alphas8bit) + Word(num,exact=2) + StringEnd()

this sort of works but doesn't work for utf-8 encoded characters that may be in the range below alphas.

Is there a solution here?
From: Paul M. <pa...@al...> - 2006-04-02 07:29:26
|
I've included my PyCon06 presentations and their associated source code examples with the latest pyparsing release, in the new docs directory. Enjoy!

-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of imcs ee
Sent: Monday, March 27, 2006 10:15 PM
To: pyp...@li...
Subject: [Pyparsing] where can i get the PyCon06 Pyparsing Presentations

i got the below news from sourceforge

PyCon06 Pyparsing Presentations Prominently Posted <http://sourceforge.net/forum/forum.php?forum_id=545440>
I've posted the S5 HTML files and supporting source code - you can download it at http://www.geocities.com/ptmcg/python/index.html .

but i can't access the geo...site many years. is there another method to get these material?
From: imcs e. <im...@gm...> - 2006-03-28 04:15:18
|
i got the below news from sourceforge

PyCon06 Pyparsing Presentations Prominently Posted <http://sourceforge.net/forum/forum.php?forum_id=545440>
I've posted the S5 HTML files and supporting source code - you can download it at http://www.geocities.com/ptmcg/python/index.html .

but i can't access the geo...site many years. is there another method to get these material?
From: Paul M. <pa...@al...> - 2006-03-26 00:47:58
|
I only suggested the game example as another data point. Best of luck with pyshell.

-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of pyp...@mg...
Sent: Saturday, March 25, 2006 6:11 PM
To: pyp...@li...
Subject: Re: [Pyparsing] Combining strings

Thank you very much, Paul.

I've actually got quite a few more expansions to write so I guess you are correct that multiple passes will probably be better. I've also got to fit these into the correct order with my quoting and escaping code too.

I did look at your game, and thought it was pretty neat. However, I'm not really concerned about the actual command execution at this point, there's still too much parsing to do. Also, my original implementation used optparse to handle all the arguments to the internal commands. It works really well and I'm not sure if I'm ready to reinvent the wheel on that yet.

I'm working on this a little bit at a time... let me see where it goes. This text and language processing stuff is new to me but it is gradually getting clearer.

If anyone is interested in what I'm aiming for, I've uploaded my original prototype shell here to sourceforge as well:

http://sourceforge.net/projects/pyshell/

I took over an abandoned project with the same name. Most everything is working pretty well in the original, but I am interested in writing a robust parser for it so it could actually be used professionally, instead of just being a toy.

Thanks again,
-Mike

-------------------------------------------------------------
Mike Miller
Earth, Sol, Orion Arm, Milky Way

Paul McGuire wrote:

_______________________________________________
Pyparsing-users mailing list
Pyp...@li...
https://lists.sourceforge.net/lists/listinfo/pyparsing-users
From: <pyp...@mg...> - 2006-03-26 00:06:57
|
Thank you very much, Paul.

I've actually got quite a few more expansions to write so I guess you are correct that multiple passes will probably be better. I've also got to fit these into the correct order with my quoting and escaping code too.

I did look at your game, and thought it was pretty neat. However, I'm not really concerned about the actual command execution at this point, there's still too much parsing to do. Also, my original implementation used optparse to handle all the arguments to the internal commands. It works really well and I'm not sure if I'm ready to reinvent the wheel on that yet.

I'm working on this a little bit at a time... let me see where it goes. This text and language processing stuff is new to me but it is gradually getting clearer.

If anyone is interested in what I'm aiming for, I've uploaded my original prototype shell here to sourceforge as well:

http://sourceforge.net/projects/pyshell/

I took over an abandoned project with the same name. Most everything is working pretty well in the original, but I am interested in writing a robust parser for it so it could actually be used professionally, instead of just being a toy.

Thanks again,
-Mike

-------------------------------------------------------------
Mike Miller
Earth, Sol, Orion Arm, Milky Way

Paul McGuire wrote:
From: Paul M. <pa...@al...> - 2006-03-24 13:59:40
|
Mike -

Welcome to the world of pyparsing! This is a good start at implementing a little shell language (perhaps to provide a safe scripting environment?). Your program has just about all the necessary pieces to solve your problem. Actually, I think most of the solution lies in *removing* code!

I went through 2 stages in looking at your problem. First I thought of doing a 2-pass parse on this input: first pass to just do the symbol substitution, then the second pass to interpret the modified command string. Second stage was to incorporate the symbol substitution into the overall command parsing, as your program had been doing. I'm still deciding which approach I like better, maybe we'll come back to that.

So in stage 1, I looked just at your pyvar definition. (It looks like you want to replace '%(xxx)' or '%xxx' with the contents of some previously defined value 'xxx'):

    pyvar = Combine( Optional(printables) + Literal('%').suppress() + Optional('(').suppress() + Word(variable) + Optional(')').suppress() + Optional(printables) )

There's too much included in this definition. I suspect you started with something simple, then tried to add the leading and trailing Optional(printables) when your pyvar was embedded in something larger, like a path. If we leave pyvar this way, then the parse action will drop the surrounding data on the floor. (This COULD be fixed by updating the parse action, but I'd prefer to keep pyvar to be just about the '%xxx' syntax.)

    pyvar = Combine( Literal('%').suppress() + Optional('(').suppress() + Word(variable) + Optional(')').suppress() )

Your input string is:

    echo /home/%(user)/path/%(prog)/bin/%prog

With your simplified pyvar, you can just use it to do the variable substitution in a first pass:

    print pyvar.transformString(cmdstring)

Prints out:

    echo /home/mgmiller/path/foo/bin/foo

So if you add:

    cmdstring = pyvar.transformString(cmdstring)

This will do all the symbol substitution, up front. Now your definition of statement can focus on what your intended command syntax should be:

    statement = keyword + ZeroOrMore(args)

Which parses as you expect,

    print statement.parseString(cmdstring)

Gives:

    ['echo', '/home/mgmiller/path/foo/bin/foo']

I am a fan of using results names, so I jazzed up your statement definition to read:

    statement = ( keyword.setResultsName("command") + ZeroOrMore(args).setResultsName("args") )

Which now allows me to write:

    inputCmd = statement.parseString(cmdstring)
    print inputCmd.command
    print inputCmd.args

And I get:

    echo
    ['/home/mgmiller/path/foo/bin/foo']

Stage 2 required changing around the definition of args, to comprehend that it might include some embedded pyvars:

    args = Combine( OneOrMore( Word(alphanums + '_-=/"\'') | pyvar ) )

This gives us a working single pass parser, but I'm not keen on this 'pollution' of the args definition. This grammar is fairly simple, so it's not a big deal. But in general, I think variable or macro substitution needs to happen in its own transformation pass first - otherwise, the grammar will need to include lots of "| pyvar" type elements, since you can't predict where they will show up. By separating into two passes, the substitution part is very simple and clean, and the application grammar is simple and clean.

Please check out the adventure game presentation I gave at the last PyCon - you can get the source code and see how I implemented a Command pattern to structure the execution of the parsed commands.

Good luck!

-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of pyp...@mg...
Sent: Friday, March 24, 2006 3:56 AM
To: pyp...@li...
Subject: [Pyparsing] Combining strings

Hi,

I'm trying to get a string back in one piece and the Combine class doesn't seem to be working. I'm not sure how to fix it. Can anyone help?

Here's the output:

echo /home/%(user)/apps/%(prog)/bin/%prog
['echo', '/home/', 'mgmiller', '/apps/', 'foo', '/bin/', 'foo']

I'd like to get the path back as one string ... how can I do that?

Thanks if you can help,
-Mike

----------------------
#!/bin/env python
from pyparsing import *
import string, os

namespace = { 'a': 5, 'user':'mgmiller', 'prog':'foo' }
cmdstring = r''' echo /home/%(user)/path/%(prog)/bin/%prog '''

def getpyvar(s, loc, match):
    'return a python type variable from our own namespace'
    if namespace.has_key(match[0]):
        return namespace[match[0]]

# define grammar
keyword = oneOf('alias dir echo ver')
args = Word(alphanums + '_-=/"\'')
variable = alphas + '_'

# expansions
pyvar = Combine( Optional(printables) + Literal('%').suppress() + Optional('(').suppress() + Word(variable) + Optional(')').suppress() + Optional(printables) )
pyvar.setParseAction(getpyvar)

statement = ( keyword + ZeroOrMore(pyvar | args) )

print cmdstring
print statement.parseString(cmdstring)
----------------------

-Mike

-------------------------------------------------------------
Mike Miller
Earth, Sol, Orion Arm, Milky Way
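The two-pass approach Paul describes in the message above can also be sketched without pyparsing. What follows is a minimal illustration using only the standard library, not the code from this thread: a regular expression stands in for pyvar.transformString, and str.split stands in for the statement grammar. The names VAR_PATTERN, substitute_vars and parse_statement are hypothetical, and leaving unknown variables untouched is an assumption. Unlike the simplified pyvar, this pattern requires the parentheses to be balanced.

```python
import re

# Values taken from the example namespace in the original post.
namespace = {'a': 5, 'user': 'mgmiller', 'prog': 'foo'}

# Hypothetical pattern: matches %(name) or %name, where name is made of
# letters and underscores (the same character set as 'variable' above).
VAR_PATTERN = re.compile(r'%\((?P<paren>[A-Za-z_]+)\)|%(?P<bare>[A-Za-z_]+)')


def substitute_vars(text, values):
    """Pass 1: replace every %(name) or %name that has a known value."""
    def replace(match):
        name = match.group('paren') or match.group('bare')
        if name in values:
            return str(values[name])
        # Assumption: unknown variables are left exactly as written.
        return match.group(0)
    return VAR_PATTERN.sub(replace, text)


def parse_statement(text):
    """Pass 2: split the already-expanded text into a command and its args."""
    words = text.split()
    return {'command': words[0], 'args': words[1:]}


cmdstring = 'echo /home/%(user)/path/%(prog)/bin/%prog'
expanded = substitute_vars(cmdstring, namespace)
result = parse_statement(expanded)
print(expanded)   # echo /home/mgmiller/path/foo/bin/foo
print(result)     # {'command': 'echo', 'args': ['/home/mgmiller/path/foo/bin/foo']}
```

Because substitution finishes before the command is split, the second pass never needs to know that variables exist, which is the separation of concerns argued for above.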
From: <pyp...@mg...> - 2006-03-24 09:52:08
|
Hi,

I'm trying to get a string back in one piece and the Combine class doesn't seem to be working. I'm not sure how to fix it. Can anyone help?

Here's the output:

echo /home/%(user)/apps/%(prog)/bin/%prog
['echo', '/home/', 'mgmiller', '/apps/', 'foo', '/bin/', 'foo']

I'd like to get the path back as one string ... how can I do that?

Thanks if you can help,
-Mike

----------------------
#!/bin/env python
from pyparsing import *
import string, os

namespace = { 'a': 5, 'user':'mgmiller', 'prog':'foo' }
cmdstring = r''' echo /home/%(user)/path/%(prog)/bin/%prog '''

def getpyvar(s, loc, match):
    'return a python type variable from our own namespace'
    if namespace.has_key(match[0]):
        return namespace[match[0]]

# define grammar
keyword = oneOf('alias dir echo ver')
args = Word(alphanums + '_-=/"\'')
variable = alphas + '_'

# expansions
pyvar = Combine( Optional(printables) + Literal('%').suppress() + Optional('(').suppress() + Word(variable) + Optional(')').suppress() + Optional(printables) )
pyvar.setParseAction(getpyvar)

statement = ( keyword + ZeroOrMore(pyvar | args) )

print cmdstring
print statement.parseString(cmdstring)
----------------------

-Mike

-------------------------------------------------------------
Mike Miller
Earth, Sol, Orion Arm, Milky Way
From: <pyp...@li...> - 2006-03-09 22:50:51
|
Hi,

I'm trying to parse a potentially complex quoted string. I've used the quotedString with success for simple cases, unfortunately the grammar I need is a bit more complex. I need it to take into account an escape char "\" and also throw an exception when the number of quotes is not even.

Like a bash script it must follow these rules, and maybe more I'm forgetting.

* Quotes must be same type and paired, otherwise throw exception (how do I force it to do that?)
* An unquoted or double quoted escape char encodes next char and escape is removed (including quotes)
* A single quoted escape char is ignored and left unchanged
* An escaped quote at the edge or middle of a word should not be split up, should stay where it is

I've got this much code for this problem, but it doesn't meet the criteria above. The escape chars are not found before the double quotes, and interior quotes are being split on. Also I need an exception to be thrown for the last token.

#!/bin/env python
from pyparsing import *

cmdstring = r''' echo \z "\Hello" '\Hello' 'Hel\lo' \'Hello' '''
print cmdstring

def encode(s, loc, match):
    'encode a char into url format'
    return '%%%s' % ord(match[0])

# define grammar
sq_args = sglQuotedString.setParseAction(removeQuotes)
dq_args = dblQuotedString.setParseAction(removeQuotes)
args = Word(alphanums + '_-=/') #CharsNotIn(''' "' ''')
nextchar = Word(printables, exact=1)
nextchar.setParseAction(encode)
escaped = Suppress('\\') + nextchar

statement = ( ZeroOrMore( sq_args | escaped | dq_args | args ) )

try:
    print statement.parseString(cmdstring)
except ParseException, pe:
    print pe

Thanks if you can help,
-Mike

-------------------------------------------------------------
Mike Miller
Earth, Sol, Orion Arm, Milky Way
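For comparison, Python's standard-library shlex module already implements shell-style quoting that is close to the rules listed in the message above. This is a sketch of that alternative, not a pyparsing grammar, and it differs from the stated requirements in two ways worth checking: inside double quotes a backslash is only consumed when it precedes a double quote or another backslash, and escaped characters are returned as-is rather than URL-encoded. The helper name split_command is hypothetical.

```python
import shlex


def split_command(text):
    """Split a command line into tokens using POSIX shell quoting rules."""
    return shlex.split(text, posix=True)


# Unquoted \z loses its backslash, a backslash inside single quotes is kept,
# and an escaped quote in the middle of a word stays inside that word.
tokens = split_command(r"""echo \z "\Hello" '\Hello' 'Hel\lo' it\'s""")
print(tokens)

# The escaped quote in \'Hello' is a literal character, so the final quote
# opens a string that is never closed and shlex raises ValueError.
try:
    split_command(r"echo \'Hello'")
except ValueError as err:
    print('rejected:', err)
```

If the URL-style encoding of escaped characters is required, it would have to be applied in a separate step, since shlex gives no hook for transforming individual escaped characters.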
From: Paul M. <pa...@al...> - 2006-03-07 02:03:52
|
Well, perhaps you could use something like:

    SkipTo( "?" | stringEnd )

This will read everything up to the first '?', or to the end of the string (assuming that you are parsing just the request URL).

-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of Tom Wiebe
Sent: Monday, March 06, 2006 7:07 PM
To: pyp...@li...
Subject: Re: [Pyparsing] splitting the query from a url?

Cool, I was afraid that I was missing something simple, some sort of 'splitOn('?')' function.

<snip>
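Outside a pyparsing grammar, the same "everything up to the first '?', or the whole string if there is none" split is available from the standard library, which may be the kind of one-liner hoped for above. A minimal sketch; the URLs are made up for illustration and split_query is a hypothetical helper name.

```python
def split_query(url):
    """Return (path, query); query is '' when the URL contains no '?'."""
    # str.partition splits on the first occurrence of the separator only.
    path, _sep, query = url.partition('?')
    return path, query


print(split_query('/search/results?q=pyparsing&page=2'))
# ('/search/results', 'q=pyparsing&page=2')
print(split_query('/index.html'))
# ('/index.html', '')
```

Inside a larger grammar the SkipTo form is still the natural choice, since it composes with the other expressions.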