Re: [Pyparsing] Beginner parsing problem
From: Florian L. <mai...@xg...> - 2013-01-05 20:16:52
Hey Paul!

Thanks for the welcome and the thorough explanation!

I have more or less solved my original problem in the meantime, but I'm still at the very beginning! Unfortunately I don't have a formal description of the language I'm trying to model. It is designed along the lines of C++, so I'll need to fiddle with the allowed characters in my primitives. (BTW: it's the configuration language of the OpenFOAM CFD toolbox, http://www.openfoam.com/)

My primitives are:

    ident = Word(alphanums + ".")
    semi = Literal(";").suppress()
    lcb = Literal("{").suppress()
    rcb = Literal("}").suppress()

I'll keep in mind what you said about excludeChars and maybe change ident that way; I'll have to try it out.

One problem I've encountered is that a key-value pair can look like this:

    key and all this is value;

I catch that with:

    FKeyValue = Group(ident + SkipTo(semi) + semi)

Since the file can also have key-value pairs at the root level (not within any dict), I do:

    ParameterFile = ZeroOrMore(FDictionary | FKeyValue)

Dictionaries can be arbitrarily nested:

    FDictionary = Forward()
    FDictionary << Dict(Group(ident + lcb + Dict(ZeroOrMore(FKeyValue | FDictionary)) + rcb))

I still have problems getting the recursive definition right (the underlying problem is probably getting the recursive definition right ;-)

My sample text is:

    prob = """dictname
    {
     subdict
     {
      key value;
      key2 value2;
     }
    }"""

and parsing it and calling dump() on the result gives:

    [['dictname', ['subdict', '{\n key value'], ['key2', 'value2']]]
    - dictname: [['subdict', '{\n key value'], ['key2', 'value2']]
      - key2: value2
      - subdict: {
     key value

Thanks for any suggestions and have a nice weekend!

Florian

On Saturday, 5 January 2013, 10:59:04, Paul McGuire wrote:
> Florian -
>
> Welcome to pyparsing!
>
> When writing your parser, you'll have to keep in mind that pyparsing does
> not do any kind of lookahead unless you explicitly tell it to. "printables"
> is a string containing all ASCII characters that are not whitespace - this
> includes the ';' character.
> So when you define your FKeyValue value part as
> "Word(printables)", this will consume all non-whitespace characters, even
> the terminating ';'. This is in contrast to something you might do in a
> regular expression, in which ".*;" would match "lslsd;" - the regular
> expression implicitly terminates the ".*" when it sees the semicolon. But
> pyparsing is purely left-to-right, unless you include some lookahead escapes
> of your own.
>
> One way to do this in the Word construct is to be more selective in the
> string that you use to create the expression - in this case, we'll try just
> doing every printable character except for ';'. Instead of
> "Word(printables)", you could do "Word(''.join(c for c in printables if c !=
> ';'))". I found myself doing this quite a lot and it annoyed me, so I
> added a convenience argument to Word, excludeChars. You can define a Word
> using a large string of characters, and then just exclude one or two of
> them, in your case like this: Word(printables, excludeChars=';'). Now if
> you use this expression for your value expression in FKeyValue, it should
> parse better.
>
> By extension, I would also suggest that you narrow down what you expect to
> see as the identifiers in your key and dictionary, so that you don't
> accidentally read in braces or other punctuation, perhaps something like:
>
> identifier = Word(alphas, alphanums)
> FKeyValue = identifier + Word(printables,excludeChars=';') + ";"
> FDictionary = identifier + "{" + OneOrMore( Group(FKeyValue) ) + "}"
>
> Also, by Grouping your FKeyValue's, it will help you iterate over the
> key-value pairs, as it will give them more organizing structure.
>
> Please look over some of the articles that are linked from the wiki's
> Documentation page (http://pyparsing.wikispaces.com/Documentation), for more
> examples and expression topics.
> Also, the Discussion tab
> (http://pyparsing.wikispaces.com/page/messages/home) of the wiki's Home page
> includes many Q&A threads on various pyparsing problems.
>
> Best of luck,
> -- Paul McGuire
>
>
>
> -----Original Message-----
> From: Florian Lindner [mailto:mai...@xg...]
> Sent: Saturday, January 05, 2013 4:40 AM
> To: pyp...@li...
> Subject: [Pyparsing] Beginner parsing problem
>
> Hello,
>
> I've just started working with pyparsing
>
>
> from pyparsing import *
>
> text = """
> FoamFile
> {
> version 2.0;
> format ascii;
> class volVectorField;
> object U;
> }"""
>
> text2 = " class volVectorField;"
>
> FKeyValue = Word(printables) + Word(printables) + ";"
> FDictionary = Word(printables) + "{" + OneOrMore( FKeyValue ) + "}"
>
> print FKeyValue.parseString(text2) # Works fine
> print FDictionary.parseString(text) # Fails <<<
>
> (I use the F prefix to avoid name clashes with pyparsing stuff, might change
> set and switch to more selective import).
>
> The last print fails:
>
> Traceback (most recent call last):
>   File "parse.py", line 22, in <module>
>     print FKeyValue.parseString(text2)
>   File "/home/florian/scratch/pyparsing.py", line 1006, in parseString
>     raise exc
> pyparsing.ParseException: Expected ";" (at char 28), (line:1, col:29)
>
>
> What is wrong there? If I understood the documentation right, newlines are
> ignored, just like whitespace.
>
> It's pyparsing downloaded from the 1.5.x svn branch.
>
> Thanks,
> Florian
>
> _______________________________________________
> Pyparsing-users mailing list
> Pyp...@li...
> https://lists.sourceforge.net/lists/listinfo/pyparsing-users
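
On the recursive definition in Florian's reply: the '{\n key value' entry in the dump is consistent with FKeyValue being tried before FDictionary inside the dictionary body. In that order, ident matches "subdict" and SkipTo(semi) then runs through the opening brace up to the first semicolon. pyparsing's | operator tries its alternatives left to right and takes the first one that matches, so one possible fix is to put FDictionary first. The following is a minimal sketch under that assumption, not a definitive grammar for OpenFOAM files. The primitives and expressions are the ones from the message; the exact whitespace of the sample text and the use of parseAll=True are assumptions made for illustration.

```python
from pyparsing import (Dict, Forward, Group, Literal, SkipTo, Word,
                       ZeroOrMore, alphanums)

# Primitives as given in the message.
ident = Word(alphanums + ".")
semi = Literal(";").suppress()
lcb = Literal("{").suppress()
rcb = Literal("}").suppress()

# A key followed by everything up to the terminating semicolon.
FKeyValue = Group(ident + SkipTo(semi) + semi)

# Same definition as in the message, except that the alternation inside
# the braces is swapped to try FDictionary first (assumed fix). If the
# text after the identifier is not "{", FDictionary fails and pyparsing
# falls back to FKeyValue.
FDictionary = Forward()
FDictionary << Dict(Group(ident + lcb
                          + Dict(ZeroOrMore(FDictionary | FKeyValue))
                          + rcb))

# Root level: dictionaries and bare key-value pairs, as in the message.
ParameterFile = ZeroOrMore(FDictionary | FKeyValue)

# Sample text from the message; the line layout is an assumption.
prob = """dictname
{
 subdict
 {
  key value;
  key2 value2;
 }
}"""

# parseAll=True makes the parser raise if any input is left unparsed.
result = ParameterFile.parseString(prob, parseAll=True)
print(result.dump())

# Nested entries are reachable by name through the Dict wrappers.
print(result["dictname"]["subdict"]["key2"])
```

Values that themselves contain a semicolon (for example inside a quoted string) are not handled by this sketch, because SkipTo(semi) stops at the first semicolon it finds.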