Re: [Pyparsing] Problem with grammar
From: Paul M. <pt...@au...> - 2004-11-14 21:07:34
Andy -

Welcome to pyparsing!!! This e-mail list has gotten very little activity so far, so pretty much *everyone* is a first-timer!

Here are some comments on your grammar. You really have most things "working", but I'd like to clear up some of the concepts on which you are just a bit off-center.

1. It is not necessary, and in some cases incorrect, to pre-declare all of your parse expressions as Forward(). Forward() is only necessary when creating a recursive grammar (such as an arithmetic expression, which may contain within itself other arithmetic expressions, or lists within lists, etc.). I can see how you would incrementally build up your grammar this way (since your definition of procToken is built up from several other expressions, procName, dataStmt and semi) - I do this also, sort of a top-down definition. But pyparsing (given its Python basis) requires bottom-up definition, so you approached this by just declaring Forwards for the sub-expressions. This is okay, but when you go the next step and start filling in the blanks of the sub-expressions, you should move back up in the code, and *replace* the Forward() with Literal(';') (in the case of semi, for instance).

2. You must be careful when mixing strings and expressions. This works:

    expr = Literal("proc") + ":"

This doesn't:

    semi = Forward()
    procToken = CaselessLiteral( "proc" ) + procName + Optional(dataStmt) + semi
    semi = ';'

The definition of procToken will reference an empty Forward(), not the string ';'. The last line simply rebinds the name semi to a new object; the Forward() already stored inside procToken is left untouched. If you are coming from C++ world, it is important to know that in Python, '=' is *not* an operator. So you can't define special behavior for assignment if the target is of a special class.

3. I have confused you about the meaning of delimitedList. This is *not* for declaring a list of items in your grammar. It is a short cut for declaring that you expect a list of items in the incoming text stream.
For example, to read in a set of integers separated by commas, such as "0, 1, 2,3,4,11,0,12", use:

    integer = Word(nums)
    integerList = delimitedList(integer)   # ',' delimiter is the default

In your example, in which you expect one of several specific procNames, you can try using the oneOf helper function:

    procName = oneOf( "print tabulate summary" )

Or you could use the '|' operator, as in:

    procName = Literal("print") | Literal("tabulate") | Literal("summary")

which actually matches your grammar definition more closely. Or do as you did, using MatchFirst's constructor from a list:

    procName = MatchFirst( [ "print", "tabulate", "summary" ] )

But all of these statements will result in the exact same grammar, since oneOf generates a MatchFirst of the words listed in the incoming string.

In summary, your grammar is actually quite close, apart from the confusion over delimitedList and the reversed order of the definitions. You can try making the corrections yourself, or glance over the following suggested version.

-- Paul

> ident = Word( alphas, alphanums )
> semi = Literal(";")
> dataStmt = CaselessLiteral( "data" ) + "=" + ident
> procName = MatchFirst( [ "print", "tabulate", "summary" ] )  # this works, but I prefer using '|' or oneOf
> runStmt = CaselessLiteral( "run" ) + semi
> procToken = CaselessLiteral( "proc" ) + procName + Optional(dataStmt) + semi

----- Original Message -----
From: "Andy Elvey" <and...@pa...>
To: <pyp...@li...>
Sent: Sunday, November 14, 2004 1:55 PM
Subject: [Pyparsing] Problem with grammar

> Hi all -
> I'm a first-timer here, and am having a problem with my parser.
> I'm trying to write a parser that follows the following rules -
>
> proc_statement = "proc" + procname + Optional(data_statement) + semicolon +
> run_statement
> proc_name = "print" or "summary" or "tabulate"
> data_statement = "data" + "=" + dataset_name + semicolon
> run_statement = "run" + semicolon
>
> I am getting this error, and I have no idea why -
> Traceback (most recent call last):
>   File "minisas.py", line 28, in ?
>     procName = MatchFirst( delimitedList( "print", "tabulate", "summary",
> delim="," ) )
> TypeError: delimitedList() got multiple values for keyword argument 'delim'
>
> The following examples would all be legal, according to the grammar
> ( Note - the "run" does not have to be on the same line as the proc
> statement )
> *** Start of examples ***
> proc print; run;
>
> proc print data=fred; run;
>
> proc summary; run;
>
> proc summary data=mydata2; run;
>
> proc tabulate; run;
> *** End of examples ***
>
> Here is what I have so far -
> *** Start of code ***
>
> # minisas.py
> #
> # simple demo of a SAS-like language
> #
> from pyparsing import *
>
> def test( str ):
>     print str,"->"
>     try:
>         tokens = sasgrammar.parseString( str )
>         print "tokens = ", tokens
>     except ParseException, err:
>         print " "*err.loc + "^\n" + err.msg
>         print err
>     print
>
>
> # Define tokens
> sasprog = Forward()
> procName = Forward()
> semi = Forward()
> dataStmt = Forward()
> runStmt = Forward()
> ident = Forward()
> procToken = CaselessLiteral( "proc" ) + procName + Optional(dataStmt) + semi
> procName = MatchFirst( delimitedList( "print", "tabulate", "summary",
> delim="," ) )
> dataStmt = CaselessLiteral( "data" ) + "=" + ident
> ident = Word( alphas, alphanums )
> runStmt = CaselessLiteral( "run" ) + semi
> semi = ";"
>
> # Define the grammar
> sasgrammar << sasprog
>
> # Test the grammar
> test( "proc print ; run ; " )
> test( "proc print data = fred ; run ; " )
> test( "proc contents ; run ; " )
> test( "proc contents data = mydata2 ; run ; " )
> test( "proc tabulate; " )
> test( "proc tabulate data = foo3; run ; " )
>
> *** End of code ***
>
> So, any help is very much appreciated, as I am totally lost ... :-)
> Many thanks in advance -
> - Andy
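
For readers following the thread today, here is a minimal Python 3 sketch that combines the points from the reply: bottom-up definitions, no Forward() placeholders, and oneOf for the proc names. It assumes a pyparsing release that still accepts the camelCase names used in this thread (oneOf, parseString, asList). The names proc_stmt and parse_proc are illustrative and do not appear in the original posts, and chaining runStmt after procToken is an assumption taken from the proc_statement rule in the question.

```python
from pyparsing import (CaselessLiteral, Literal, Optional, ParseException,
                       Word, alphanums, alphas, oneOf)

# Bottom-up definitions: every name is bound to a real expression before it
# is used, so no Forward() placeholders are needed (nothing here is recursive).
ident = Word(alphas, alphanums)
semi = Literal(";")
dataStmt = CaselessLiteral("data") + "=" + ident
procName = oneOf("print tabulate summary")
runStmt = CaselessLiteral("run") + semi
procToken = CaselessLiteral("proc") + procName + Optional(dataStmt) + semi

# Assumption: a complete statement is procToken followed by runStmt,
# following the proc_statement rule in the original question.
proc_stmt = procToken + runStmt


def parse_proc(text):
    """Return the list of tokens for text, or None if it does not match."""
    try:
        return proc_stmt.parseString(text).asList()
    except ParseException:
        return None


if __name__ == "__main__":
    samples = (
        "proc print ; run ;",
        "proc summary data = mydata2 ; run ;",
        "proc contents ; run ;",  # 'contents' is not one of the proc names
    )
    for sample in samples:
        print(sample, "->", parse_proc(sample))
```

Returning None on a ParseException keeps the sketch short; the test() helper in the original message, which prints the error location and message, is the better choice while debugging a grammar.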