Re: [Pyparsing] RE: python indentation grammar
From: Michel P. <mi...@di...> - 2005-08-31 00:38:17
On Thu, 2005-08-18 at 13:57 -0500, Paul McGuire wrote:
> I have made a few attempts at indentation-based parsing in the past, but I
> looked at them last night, and they are really not so good. I think the key
> will be in a) using a parse action with col() to detect the indentation
> level of the current line, and b) keeping a global stack of indentations
> levels seen thus far, so that you can tell if your current line is part of
> the current indent level, a deeper level or a higher level.

Well, I have made a bit more progress on this, as well as some great progress on the SPARQL parser with pyparsing. On the indentation problem, I have the following module. The relevant pyparsing code is down near the end:

https://svn.cignex.com/public/slipr/slipr/slipr.py

It's pretty self-contained. When this module is run, it tries to parse the test file:

https://svn.cignex.com/public/slipr/data/pyinrdf.slpr

and I've got everything matching fine, except the whitespace. ;) For some reason I can't get the whitespace action to work right; it only matches about every other run of whitespace in the doc. Here's some of the output.
Notice at the end how some of the whitespace is not matched between tags:

<tag>
  <name>
    <identifier>RDF</identifier>
  </name>
  <attrs>
    <name>
      <identifier>python</identifier>
    </name>
    <string>"http://namespaces.zemantic.org/python#"</string>
  </attrs>
</tag>
[' ']
[' ']
<tag>
  <name>
    <identifier>Ontology</identifier>
  </name>
  <attrs>
    <name>
      <identifier>python</identifier>
    </name>
    <value>
      <identifier>bob</identifier>
    </value>
  </attrs>
</tag>
[' ']
[' ']
<tag>
  <name>
    <identifier>Class</identifier>
  </name>
  <attrs>
    <name>
      <identifier>Object</identifier>
    </name>
  </attrs>
</tag>
[' ']
<tag>
  <name>
    <identifier>issubclass</identifier>
  </name>
  <attrs>
    <name>
      <identifier>Object</identifier>
    </name>
  </attrs>
</tag>
<tag>
  <name>
    <identifier>isinstance</identifier>
  </name>
  <attrs>
    <name>
      <identifier>Object</identifier>
    </name>
  </attrs>
</tag>

I'm not sure what's wrong. Can anyone spot a simple error or suggest another way to handle the indentation issue?

Thanks,

-Michel
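
For reference, the approach Paul describes in the quoted text (look at the column of each line and keep a stack of the indentation levels seen so far) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in slipr.py and not pyparsing itself: the function name indent_tokens and the INDENT/DEDENT/LINE token names are made up for the example, indentation is assumed to use spaces only, and the column is computed by hand where a pyparsing parse action would get it from col(loc, s).

```python
def indent_tokens(text):
    """Yield (kind, payload) pairs where kind is INDENT, DEDENT, or LINE.

    Hypothetical helper for illustration; assumes space-only indentation.
    """
    # Columns of the enclosing indentation levels; column 1 is the top level.
    stack = [1]
    for raw in text.splitlines():
        stripped = raw.lstrip(" ")
        if not stripped:
            continue  # blank lines do not change the indentation level
        # 1-based column of the first non-space character on the line.
        column = len(raw) - len(stripped) + 1
        if column > stack[-1]:
            # Deeper than the current level: open a new level.
            stack.append(column)
            yield ("INDENT", column)
        else:
            # Same level or shallower: close levels until we are back at one
            # that is not deeper than this line.
            while column < stack[-1]:
                stack.pop()
                yield ("DEDENT", column)
            if column != stack[-1]:
                # The line sits between two known levels.
                raise ValueError(f"inconsistent dedent to column {column}")
        yield ("LINE", stripped)
    # Close any levels still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield ("DEDENT", 1)


sample = "a\n  b\n    c\n  d\ne\n"
kinds = [kind for kind, _ in indent_tokens(sample)]
print(kinds)
```

The point of the stack is that a single line can close several levels at once, so comparing only against the previous line's column is not enough. In a pyparsing grammar the same bookkeeping would live in a parse action attached to the start of each line, and leading whitespace would have to reach that action rather than being skipped before it runs.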