From: David G. <go...@py...> - 2002-12-05 03:16:31
I have begun work on a Python source Reader component for Docutils. I expect the work to go slowly, as there is lots to absorb, much earlier work to study and learn from, and little spare time to devote. I'm trying to keep it as simple as possible, mostly for my own benefit (lest my brain explode).

I've looked over the HappyDoc code and Tony "Tibs" Ibbs' PySource prototype. HappyDoc uses the stdlib "parser" module to parse Python modules into abstract syntax trees (ASTs), but that seems difficult and fragile, the ASTs being so low-level. Tibs' prototype uses the much higher-level ASTs built by the stdlib "compiler" module, which are much easier to understand. I've decided to use the "compiler" module also.

My first stumbling block is in parsing assignments. I want to extract the right-hand side (RHS) of assignments straight from the source. In his prototype, Tibs rebuilds the RHS from the AST, but that seems rather roundabout and the results may not match the source perfectly (equivalent, but not character-for-character). I think using the "tokenize" module in parallel with "compiler" may allow the code to extract the raw RHS text, as well as other raw text that doesn't make it verbatim to the AST.

So, is there any prior art out there? Any pointers or advice?

-- 
David Goodger <go...@py...>
Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
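
For readers who want to try the idea of running "tokenize" alongside an AST to recover raw RHS text, here is a minimal sketch. It is not the Docutils implementation. The "compiler" module no longer exists in Python 3, so the sketch uses the stdlib "ast" module in its place, which is an assumption on my part. The names raw_assignment_rhs and _slice are hypothetical, and only simple module-level assignments with a single name as target are handled.

```python
import ast
import io
import tokenize


def _slice(lines, start, end):
    """Return the source text between two (row, col) token positions."""
    (srow, scol), (erow, ecol) = start, end
    if srow == erow:
        return lines[srow - 1][scol:ecol]
    parts = [lines[srow - 1][scol:]]
    parts.extend(lines[srow:erow - 1])
    parts.append(lines[erow - 1][:ecol])
    return "".join(parts)


def raw_assignment_rhs(source):
    """Map each simple module-level target name to its raw RHS text.

    Hypothetical helper: "ast" stands in for the old "compiler" module.
    Assumes ASCII source, so AST and token column offsets agree.
    """
    lines = source.splitlines(keepends=True)

    # The AST says where each assignment starts and what its target is.
    starts = {}
    for node in ast.parse(source).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            starts[(node.lineno, node.col_offset)] = node.targets[0].id

    results = {}
    name = None        # target whose statement is being scanned
    seen_eq = False    # True once the "=" operator has been passed
    rhs_start = None   # start position of the first RHS token
    last_end = None    # end position of the latest RHS token

    # The token stream supplies exact positions of the RHS text.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if name is None:
            if tok.type == tokenize.NAME and tok.start in starts:
                name = starts[tok.start]
            continue
        is_semicolon = tok.type == tokenize.OP and tok.string == ";"
        if tok.type == tokenize.NEWLINE or is_semicolon:
            # End of the statement: cut the raw text out of the source.
            if rhs_start is not None:
                results[name] = _slice(lines, rhs_start, last_end)
            name, seen_eq, rhs_start, last_end = None, False, None, None
        elif not seen_eq:
            if tok.type == tokenize.OP and tok.string == "=":
                seen_eq = True
        elif tok.type not in (tokenize.COMMENT, tokenize.NL):
            if rhs_start is None:
                rhs_start = tok.start
            last_end = tok.end
    return results
```

Given the source line "x = 0x10", this returns the literal text 0x10 for x, whereas an RHS rebuilt from the AST would normally come back as 16. Trailing comments are left out, but comments and line breaks inside a bracketed multi-line RHS are kept as written.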