From: David G. <go...@py...> - 2012-05-29 18:54:51
On Fri, May 18, 2012 at 6:30 PM, Marvin Humphrey <ma...@re...> wrote:
> But one of the questions I'd have here is whether it makes more sense to
> approach parsing RST from the top down (LL, recursive descent) or from the
> bottom up (LR, LALR). If the reference parser for RST uses something like an
> extremely complex regular expression, maybe it makes sense to hand-code a
> recursive descent parser?
>
> http://en.wikipedia.org/wiki/Recursive_descent_parser#C_implementation

I've forgotten most of what I ever knew about formal parsing, and I
didn't approach writing the reST parser in a formal way. The reST
parser grew from a finite state machine that I wrote to filter log
files in a complex way (e.g. "give me lines that begin with X within
10 lines of lines that contain Y but not after lines that contain Z").

The module docstring of docutils.parsers.rst.states contains a "Parser
Overview" (http://docutils.sourceforge.net/docutils/parsers/rst/states.py).
It begins, "The reStructuredText parser is implemented as a recursive
state machine, examining its input one line at a time."

> It might be worthwhile to collect some opinions on stackoverflow.com. And
> maybe it's time I bought the Dragon book and read the chapter on parsing. :)

You can't go wrong reading the Dragon book. And if you do post on
stackoverflow, please provide a link here.

--
David Goodger <http://python.net/~goodger>
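
For readers following the thread, the log-filtering state machine
described above might look roughly like the sketch below. It is an
illustration only, not code from docutils or from the original filter.
The function name filter_log, its parameters, and the state names are
all hypothetical. It also assumes that "within 10 lines of" means "in
the 10 lines following", which is narrower than the description.

```python
def filter_log(lines, start="X", trigger="Y", stop="Z", window=10):
    """Yield lines beginning with `start` that appear within `window`
    lines after a line containing `trigger`. Once a line containing
    `stop` is seen, nothing further is emitted.

    Hypothetical sketch of a line-at-a-time finite state machine.
    """
    state = "idle"      # one of: "idle", "armed", "done"
    remaining = 0       # lines left in the current window

    for line in lines:
        # A stop line ends the machine for good.
        if stop in line:
            state = "done"
        if state == "done":
            break

        # A trigger line (re)opens the window; it is not itself emitted.
        if trigger in line:
            state = "armed"
            remaining = window
            continue

        if state == "armed":
            if line.startswith(start):
                yield line
            remaining -= 1
            if remaining == 0:
                state = "idle"


if __name__ == "__main__":
    log = ["X early", "has Y", "X one", "other", "X two", "Z here", "X late"]
    for hit in filter_log(log):
        print(hit)
```

The machine reads one line at a time and keeps only a state name and a
counter, so it never needs to look back at earlier input. A "recursive"
state machine in the docutils sense goes further, handing nested blocks
of lines to a fresh machine; that part is not shown here.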