From: David A. <da...@bo...> - 2004-01-19 18:51:51
Mark Nodine <no...@so...> writes:

> David Abrahams wrote:
>>
>> David Goodger <go...@py...> writes:
>>
>> > David Abrahams wrote:
>> > > Here's another question: the way I've coded it, once problematic
>> > > text is found, I stop trying to recursively find nested markup.
>> > > Would you like it to warn about the problem as it does now, and then
>> > > just continue to parse it for inline markup?
>> >
>> > Can you show examples of what your code does now?  Here is some input:
>> >
>> >     *emph **strong *prob ``literal``, end of strong**, end of emph*
>> >
>> > Ideally, I'd like this to parse to:
>> >
>> >     <paragraph>
>> >         <emphasis>
>> >             emph
>> >             <strong>
>> >                 strong
>> >                 <problematic ...>
>> >                     *
>> >                 prob
>> >                 <literal>
>> >                     literal
>> >                 , end of strong
>> >             , end of emph
>>
>> It can't.
>
> It can.  The Perl parser gets what David G. wanted.
>
>> I guess that's the
>> interpretation which results the fewest errors, but I think we could
>> probably construct cases where the other interpretation would be more
>> sensible:
>>
>>     *emph *prob **strong ``literal``, end of strong**, end of emph*
>
> However, this one did trip up my Perl parser :-(.  I'll have to see
> what's going on.

It's not clear there's a "right answer".  The algorithm I outlined in
private mail to you and David gets:

   (*)emph ((*)prob ((**)strong ((``)literal(``)), end of strong(**)), end of emph(*))
   ^^^
   ^^^-------unmatched

If you want:

   ((*)emph (*)prob ((**)strong ((``)literal(``)), end of strong(**)), end of emph(*))
            ^^^
unmatched---^^^

You need a non-deterministic parser and a rule which values outer
matching earlier start strings more than later ones (or you have to
parse it backwards ;->).  Non-deterministic parsers are possible (I've
built them), but to do that would be more Perlish than Pythonic, IMO.
It's just a case of giving in to the temptation to guess.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
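
[Illustration] The algorithm from the private mail is not reproduced in this message, so the following is only a rough sketch of one deterministic strategy that happens to give the first bracketing above. It assumes three delimiters (`*`, `**`, and double backquotes), a simplified form of the reStructuredText rules that a start-string must be followed by non-whitespace and an end-string preceded by non-whitespace, and it does not suppress markup inside literals. The name `match_inline` is hypothetical.

```python
import re

# Longest alternatives first, so "**" is not read as two "*".
TOKEN = re.compile(r"``|\*\*|\*")


def match_inline(text):
    """Pair inline delimiters in one left-to-right pass, no backtracking.

    Returns (pairs, unmatched): pairs holds (delimiter, open_offset,
    close_offset) tuples, unmatched holds offsets of start-strings that
    were never closed.
    """
    stack = []      # open start-strings as (delimiter, offset)
    pairs = []
    unmatched = []
    for m in TOKEN.finditer(text):
        delim, pos, end = m.group(), m.start(), m.end()
        can_close = pos > 0 and not text[pos - 1].isspace()
        can_open = end < len(text) and not text[end].isspace()
        closed = False
        if can_close:
            # Close the nearest open start-string of the same kind.
            for i in range(len(stack) - 1, -1, -1):
                if stack[i][0] == delim:
                    # Anything opened after it is abandoned as unmatched.
                    unmatched.extend(off for _, off in stack[i + 1:])
                    pairs.append((delim, stack[i][1], pos))
                    del stack[i:]
                    closed = True
                    break
        if not closed and can_open:
            stack.append((delim, pos))
    unmatched.extend(off for _, off in stack)
    return pairs, sorted(unmatched)


text = "*emph *prob **strong ``literal``, end of strong**, end of emph*"
pairs, unmatched = match_inline(text)
print(unmatched)  # offsets of the start-strings left unmatched
```

Because each end-string is paired with the nearest open start-string of its kind, the final `*` pairs with the one before "prob" and the very first `*` is left over. Getting the other bracketing would require either looking ahead or revisiting earlier decisions, which is the non-deterministic approach the message argues against.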