From: Morten W. P. <mo...@ni...> - 2003-03-05 15:55:32
|
> Morten W. Petersen wrote:
>> I'm skimming through the "reStructuredText Markup Specification" and
>> I'm wondering how to remove certain elements from the markup, such as
>> inline literals. Any ideas?
>
> What do you mean? Remove marked-up text from a document, or remove
> functionality from the parser?

Remove functionality from the parser, make the parser ignore certain
elements (without using the :: markup).

Regards,

Morten W. Petersen

Technologies: Zope, Linux, Python, HTML, CSS, PHP
Homepage: http://www.nidelven-it.no
Phone number: (+47) 45 44 00 69
From: Morten W. P. <mo...@ni...> - 2003-03-05 16:31:06
|
> First, why? What's your use case? The simplest solution is, just
> don't use that markup in your documents. If it's important, make it
> a policy decision in your organization.

I'm trying to find a mix of simplified RST that can be used straight
away, and at most have 4-5 different markups (easily taught), for
example emphasis, bullet lists, blockquotes and simple tables.

At the same time, I want to remove the ability to mess things up
(``'' is turned into some sort of hyperlink) so that users can
do whatever they want when not using the agreed upon markups.

> Second, what should the parser do with the markup it ignores?

Don't touch it, 'pass it on' in other words.

> Third, which type of markup do you want to suppress, inline markup or
> block-level markup?

Both, I guess.

> The parser (implemented in module docutils.parsers.rst.states) deals
> with the two separately and differently. Block-level markup is
> recognized via an ordered dispatch table; remove the entry for the
> markup you don't want, and it won't be recognized. Inline
> markup (class Inliner) is recognized with a large regular expression
> (Inliner.patterns.initial) that's built from a data structure
> (Inliner.parts), plus a table for standalone/implicit markup (URLs
> etc.; Inliner.implicit_dispatch). Alter the data structure, rebuild
> and reinstall the regexp, and go from there. Of course, you should
> work on subclasses or instances so as not to step on toes.

OK, I'll have a look at that. Again, useful info. :)

> The parser is not set up for this to be really easy to do, because it
> hasn't been needed yet.

If there are more people out there like me, maybe it would be an idea
to refactor a bit to make parts of docutils more like a 'markup
parsing framework'? Make it easier to mix-n-match different
markups, which could lead to diverse markup 'dialects' of STX;
diverse enough to be used by common people without screwing
up, and diverse enough for the syntactic programmer.

> > (without using the :: markup).
>
> What does this mean?

Something like "making the parser ignore markup without literal blocks".

Regards,

Morten W. Petersen
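The ordered dispatch table mentioned in the quoted reply can be illustrated with a toy model. This is a minimal sketch, not docutils code: `BLOCK_TRANSITIONS`, `classify`, and `without` are hypothetical names, and the regexes are deliberately crude. It shows only the idea that dropping an entry makes a construct unrecognized, so its text is passed on untouched.

```python
import re

# Hypothetical ordered table of (name, pattern) pairs. The real docutils
# tables live in docutils.parsers.rst.states and look different.
BLOCK_TRANSITIONS = [
    ("bullet", re.compile(r"[-*+] ")),
    ("enumerator", re.compile(r"\d+\. ")),
    ("literal_block_marker", re.compile(r".*::$")),
]


def classify(line, transitions):
    """Return the name of the first entry whose pattern matches the line."""
    for name, pattern in transitions:
        if pattern.match(line):
            return name
    # Nothing matched: the line is passed on as ordinary text.
    return "text"


def without(transitions, *names):
    """Copy the table minus the named entries; the original is untouched."""
    return [(name, pattern) for name, pattern in transitions if name not in names]


# A reduced table that no longer recognizes enumerated lists.
reduced = without(BLOCK_TRANSITIONS, "enumerator")
print(classify("1. first step", BLOCK_TRANSITIONS))
print(classify("1. first step", reduced))
```

Working on a filtered copy rather than editing the shared table mirrors the advice to use subclasses or instances "so as not to step on toes".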
From: David G. <go...@py...> - 2003-03-05 19:04:16
|
Morten W. Petersen wrote:
>> First, why? What's your use case? The simplest solution is, just
>> don't use that markup in your documents. If it's important, make it
>> a policy decision in your organization.
>
> I'm trying to find a mix of simplified RST that can be used straight
> away, and at most have 4-5 different markups (easily taught), for
> example emphasis, bullet lists, blockquotes and simple tables.

What do you do when the user needs one more construct that isn't
included in your simplified set? Personally, I'd rather begin
educating with a small number of core constructs, but using the
full parser. Simultaneously, give references to the full docs
for those who are interested in going further.

We don't teach programming languages or natural languages using
limited subsets. We begin with simple concepts and build from there.

From PEP 287, Questions & Answers, #2:

    Is reStructuredText *too* rich?

    For specific applications or individuals, perhaps. In general, no.

    Since the very beginning, whenever a docstring markup syntax has
    been proposed on the Doc-SIG_, someone has complained about the
    lack of support for some construct or other. The reply was often
    something like, "These are docstrings we're talking about, and
    docstrings shouldn't have complex markup." The problem is that a
    construct that seems superfluous to one person may be absolutely
    essential to another.

    reStructuredText takes the opposite approach: it provides a rich
    set of implicit markup constructs (plus a generic extension
    mechanism for explicit markup), allowing for all kinds of
    documents. If the set of constructs is too rich for a particular
    application, the unused constructs can either be removed from the
    parser (via application-specific overrides) or simply omitted by
    convention.

I'd emphasize the final "or simply omitted by convention" as
preferable.

> At the same time, I want to remove the ability to mess things up
> ... so that users can
> do whatever they want when not using the agreed upon markups.
>
>> Second, what should the parser do with the markup it ignores?
>
> Don't touch it, 'pass it on' in other words.

Please consider carefully: would you really be doing your users a
service with this approach?

I think back to when I learned Japanese. The class spent the first
week learning hiragana, the basic Japanese syllable characters
(similar to an alphabet), so we wouldn't get hooked on using "roma-ji"
(roman letters, A-Z) as a crutch. When I lived in Japan, I found that
people who had learned Japanese with roma-ji reached a point --
learning written language -- beyond which it was very difficult to
progress, whereas those who learned with hiragana had a much easier
time.

Of course, reStructuredText is a much simpler language, but I believe
the same principles apply.

> (``'' is turned into some sort of hyperlink)

It's an error that's turned into a "problematic" element, with a link
to the diagnostic explanation. It's an error because of unbalanced
double-backquotes.

>> The parser is not set up for this to be really easy to do, because it
>> hasn't been needed yet.
>
> If there are more people out there like me, maybe it would be an idea
> to refactor a bit to make parts of docutils more like a 'markup
> parsing framework'? Make it easier to mix-n-match different
> markups, which could lead to diverse markup 'dialects' of STX;
> diverse enough to be used by common people without screwing
> up, and diverse enough for the syntactic programmer.

I don't think encouraging dialects is a good idea. It introduces
incompatibilities. This use case doesn't sufficiently justify such
changes to me. 'Course, patches are always welcome.

--
David Goodger http://starship.python.net/~goodger
Programmer/sysadmin for hire: http://starship.python.net/~goodger/cv
From: Morten W. P. <mo...@ni...> - 2003-03-06 16:01:53
|
> What do you do when the user needs one more construct that isn't
> included in your simplified set? Personally, I'd rather begin
> educating with a small number of core constructs, but using the
> full parser. Simultaneously, give references to the full docs
> for those who are interested in going further.
>
> We don't teach programming languages or natural languages using
> limited subsets. We begin with simple concepts and build from there.
>
> From PEP 287, Questions & Answers, #2:
>
>   Is reStructuredText *too* rich?
>
>   For specific applications or individuals, perhaps. In general, no.

Exactly. Although I appreciate your advice, I'd like to try out a
basic markup and see if it's enough. The idea isn't to teach them
a basic set and later teach them the whole thing; the idea is to find
a basic set that's easy to teach, and powerful enough for most (90%+)
of the things they need to mark up.

> I'd emphasize the final "or simply omitted by convention" as
> preferable.

And what happens when the user does something that's an error
according to RST but not part of the markup he/she has learned?

> > Don't touch it, 'pass it on' in other words.
>
> Please consider carefully: would you really be doing your users a
> service with this approach?

I believe so, yes. :)

> I think back to when I learned Japanese. [...]
> Of course, reStructuredText is a much simpler language, but I believe
> the same principles apply.

Yes, the principles might apply if the intention is to teach them
everything, starting with a basic markup. It isn't the intention.

>> If there are more people out there like me, maybe it would be an idea
>> to refactor a bit to make parts of docutils more like a 'markup
>> parsing framework'? Make it easier to mix-n-match different
>> markups, which could lead to diverse markup 'dialects' of STX;
>> diverse enough to be used by common people without screwing
>> up, and diverse enough for the syntactic programmer.
>
> I don't think encouraging dialects is a good idea. It introduces
> incompatibilities. This use case doesn't sufficiently justify such
> changes to me. 'Course, patches are always welcome.

I'd like to play around with it; if there's time, I will. :)

Regards,

Morten W. Petersen
From: David G. <go...@py...> - 2003-03-05 16:16:22
|
Morten W. Petersen wrote:
>> What do you mean? Remove marked-up text from a document, or remove
>> functionality from the parser?
>
> Remove functionality from the parser, make the parser ignore certain
> elements

First, why? What's your use case? The simplest solution is, just
don't use that markup in your documents. If it's important, make it
a policy decision in your organization.

Second, what should the parser do with the markup it ignores?

Third, which type of markup do you want to suppress, inline markup or
block-level markup?

The parser (implemented in module docutils.parsers.rst.states) deals
with the two separately and differently. Block-level markup is
recognized via an ordered dispatch table; remove the entry for the
markup you don't want, and it won't be recognized. Inline
markup (class Inliner) is recognized with a large regular expression
(Inliner.patterns.initial) that's built from a data structure
(Inliner.parts), plus a table for standalone/implicit markup (URLs
etc.; Inliner.implicit_dispatch). Alter the data structure, rebuild
and reinstall the regexp, and go from there. Of course, you should
work on subclasses or instances so as not to step on toes.

The parser is not set up for this to be really easy to do, because it
hasn't been needed yet.

> (without using the :: markup).

What does this mean?

--
David Goodger
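For the inline half, here is a rough illustration of "alter the data structure, rebuild and reinstall the regexp" done on a subclass. Nothing here is the docutils API: `ToyInliner`, its `parts` table, `build_pattern`, and `NoLiteralInliner` are hypothetical stand-ins, and the layout of the real `Inliner` may differ between docutils versions.

```python
import re


def build_pattern(parts):
    """Combine named start-strings into one alternation regex."""
    alternatives = [f"(?P<{name}>{regex})" for name, regex in parts.items()]
    return re.compile("|".join(alternatives))


class ToyInliner:
    # Hypothetical data structure: markup name -> start-string regex.
    # Order matters: "strong" must be tried before "emphasis".
    parts = {
        "strong": r"\*\*",
        "emphasis": r"\*(?!\*)",
        "literal": r"``",
    }

    def __init__(self):
        # The regex is rebuilt from whatever table the class provides.
        self.pattern = build_pattern(self.parts)

    def recognized(self, text):
        """List the markup names whose start-strings occur in the text."""
        return [match.lastgroup for match in self.pattern.finditer(text)]


class NoLiteralInliner(ToyInliner):
    # Subclass with the "literal" entry removed; the base class is untouched.
    parts = {
        name: regex
        for name, regex in ToyInliner.parts.items()
        if name != "literal"
    }


sample = "*emph* and ``literal``"
print(ToyInliner().recognized(sample))
print(NoLiteralInliner().recognized(sample))
```

In the subclass the double backquotes no longer match anything, so a caller that only acts on matches would leave that text as it is, which is the "pass it on" behaviour asked for earlier in the thread.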