From: Beni C. <cb...@us...> - 2004-11-21 07:23:57

I'm working on a normal implementation of my math hacks as directives and roles in the sandbox. It made me think of several issues:

1. A directive to set the default role is needed. What should be the scope? I presume affecting the rest of the file is simplest to implement and understand. It should store that information somewhere on the parser; currently it's global.

2. The current implementation of the ``role::`` directive is broken: the new role is registered globally and lives until the Python process exits. There is code in DocutilsTestSupport.py that masks this bug by flushing `roles._roles` before every document. This is not done by real code, only by testing (so e.g. buildhtml.py has a problem) and is a dirty approach anyway. And the same problem seems to happen with translations.

   I can see how to refactor the code into a class. My only question is what `register_canonical_role()` and especially `register_local_role()` should do. If an "application" should call `register_local_role()`, should it call it on a roles registry object associated with a specific parser? How should the API look towards it?

3. The current state of affairs for sandbox implementations of directives and roles is that one must wrap them in a special writer. That's very problematic because anyone who wants to use two sandbox extensions at the same time needs to write a special writer. We need some kind of plugins mechanism. Plugins should be selected by the user, not the document (for security reasons). It's best if they can be installed just by placing files somewhere; command-line options and environment variables are much less convenient because they can't be packaged for installation with things like RPM.

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends.
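
The refactoring asked about in point 2 could be pictured as a registry object owned by each parser: canonical roles are shared, while local roles and the default role from point 1 live only as long as the registry does. This is a minimal sketch under that assumption; `RoleRegistry`, `lookup` and the stand-in role functions are hypothetical names and not the docutils API.

```python
class RoleRegistry:
    """Hypothetical per-parser role registry; not the real docutils API."""

    # Canonical roles are shared by every registry (class-level state).
    _canonical = {}

    def __init__(self):
        # Local roles and the default role live only as long as this
        # registry, so they cannot leak into the next document.
        self._local = {}
        self.default_role = None

    @classmethod
    def register_canonical_role(cls, name, role_fn):
        cls._canonical[name] = role_fn

    def register_local_role(self, name, role_fn):
        self._local[name] = role_fn

    def lookup(self, name=None):
        """Return the role function for `name`, or None if unknown."""
        if name is None:
            name = self.default_role
        if name in self._local:
            return self._local[name]
        return self._canonical.get(name)


def emphasis_role(text):   # stand-in role function
    return "<em>%s</em>" % text


def math_role(text):       # stand-in role function
    return "<math>%s</math>" % text


RoleRegistry.register_canonical_role("emphasis", emphasis_role)

first_doc = RoleRegistry()
first_doc.register_local_role("math", math_role)
first_doc.default_role = "math"

second_doc = RoleRegistry()   # a fresh parser sees no leftover local roles
```

With one registry per parser, nothing needs to be flushed between documents, which is the job the test-support code does today.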
From: Felix W. <Fel...@gm...> - 2004-11-24 16:06:26

Beni Cherniavsky wrote:

> We need some kind of plugins mechanism.

True.

> Plugins should be selected by the user, not the document (for security
> reasons). It's best if they can be installed just by placing files
> somewhere

I agree. Should they be activated in the reST document? Like::

    .. use:: itex

(Or "plugin" or "import" instead of "use".)

IMO that would be better, because such a directive (like "use") indicates that a plugin is needed. Otherwise incompatible reST documents would result and people sometimes wouldn't even know *why* they are incompatible.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/
From: Felix W. <Fel...@gm...> - 2004-11-24 17:13:02

Felix Wiemann wrote:

> [plugins:] Should they be activated in the reST document?

I mean: "Should they have to be explicitly activated in the reST document in order to be used?" (As opposed to loading plugins automatically.)

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/
From: Beni C. <cb...@us...> - 2004-12-03 10:38:01

Felix Wiemann wrote:
> Beni Cherniavsky wrote:
>
>> Plugins should be selected by the user, not the document (for security
>> reasons). It's best if they can be installed just by placing files
>> somewhere
>
> I agree. Should they be activated in the reST document? Like::
>
>     .. use:: itex
>
> (Or "plugin" or "import" instead of "use".)

Let me add ``require`` to the alternatives. I'm not sure which is best. First we should decide whether the name in the directive names a given plugin (then it's hard to have multiple plugins implementing the same feature) or a given reST extension.

> IMO that would be better, because such a directive (like "use")
> indicates that a plugin is needed. Otherwise incompatible reST
> documents would result and people sometimes wouldn't even know *why*
> they are incompatible.

On one hand, it's important to have a hint in the document for the plugins needed to process it. On the other hand, I see at least some of these plugins being integrated into docutils core as they mature and then such a statement would become redundant. The ``from __future__`` approach in Python seems good: after becoming built-in, the directive is allowed and ignored. +1.

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends.
From: Felix W. <Fel...@gm...> - 2004-12-17 21:38:08

Beni Cherniavsky wrote:
> Felix Wiemann wrote:
>
>> Should they [plugins] be activated in the reST document? Like::
>>     .. use:: itex
>> (Or "plugin" or "import" instead of "use".)
>
> Let me add ``require`` to the alternatives.

``require`` looks fine.

> I'm not sure which is best. First we should decide whether the name in
> the directive names a given plugin (then it's hard to have multiple
> plugins implementing the same feature) or a given reST extension

I agree that it should refer to an extension [1]_, not a plugin, also because a plugin can contain multiple extensions.

.. [1] Though a Docutils, not a reST extension. :-)

>> IMO that would be better, because such a directive (like "use")
>> indicates that a plugin is needed.
>
> On one hand, it's important to have a hint in the document for the
> plugins needed to process it. On the other hand, I see at least some
> of these plugins being integrated into docutils core as they mature
> and then such a statement would become redundant. The ``from
> __future__`` approach in Python seems good: after becoming built-in,
> the directive is allowed and ignored.

+1.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/
From: David G. <go...@py...> - 2004-12-23 23:02:11

[Felix Wiemann]
>>> Should they [plugins] be activated in the reST document? Like::
>>>     .. use:: itex
>>> (Or "plugin" or "import" instead of "use".)

[Beni Cherniavsky]
>> Let me add ``require`` to the alternatives.

[Felix Wiemann]
> ``require`` looks fine.

+1

What happens if a required extension isn't installed?

>> I'm not sure which is best. First we should decide whether the name
>> in the directive names a given plugin (then it's hard to have
>> multiple plugins implementing the same feature) or a given reST
>> extension
>
> I agree that it should refer to an extension [1]_, not a plugin,
> also because a plugin can contain multiple extensions.
>
> .. [1] Though a Docutils, not a reST extension. :-)

+1

Each extension in a plugin module would have the equivalent of Emacs-Lisp's "provide". A plugin module could look something like this:

    def my_role(role, rawtext, text, lineno, inliner,
                options={}, content=[]):
        ...

    def my_extension(publisher):
        # check for applicable component here?
        publisher.parser.register_local_role('my_role', my_role)

    docutils_extensions = {'my_extension': my_extension}

"install_plugins" could be a method of the Publisher, called in docutils.core.publish_cmdline & .publish_programmatically, immediately before the calls to pub.publish.

>>> IMO that would be better, because such a directive (like "use")
>>> indicates that a plugin is needed.
>>
>> On one hand, it's important to have a hint in the document for the
>> plugins needed to process it. On the other hand, I see at least
>> some of these plugins being integrated into docutils core as they
>> mature and then such a statement would become redundant. The ``from
>> __future__`` approach in Python seems good: after becoming
>> built-in, the directive is allowed and ignored.
>
> +1.

+1. That means that we'd have to maintain a list of extensions that have become built-ins though. Not a biggie.

--
David Goodger <http://python.net/~goodger>
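
The pieces in the message above (a per-module `docutils_extensions` dict, a ``require`` directive, and a list of extensions that have become built-ins) might fit together roughly as follows. This is only a sketch under assumptions: `collect_extensions`, `require`, `BUILTIN_EXTENSIONS` and the status strings are hypothetical, and the plugin module and publisher are stand-ins built in place.

```python
import types

# Hypothetical list of extension names that have since become built-ins.
BUILTIN_EXTENSIONS = frozenset({"old-feature"})


def collect_extensions(plugin_modules):
    """Map extension name -> installer, read from each module's
    `docutils_extensions` dict (the "provide" equivalent)."""
    available = {}
    for module in plugin_modules:
        available.update(getattr(module, "docutils_extensions", {}))
    return available


def require(name, available, publisher):
    """Handle one ``.. require:: name``; return a status string."""
    if name in BUILTIN_EXTENSIONS:
        return "ignored"            # allowed and ignored, like __future__
    installer = available.get(name)
    if installer is None:
        return "error"              # would become a system message
    installer(publisher)
    return "installed"


# A stand-in plugin module and publisher, built in place for the demo.
plugin = types.ModuleType("demo_plugin")
installed_on = []
plugin.docutils_extensions = {"my_extension": installed_on.append}

publisher = object()
available = collect_extensions([plugin])
results = [require(n, available, publisher)
           for n in ("my_extension", "old-feature", "missing")]
```

Here the three requests come back as installed, ignored and error respectively, which covers the open question of what a missing extension should do only in the sense that it is reported rather than silently dropped.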
From: Beni C. <cb...@us...> - 2004-12-25 20:07:06

David Goodger wrote:
> [Felix Wiemann]
> >>> Should they [plugins] be activated in the reST document? Like::
> >>>     .. use:: itex
> >>> (Or "plugin" or "import" instead of "use".)
>
> [Beni Cherniavsky]
> >> Let me add ``require`` to the alternatives.
>
> [Felix Wiemann]
> > ``require`` looks fine.
>
> +1
>
> What happens if a required extension isn't installed?

An error is reported, which typically results in the document still being built but with a big red message (and there will probably be errors at the points where the extension is used). The error might link to some page on docutils.sf.net with explanations and a list of all known plugins.

> >> I'm not sure which is best. First we should decide whether the name
> >> in the directive names a given plugin (then it's hard to have
> >> multiple plugins implementing the same feature) or a given reST
> >> extension
> >
> > I agree that it should refer to an extension [1]_, not a plugin,
> > also because a plugin can contain multiple extensions.
> >
> > .. [1] Though a Docutils, not a reST extension. :-)
>
> +1
>
> Each extension in a plugin module would have the equivalent of
> Emacs-Lisp's "provide". A plugin module could look something like
> this:
>
>     def my_role(role, rawtext, text, lineno, inliner,
>                 options={}, content=[]):
>         ...
>
>     def my_extension(publisher):
>         # check for applicable component here?
>         publisher.parser.register_local_role('my_role', my_role)
>
>     docutils_extensions = {'my_extension': my_extension}
>
> "install_plugins" could be a method of the Publisher, called in
> docutils.core.publish_cmdline & .publish_programmatically, immediately
> before the calls to pub.publish.

This means you import all of them and have an extra layer to choose which to import. I'm not sure what's the win over a simple scheme where requiring ``foo`` uses the plugin module ``foo.py`` and calls `foo.install(publisher)`.

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends.
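
The simpler scheme suggested at the end of the message above, where requiring ``foo`` imports the module ``foo`` and calls its `install()`, can be sketched with the standard `importlib` module. The `require` helper, the error text and the stand-in plugin are assumptions for illustration; a real implementation would also have to restrict lookups to a user-chosen plugin directory, in line with the security point raised earlier in the thread.

```python
import importlib
import sys
import types


def require(name, publisher):
    """Import the plugin module called `name` and let it install itself.

    Returns None on success or an error string if the module is missing.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return 'required extension "%s" is not installed' % name
    module.install(publisher)
    return None


# Stand-in plugin registered directly in sys.modules so the demo needs no
# files on disk; a real plugin would be a foo.py on the plugin path.
foo = types.ModuleType("foo_demo_plugin")
foo.installed = []
foo.install = foo.installed.append
sys.modules["foo_demo_plugin"] = foo

publisher = object()
ok = require("foo_demo_plugin", publisher)
missing = require("no_such_demo_plugin", publisher)
```

Only the required module is imported, so nothing has to be scanned up front. The cost is that the extension name and the module name must coincide.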
From: Beni C. <cb...@us...> - 2004-12-25 22:53:05

Felix Wiemann wrote:
> Beni Cherniavsky wrote:
>
>> Felix Wiemann wrote:
>>
>>> Should they [plugins] be activated in the reST document? Like::
>>>     .. use:: itex
>>> (Or "plugin" or "import" instead of "use".)
>>
>> I'm not sure which is best. First we should decide whether the name in
>> the directive names a given plugin (then it's hard to have multiple
>> plugins implementing the same feature) or a given reST extension
>
> I agree that it should refer to an extension [1]_, not a plugin, also
> because a plugin can contain multiple extensions.
>
> .. [1] Though a Docutils, not a reST extension. :-)

Let's hash out the extension(s) <-> plugin(s) relationships on math support, assuming it'll be done entirely with plugins:

- We want multiple input formats (LaTeX, asciimath, Lout, etc.). Each format will have a directive and a role to write it.

- Each format can be translated to multiple formats. There can be alternative implementations for a single output format (e.g. LaTeX can be converted to MathML or to images for HTML output).

Consulting PEP 258, the output of the directive/role should probably be a custom math node, recording the input format either in the node type or as an attribute of the node. These custom nodes can be handled by writer-dependent transforms, outputting raw nodes.

A reST document containing latex-math should certainly not indicate that it needs a transform that e.g. outputs raw html nodes containing MathML. It should only ``.. require:: latex-math`` and be processable to any format with many plugins.

But somebody must indicate which transform gets it. For different output formats, plugins could not install themselves unless the writer is correct. For multiple implementations of the same format, this seems a perfect job for a config file. This way, plugin names (which are more transient than document content) are kept outside the document.

The way to select should not be individual enable/disable options because then merging several config files can easily lead to 0 or more than 1 being selected. An option giving an order of plugin priorities seems best.

BTW, I asked in another mail whether we need plugin names separate from the extension they implement. It is now clear to me that we do.

Should each competing plugin implement the same ``latex-math`` directive and role with redundant code? Apparently they have to, because each should be self-contained and we cannot assume all plugins translate it to the same node type. What happens if two plugins implementing the same directive are enabled simultaneously? I think this should signal an error - the user must select which one he wants.

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends.
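
The argument for a priority option over individual enable/disable switches can be made concrete: with an ordered list, merging config files always yields at most one winner per extension. A minimal sketch, assuming a later config file replaces an earlier priority list wholesale; the function names and plugin names are hypothetical.

```python
def merge_priorities(*config_lists):
    """Later config files override earlier ones wholesale."""
    merged = []
    for plugins in config_lists:
        if plugins:
            merged = list(plugins)
    return merged


def pick_plugin(candidates, priorities):
    """Return the first plugin in `priorities` that is among `candidates`,
    or None if the priority list names none of them."""
    for name in priorities:
        if name in candidates:
            return name
    return None


# Hypothetical plugins that both implement the latex-math extension.
candidates = {"latexmath-images", "latexmath-mathml"}

site_config = ["latexmath-images", "latexmath-mathml"]
user_config = ["latexmath-mathml", "latexmath-images"]

chosen = pick_plugin(candidates, merge_priorities(site_config, user_config))
site_only = pick_plugin(candidates, merge_priorities(site_config, []))
unlisted = pick_plugin(candidates, ["some-other-plugin"])
```

Two merged enable flags could both end up on or both off. An ordered list cannot select two plugins at once, and selecting none happens only when the list names no installed candidate.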
From: Felix W. <Fel...@gm...> - 2004-12-26 12:39:27

Beni Cherniavsky wrote:

> Let's hash out the extension(s) <-> plugin(s) relationships on math
> support, assuming it'll be done entirely with plugins:
>
> - We want multiple input formats (LaTeX, asciimath, Lout, etc.). Each format
>   will have a directive and a role to write it.
>
> - Each format can be translated to multiple formats. There can be alternative
>   implementations for a single output format (e.g. LaTeX can be converted to
>   MathML or to images for HTML output).
>
> Consulting PEP 258, the output of the directive/role should probably
> be a custom math node, recording the input format either in the node
> type or as an attribute of the node. These custom nodes can be
> handled by writer-dependent transforms, outputting raw nodes.

I wouldn't use transforms but rather visit_/depart_ methods.

> A reST document containing latex-math should certainly not indicate that
> it needs a transform that e.g. outputs raw html nodes containing
> MathML. It should only ``.. require:: latex-math``

Agreed.

> and be processable to any format with many plugins.

You could theoretically replace the plugin providing the requested extension, but I do not think it's a good idea to have an extension provided by more than one plugin.

I imagine having a plugin directory in which you store all your plugins (the *.py files). Docutils imports all plugins, scans them for available extensions and remembers the extensions. If an extension is .. require'd by a reST document, Docutils activates the requested extension. But if the extension is provided by two or more plugins, Docutils complains (using a system message) and refuses to load any of the matching extensions.

For how to implement support for multiple ways of generating output (say, MathML vs. images), I'd suggest using an option (like --latexmath-html-images and --latexmath-html-mathml). The visit_latexmath (and depart_latexmath) method supplied by the latex-math extension then checks which option has been set.

> But somebody must indicate which transform gets it. For different
> output formats, plugins could not install themselves unless the writer
> is correct.

I don't think plugins or extensions should (have to) install themselves at all. It's better to have the extension provide an attribute which basically says that the extension consists of the parts X, Y and Z (e.g., a role, a directive, and writer support). And the extension parts X, Y and Z each have an attribute stating which component(s) it belongs to. (E.g., the role would belong to the "restructuredtext parser".) Docutils does the job of determining the matching component and registering the extension part.

(See sandbox/felixwiemann/plugins/interface.py for an example of what this might look like from a plugin's point of view.)

> For multiple implementations of the same format, this seems a perfect
> job for a config file. This way, plugin names (which are more
> transient than document content) are kept outside the document.

IMO plugin names shouldn't occur anyway, not even in the config file. The plugins should be picked up by Docutils automatically. (But see above on how I think that should work.)

> The way to select should not be individual enable/disable options
> because then merging several config files can easily lead to 0 or more
> than 1 being selected.

If you don't enable or disable transforms (as you suggested above) but use a visit_ and a depart_ method, the extension simply has to provide options which have a common target and thus override the previous state. No problem with merging.

> Should each competing plugin implement the same ``latex-math``
> directive and role with redundant code?

(Plugins don't provide directives. Extensions do.)

Generally, if two extensions partly need the same code, they can ask Docutils to provide the other extension. Or, more probably, you'd implement the two extensions in the same plugin and let them share common functions and classes.

> What happens if two plugins implementing the same directive are
> enabled simultaneously?

s/plugins/extensions/

That's an error, of course. This means that the two extensions are simply incompatible. At least when used in conjunction with the reST parser (because otherwise the directive isn't registered anywhere).

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/
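
The declarative model described above, where an extension only lists its parts and each part names the component it belongs to, might look like the following. This is a sketch under assumptions and is not the contents of the interface.py file mentioned in the message: `Part`, `Extension`, `activate` and the component names are all hypothetical.

```python
class Part:
    """One piece of an extension; `component` names where it belongs."""

    def __init__(self, name, component):
        self.name = name
        self.component = component


class Extension:
    """An extension only lists its parts; it never registers anything."""

    def __init__(self, name, parts):
        self.name = name
        self.parts = parts


def activate(extension, components):
    """Register each part with the matching active component.

    `components` maps a component name to the list of part names
    registered on it. Parts for components not in use are skipped.
    """
    skipped = []
    for part in extension.parts:
        if part.component in components:
            components[part.component].append(part.name)
        else:
            skipped.append(part.name)
    return skipped


latex_math = Extension("latex-math", [
    Part("latex-math role", "restructuredtext parser"),
    Part("latex-math directive", "restructuredtext parser"),
    Part("latexmath HTML support", "html writer"),
    Part("latexmath LaTeX support", "latex writer"),
])

# This run uses the reST parser and the HTML writer only.
active = {"restructuredtext parser": [], "html writer": []}
skipped = activate(latex_math, active)
```

The extension author writes no registration code at all. The framework decides which parts apply to the current run and leaves the rest untouched.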
From: Beni C. <cb...@us...> - 2005-01-02 00:07:31

Felix Wiemann wrote:
> Beni Cherniavsky wrote:
>
>> Let's hash out the extension(s) <-> plugin(s) relationships on math
>> support, assuming it'll be done entirely with plugins:
>>
>> - We want multiple input formats (LaTeX, asciimath, Lout, etc.). Each format
>>   will have a directive and a role to write it.
>>
>> - Each format can be translated to multiple formats. There can be alternative
>>   implementations for a single output format (e.g. LaTeX can be converted to
>>   MathML or to images for HTML output).
>>
>> Consulting PEP 258, the output of the directive/role should probably
>> be a custom math node, recording the input format either in the node
>> type or as an attribute of the node. These custom nodes can be
>> handled by writer-dependent transforms, outputting raw nodes.
>
> I wouldn't use transforms but rather visit_/depart_ methods.

I was assuming per PEP 258 that it's bad to use non-standard nodes between the transformer and the writer. But it is indeed cleaner so let's say that a plugin can extend the list of "standard" nodes ;-).

>> A reST document containing latex-math should certainly not indicate that
>> it needs a transform that e.g. outputs raw html nodes containing
>> MathML. It should only ``.. require:: latex-math``
>
> Agreed.
>
>> and be processable to any format with many plugins.
>
> You could theoretically replace the plugin providing the requested
> extension, but I do not think it's a good idea to have an extension
> provided by more than one plugin.
>
> I imagine to have a plugin directory in which you store all your plugins
> (the *.py files). Docutils imports all plugins, scans them for
> available extensions and remembers the extensions. If an extension is
> .. require'd by a reST document, Docutils activates the requested
> extension. But if the extension is provided by two or more plugins,
> Docutils complains (using a system message) and refuses to load any of
> the matching extensions.
>
> For how to implement support for multiple ways of generating output
> (say, MathML vs. images), I'd suggest using an option (like
> --latexmath-html-images and --latexmath-html-mathml). The
> visit_latexmath (and depart_latexmath) method supplied by the latex-math
> extension then checks which option has been set.

Having such an option would be nice but it requires all latex-math conversions to be implemented in the same plugin. I don't think we can assume that. LaTeX->MathML and LaTeX->images converters will probably be implemented by different people. Even if they are implemented as one plugin, any new output variant will have to be added as part of the same plugin. That would return us to square one: to allow independent implementations, the latex-math plugin would have to implement a system of sub-plugins...

>> But somebody must indicate which transform gets it. For different
>> output formats, plugins could not install themselves unless the writer
>> is correct.
>
> I don't think plugins or extensions should (have to) install themselves
> at all. It's better to have the extension provide an attribute which
> basically says that the extension consists of the parts X, Y and Z
> (e.g., a role, a directive, and writer support). And the extension
> parts X, Y and Z each have an attribute stating which component(s) it
> belongs to. (E.g., the role would belong to the "restructuredtext
> parser".) Docutils does the job of determining the matching component
> and registering the extension part.

+1

>> For multiple implementations of the same format, this seems a perfect
>> job for a config file. This way, plugin names (which are more
>> transient than document content) are kept outside the document.
>
> IMO plugin names shouldn't occur anyway, not even in the config file.
> The plugins should be picked up by Docutils automatically. (But see
> above on how I think that should work.)

Sure, as long as you have at most one plugin providing the same extension, you don't need plugin names. In fact, they don't have to have any name distinct from the extension name.

Here is what I propose to allow multiple plugins exclusively implementing the same extension, without complicating the simple case:

- There can be 0 or more plugins installed that implement any single extension.

- When an extension is required by a file, docutils checks which plugins implement it.

  - If there is exactly one, it is silently used.

  - If there are none, it's an error.

  - If there is more than one, the command line / config file must choose which one to use - otherwise it's an error.

    - The choice is done by plugin name, which is just the file/directory name under which it is installed (so that even if two people from different galaxies write two plugins with the same name, it's trivial for the user to install them under distinct names).

    - Of course, it's better not to rename them unless necessary, otherwise config files referring to them become non-portable.

    - An alternative would be to use Java-style or xmlns-style URL-based plugin IDs. I don't think it's worth the trouble. The other upside of helping to find it on the net is negligible given Google.

>> The way to select should not be individual enable/disable options
>> because then merging several config files can easily lead to 0 or more
>> than 1 being selected.
>
> If you don't enable or disable transforms (as you suggested above) but
> use a visit_ and a depart_ method, the extension simply has to provide
> options which have a common target and thus override the previous state.
> No problem with merging.

No problem for any single plugin. I was talking solely of the options (or equivalently, command-line args) needed to select a plugin among several that implement the same extension.

>> Should each competing plugin implement the same ``latex-math``
>> directive and role with redundant code?
>
> (Plugins don't provide directives. Extensions do.)
>
> Generally, if two extensions partly need the same code, they can either
> ask Docutils to provide the other extension. Or, more probably, you'd
> implement the two extensions in the same plugin and let them share
> common functions and classes.
>
>> What happens if two plugins implementing the same directive are
>> enabled simultaneously?
>
> s/plugins/extensions/

I'm not sure I get your model right here. Do you mean that a "plugin" is a bundle of code distributed together that consists of one or more "extensions", each independently requirable by the user? To me, that is a bundle of plugins distributed together, whereas an "extension" is not code at all - it's the addition to reST syntax implemented by them (e.g. the ability to use ``latex-math`` directives and roles).

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends.
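
The selection rules proposed in the message above (exactly one provider is used silently, none is an error, several require an explicit choice by plugin name) reduce to a small function. A minimal sketch; `resolve`, the shape of `installed` and the plugin names are hypothetical.

```python
def resolve(extension, installed, chosen=None):
    """Return the plugin name to use for `extension`.

    `installed` maps plugin name -> set of extension names it implements.
    `chosen` is the plugin named on the command line or in a config file.
    Raises LookupError when the rules call for an error.
    """
    providers = sorted(name for name, exts in installed.items()
                       if extension in exts)
    if not providers:
        raise LookupError("no plugin implements %r" % extension)
    if len(providers) == 1:
        return providers[0]          # the simple case needs no config
    if chosen in providers:
        return chosen
    raise LookupError("%r is implemented by %s; choose one"
                      % (extension, ", ".join(providers)))


# Plugin names are just the file names they were installed under.
installed = {
    "itex-mathml": {"latex-math"},
    "tex-images": {"latex-math"},
    "asciimath": {"ascii-math"},
}

single = resolve("ascii-math", installed)
picked = resolve("latex-math", installed, chosen="tex-images")
```

Calling `resolve("latex-math", installed)` without a choice, or asking for an extension nobody implements, raises `LookupError` in this sketch; in Docutils that would presumably surface as a system message.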
From: Bob M. <bob...@mc...> - 2004-12-26 20:56:44

I'm going to jump in here...I've been following this thread, and I am the author/maintainer of latexwiki. I would like to get my latex code into docutils as a plugin/extension. Right now latexwiki will only do latex with stx because it can be placed in-line without too much trouble. But I would like a reST + LaTeX mode as well, and much code can be shared.

My code generates in-line images for HTML output and aligns them, and also interfaces with itex for MathML output.

What work is being done on this, and by whom, and is there any test code I can use as a starting point? Please forgive my ignorance of the inner workings of docutils...reST is my weak point in this discussion. ;)

Beni Cherniavsky [cb...@us...] wrote:
> Let's hash out the extension(s) <-> plugin(s) relationships on math
> support, assuming it'll be done entirely with plugins:
>
> - We want multiple input formats (LaTeX, asciimath, Lout, etc.). Each
>   format will have a directive and a role to write it.

Do we really want asciimath or Lout? There is exactly one well-understood, widely used syntax for math, and that is LaTeX. Lout appears to have near-zero usage, and I'd be happy to give you a long list of what is wrong with the asciimath syntax from a mathematical perspective.

> - Each format can be translated to multiple formats. There can be
>   alternative implementations for a single output format (e.g. LaTeX can
>   be converted to MathML or to images for HTML output).

This is not really a "translation" but rather a function of the output. I don't see how a plugin/extension could possibly be required to know about all possible output formats. Yes, xml, html and latex output can be handled, but what about when someone writes a pdf or svg output target? Such a "plugin" is fundamentally intertwined with the output module.

> Consulting PEP 258, the output of the directive/role should probably be a
> custom math node, recording the input format either in the node type or as
> an attribute of the node. These custom nodes can be handled by
> writer-dependent transforms, outputting raw nodes.

I think 'math' should be a node.

> A reST document containing latex-math should certainly not indicate that it
> needs a transform that e.g. outputs raw html nodes containing MathML. It
> should only ``.. require:: latex-math`` and be processable to any format with
> many plugins.

At the input document level, this is an extension of the reST input syntax, and we can parse it (especially if we require exactly one math syntax) into the document tree, ignorant of whether the output module can actually deal with it.

The output plugins should therefore be of the form:

    math->html4.0 + gif
    math->xhtml + png
    math->xhtml + gif
    math->xml + mathml

i.e. they are output-only, and there are many, one for each possible type of output. (And as mentioned before, whether you use PNGs or GIFs should be configured in a config file or as a command line option that gets passed to the writer -- not part of the input document.)

> But somebody must indicate which transform gets it. For different output
> formats, plugins could not install themselves unless the writer is correct.

Does it do any harm to parse the math input directive into the document tree? Can the output simply generate a warning: "don't know what to do with node 'math'...skipping"?

> For multiple implementations of the same format, this seems a perfect job
> for a config file.

Or command line options to rest2html. ;)

> BTW, I asked in another mail whether we need plugin names separate from the
> extension they implement. It is now clear to me that we do.

Yes, see the above list of math transforms. I could have the math->xml+mathml output-plugin installed, but not the others.

> Should each competing plugin implement the same ``latex-math``
> directive and role with redundant code? Apparently they have to, because
> each should be self-contained and we cannot assume all plugins translate it
> to the same node type. What happens if two plugins implementing the same
> directive are enabled simultaneously? I think this should signal an error -
> the user must select which one he wants.

I think a 'math' directive should be part of the base docutils, and the plugins should only provide the output side. Or perhaps latex-input and mathml-output can be two separate plugins, the latter depending on the former.

Of course if you insist on alternative math input syntaxes then the input and output plugins must be separated. But again I think latex is the only reasonable one to implement. Note that computer algebra systems (the other major source of mathematical content) can all generate latex output.

This whole discussion is stretching my abstract-discussion abilities. I need something to hack on to understand it better...

--
Cheers,
Bob McElrath [Univ. of California at Davis, Department of Physics]

"It's not the people who vote that count. It's the people who count the votes." -- Joseph Stalin
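
The suggestion above, that the parser always puts a 'math' node in the tree and a writer without math support just warns and skips it, can be illustrated with a toy visitor. Everything here is an assumption for illustration: `Node`, `HTMLWriter` and `HTMLWriterWithMath` are hypothetical classes and not docutils writers, and the math markup is a placeholder.

```python
class Node:
    def __init__(self, kind, text):
        self.kind = kind
        self.text = text


class HTMLWriter:
    """Toy writer: dispatches on node kind and warns on unknown kinds."""

    def __init__(self):
        self.output = []
        self.warnings = []

    def write(self, nodes):
        for node in nodes:
            handler = getattr(self, "visit_" + node.kind, None)
            if handler is None:
                self.warnings.append(
                    "don't know what to do with node '%s'...skipping"
                    % node.kind)
            else:
                handler(node)
        return "".join(self.output)

    def visit_paragraph(self, node):
        self.output.append("<p>%s</p>" % node.text)


class HTMLWriterWithMath(HTMLWriter):
    """The same writer with an output-side math plugin added."""

    def visit_math(self, node):
        # Placeholder markup; a real plugin would convert the LaTeX.
        self.output.append('<span class="math">%s</span>' % node.text)


document = [Node("paragraph", "Euler:"), Node("math", "e^{i\\pi} = -1")]

plain = HTMLWriter()
plain_html = plain.write(document)

with_math = HTMLWriterWithMath()
math_html = with_math.write(document)
```

The input side is identical in both runs. Only the presence of an output-side handler decides whether the math appears or a warning is recorded.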
From: Felix W. <Fel...@gm...> - 2004-12-25 21:26:58

David Goodger wrote:

> [.. require:: extension]
>
> What happens if a required extension isn't installed?

System message level 3 or 4, I'd say.

> Each extension in a plugin module would have the equivalent of
> Emacs-Lisp's "provide". A plugin module could look something like
> this:
>
>     def my_role(role, rawtext, text, lineno, inliner,
>                 options={}, content=[]):
>         ...
>
>     def my_extension(publisher):
>         # check for applicable component here?
>         publisher.parser.register_local_role('my_role', my_role)
>
>     docutils_extensions = {'my_extension': my_extension}
>
> "install_plugins" could be a method of the Publisher, called in
> docutils.core.publish_cmdline & .publish_programmatically, immediately
> before the calls to pub.publish.

I'd rather do it all (role and extension) using classes. I'm certainly not an OOP fanatic, but the functions simply lack extensibility. And these copy'n'paste parameter lists are bad; they can be replaced by instance variables. Furthermore, functions lack the introspection feature (using issubclass) to automatically pick up all available extensions (Docutils finds all subclasses of Extension in a module) or to recognize the type of an extension-component (the reST parser finds out whether the class being registered is a subclass of Role or Directive).

Extensions shouldn't need to register their components (like with "publisher.parser.register_local_role" above). That's too complicated, and it's unnecessary because it is theoretically sufficient for an extension to tell Docutils which components it has. Docutils then can do the registering.

I checked in an example plugin at sandbox/felixwiemann/interface.py as a kind of an informal design proposal, which demonstrates how I imagine writing a plugin, following the objective that contributing (i.e. writing a plugin) should be made as easy as possible.

Please feel free to add your comments in the source (of interface.py), make any changes you want, and make a mess as you like. (I'll take the diff in the checkin mail as your reply to my proposal.) Or you can write a reply to the checkin mail generated by the addition of interface.py, if you prefer that.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/
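
The introspection argument above (find every subclass of Extension in a module, and tell roles from directives with issubclass) can be shown in a few lines. This sketch does not reproduce the interface.py proposal: the `Extension`, `Role` and `Directive` base classes, the `components` attribute and the helper functions are hypothetical, and the plugin module is a stand-in built in place.

```python
import inspect
import types


class Extension:
    """Hypothetical base class every extension derives from."""


class Role:
    """Hypothetical base class for role components."""


class Directive:
    """Hypothetical base class for directive components."""


def find_extensions(module):
    """Return the Extension subclasses found in `module`, sorted by name."""
    return sorted(
        (obj for _, obj in inspect.getmembers(module, inspect.isclass)
         if issubclass(obj, Extension) and obj is not Extension),
        key=lambda cls: cls.__name__)


def component_kind(cls):
    """Tell the parser what is being registered: a role or a directive."""
    if issubclass(cls, Role):
        return "role"
    if issubclass(cls, Directive):
        return "directive"
    return "unknown"


class MathRole(Role):
    pass


class MathExtension(Extension):
    components = [MathRole]


# Stand-in for an imported plugin module.
plugin = types.ModuleType("demo_plugin")
plugin.Extension = Extension          # as if imported by the plugin
plugin.MathExtension = MathExtension
plugin.helper = lambda: None          # non-class members are ignored

found = find_extensions(plugin)
kinds = [component_kind(c) for c in found[0].components]
```

A plugin written this way contains no registration calls. Declaring the classes is enough for the framework to discover and classify them.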
From: Felix W. <Fel...@gm...> - 2004-12-25 21:57:00

Beni Cherniavsky wrote:

> David Goodger wrote:
>
>> "install_plugins" could be a method of the Publisher, called in
>> docutils.core.publish_cmdline & .publish_programmatically,
>> immediately before the calls to pub.publish.

Do I understand you correctly that you want to import all available plugins using the install_plugins() method? (If not, then what's the purpose of install_plugins()?)

By the way, would you agree on the terminology plugin=module, extension=something-that-is-supplied-by-a-plugin?

We also need another method that the reST parser calls when encountering a "require" directive. E.g. "install_extension" (or "require_extension"), with the name of the extension supplied as first parameter. (This method should live in the Publisher, too, I presume.)

> This means you import all of them and have an extra layer to choose
> which to import. I'm not sure what's the win over a simple scheme
> where requiring ``foo`` uses the plugin module ``foo.py`` and calls
> `foo.install(publisher)`.

One plugin (i.e. a module, e.g. foo.py) may provide multiple extensions (e.g. "foo", "foo-2" and "foobar"). So you cannot guess the module name (i.e. plugin name) from the extension name and thus all plugins have to be imported to determine the available extensions.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/
From: Felix W. <Fel...@gm...> - 2005-01-13 21:15:15
|
Beni Cherniavsky wrote:

> I was assuming per PEP 258 that it's bad to use non-standard nodes
> between the transformer and the writer.  But it is indeed cleaner so
> let's say that a plugin can extend the list of "standard" nodes ;-).

Yes.

>> For how to implement support for multiple ways of generating output
>> (say, MathML vs. images), I'd suggest using an option (like
>> --latexmath-html-images and --latexmath-html-mathml).  The
>> visit_latexmath (and depart_latexmath) method supplied by the
>> latex-math extension then checks which option has been set.
>
> Having such an option would be nice but it requires all latex-math
> conversions to be implemented in the same plugin.  I don't think we
> can assume that.

True. I hadn't thought of that, but it's an important point. We'll have to keep that in mind while designing the extension interface. See below for my comments.

(I just checked in a new revision of the extension-interface-design file, incorporating design changes which I deemed necessary after reading your posting. Sorry for changing things all the time. But I think it's getting better...)

> - If there are more than one [plugins implementing the same
>   extension], the command line / config file must choose which one
>   to use - otherwise it's an error.

Generally yes, but I'd say: "If there are more than one writer extensions implementing support for the same node, ...".

... because extensions are currently (in the interface in my sandbox) only extending *one* component. I.e., there is an extension which implements support for one writer for a given node and does nothing else. In the math example, there might be a MathHTMLImages and a MathHTMLMathML extension. So the actual math support is implemented using *many* extensions, not only one. (Further extensions required for math support might be a MathRole extension, which implements the role in the reST parser, and a MathLaTeX extension, which implements math support for the LaTeX writer.)

For another example, please have a look at <http://docutils.sourceforge.net/sandbox/felixwiemann/plugins/interface.py>. For the :key: role plugin (which this file implements), there are *three* extensions: A ParserExtension subclass (the KeyRole) and two WriterExtension subclasses (KeyNodeHTMLSupport and KeyNodeLaTeXSupport). When writing ".. require:: key" in reST, only the KeyRole-extension is activated (= registered). The KeyNodeHTMLSupport extension is only registered if the HTML writer encounters a KeyNode (which it cannot handle natively). Same for LaTeX.

> - The choice is done by plugin name, which is just the
>   file/directory name which it is installed

No. That's bad. The two implementation choices might be implemented in the same file. IMO a plugin, as a physical unit, should not be used like a logical unit.

When the issue of "competing" implementations arises (and I'm quite sure it will, sooner or later), we can add support for it. E.g., we could add a `name` attribute to the two competing WriterExtensions and then expect the user to pass a --prefer=math-html-mathml or --prefer=math-html-images option, which just says which extension should be preferred if there are two extensions matching a given request.

I don't think we should add support immediately, because it complicates things a little, it would be kind of over-design (contrary to the XP spirit), it's not difficult to add support for competing extensions in a compatible way later, and I want to get plugin support as soon (and quickly) as possible.

> I'm not sure I get your model right here.  Do you mean that a "plugin"
> is a bundle of code distributed together that consists of one or more
> "extensions", each independently requirable by the user?

Basically yes, except that only *some* extensions (usually ParserExtensions) need to be explicitly required by the user. Others are registered always, or implicitly when an unknown node is encountered.

> To me, that is a bundle of plugins distributed together, whereas an
> "extension" is not code at all - it's the addition to reST syntax
> implemented by them (e.g. the ability to use ``latex-math`` directives
> and roles).

This definition has the disadvantage of centering around the (reST) parser, which is unnecessarily specific. I don't think it can work that way.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/ |
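The deferred "--prefer" idea could look roughly like the sketch below. It is a guess at the shape of such a mechanism, not code from the sandbox: the `writer` and `node_name` attributes and the `select_writer_extension` function are hypothetical, and only the `name` attribute, the MathHTMLImages/MathHTMLMathML class names, and the option values come from the message above.

```python
class WriterExtension:
    """Hypothetical base: adds support for one node type to one writer."""
    name = None
    writer = None
    node_name = None


class MathHTMLImages(WriterExtension):
    name = "math-html-images"
    writer = "html"
    node_name = "math"


class MathHTMLMathML(WriterExtension):
    name = "math-html-mathml"
    writer = "html"
    node_name = "math"


class MathLaTeX(WriterExtension):
    name = "math-latex"
    writer = "latex"
    node_name = "math"


def select_writer_extension(candidates, writer, node_name, prefer=()):
    """Pick the extension that should handle `node_name` for `writer`.

    `prefer` holds the values of any --prefer options given by the user.
    """
    matches = [c for c in candidates
               if c.writer == writer and c.node_name == node_name]
    if len(matches) <= 1:
        # No competition: either nothing can handle the node or one extension can.
        return matches[0] if matches else None
    preferred = [c for c in matches if c.name in prefer]
    if len(preferred) == 1:
        return preferred[0]
    raise ValueError(
        "competing extensions for %s/%s: %s; choose one with --prefer"
        % (writer, node_name, ", ".join(sorted(c.name for c in matches))))


available = [MathHTMLImages, MathHTMLMathML, MathLaTeX]
chosen = select_writer_extension(
    available, "html", "math", prefer=["math-html-mathml"])
print(chosen.name)
```

Because the preference is only consulted when two extensions match the same request, adding it later would not change behaviour for plugins that have no competitor, which fits the claim that support can be added compatibly.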
From: Felix W. <Fel...@gm...> - 2005-01-14 23:42:33
|
Oh sorry, this posting seems to have suffered from a 'little' delay of ten days... -- When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/ |
From: David G. <go...@py...> - 2004-12-23 22:58:45
|
[Beni Cherniavsky]
> I'm working on a normal implementation of my math hacks as directives
> and roles in the sandbox.  It made me think of several issues:
>
> 1. A directive to set the default role is needed.  What should be
>    the scope?  I presume affecting the rest of the file is simplest
>    to implement and understand.

It should affect the remainder of the current document, since a document may consist of several included files. Minor point.

>    It should store that information somewhere on the parser,
>    currently it's global.

True, +1.

> 2. The current implementation of the ``role::`` directive is broken:
>    the new role is registered globally and lives until the python
>    process exits.  There is code in DocutilsTestSupport.py that
>    masks this bug by flushing `roles._roles` before every document.
>    This is not done by real code, only by testing (so
>    e.g. buildhtml.py has a problem) and is a dirty approach anyway.
>    And the same problem seems to happen with translations.

What do you mean about translations?

>    I can see how to refactor the code into a class.  My only
>    question is what should `register_canonical_role()` and
>    especially `register_local_role()` do?

They should be methods of the role registry object, and should modify that object's attributes. The existing _role_registry dictionary could hold the initial instance of a role registry object, which could imitate a dictionary for backward compatibility.

>    If an "application" should call `register_local_role()`, should
>    it call on a roles registry object associated with a specific
>    parser?

Yes, sounds good.

>    How should the API look towards it?

An application "rolling its own" Publisher would have access to the Publisher object and therefore to the Parser object and other components. See my next message for a possible API.

An application using the publish_* convenience functions has no such access though. Perhaps we should add an application callback function to these functions' parameters (default none), so that applications can have access to the underlying components. Thoughts?

> 3. The current state of affairs for sandbox implementations of
>    directives and roles is that one must wrap them in a special
>    writer.  That's very problematic because if one wants to use two
>    sandbox extensions at the same time, he needs to write a special
>    writer.  We need some kind of plugins mechanism.

+1

>    Plugins should be selected by the user, not the document (for
>    security reasons).  It's best if they can be installed just by
>    placing files somewhere;

+1. The location could be a runtime setting (set by config file or command line option) with a reasonable default (what should that be?). An environment variable too (DOCUTILS_PLUGINS_PATH)?

--
David Goodger <http://python.net/~goodger> |
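A registry object of the kind described above might be sketched like this. It is a toy, not the Docutils implementation: the `RoleRegistry` class, its internal attributes, and the one-argument role functions are assumptions made for brevity (real role functions take the longer parameter list quoted earlier in the thread). Only the two method names come from the existing API under discussion.

```python
class RoleRegistry:
    """Hypothetical per-parser registry that can also be read like a dictionary."""

    def __init__(self, canonical=None):
        # Copy the built-in roles so that no two registries share mutable state.
        self._canonical = dict(canonical or {})
        self._local = {}

    def register_canonical_role(self, name, role_fn):
        self._canonical[name] = role_fn

    def register_local_role(self, name, role_fn):
        self._local[name] = role_fn

    # Dictionary imitation for code that used to index the old global dict.
    def __getitem__(self, name):
        if name in self._local:
            return self._local[name]
        return self._canonical[name]

    def __contains__(self, name):
        return name in self._local or name in self._canonical


def emphasis_role(text):
    return "<em>%s</em>" % text


def custom_role(text):
    return "<em class='custom'>%s</em>" % text


BUILTIN_ROLES = {"emphasis": emphasis_role}

# One registry per parser run: a role defined while processing the first
# document is invisible to the second.
first = RoleRegistry(BUILTIN_ROLES)
first.register_local_role("custom", custom_role)
second = RoleRegistry(BUILTIN_ROLES)

print("custom" in first, "custom" in second, "emphasis" in second)
```

With this arrangement an application holding a Publisher would call `register_local_role()` on that parser's registry, and the registry would be discarded along with the parser at the end of the document.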
From: Beni C. <cb...@us...> - 2004-12-25 19:36:41
|
David Goodger wrote:
> [Beni Cherniavsky]
> > I'm working on a normal implementation of my math hacks as directives
> > and roles in the sandbox.  It made me think of several issues:
> >
> > 1. A directive to set the default role is needed.  What should be
> >    the scope?  I presume affecting the rest of the file is simplest
> >    to implement and understand.
>
> It should affect the remainder of the current document, since a
> document may consist of several included files.  Minor point.

Sure, that's what I meant. Wording mistake on my part.

> >    It should store that information somewhere on the parser,
> >    currently it's global.
>
> True, +1.
>
> > 2. The current implementation of the ``role::`` directive is broken:
> >    the new role is registered globally and lives until the python
> >    process exits.  There is code in DocutilsTestSupport.py that
> >    masks this bug by flushing `roles._roles` before every document.
> >    This is not done by real code, only by testing (so
> >    e.g. buildhtml.py has a problem) and is a dirty approach anyway.
> >    And the same problem seems to happen with translations.
>
> What do you mean about translations?

`roles.role()` caches the resolved role as _role_registry[role_name] where role_name can be a translated name. Thus after processing a document in language X, you can use any X role name that appeared there in subsequent documents even if you process them with another language-code. See attached demonstration for both bugs. I'll add a test for the translation bugs when I commit a fix.

> >    I can see how to refactor the code into a class.  My only
> >    question is what should `register_canonical_role()` and
> >    especially `register_local_role()` do?
>
> They should be methods of the role registry object, and should modify
> that object's attributes.  The existing _role_registry dictionary
> could hold the initial instance of a role registry object, which could
> imitate a dictionary for backward compatibility.

I don't see any reason to emulate a dictionary for compatibility. Nobody should have accessed it directly. I should probably retain the global registration function for compatibility, issuing a warning(?).

> >    If an "application" should call `register_local_role()`, should
> >    it call on a roles registry object associated with a specific
> >    parser?
>
> Yes, sounds good.
>
> >    How should the API look towards it?
>
> An application "rolling its own" Publisher would have access to the
> Publisher object and therefore to the Parser object and other
> components.  See my next message for a possible API.
>
> An application using the publish_* convenience functions has no such
> access though.  Perhaps we should add an application callback function
> to these functions' parameters (default none), so that applications
> can have access to the underlying components.  Thoughts?

I think it's simplest to let the application pass a list of plugins to the publish_* convenience functions. This can wait until we implement plugins.

> > 3. The current state of affairs for sandbox implementations of
> >    directives and roles is that one must wrap them in a special
> >    writer.  That's very problematic because if one wants to use two
> >    sandbox extensions at the same time, he needs to write a special
> >    writer.  We need some kind of plugins mechanism.
>
> +1
>
> >    Plugins should be selected by the user, not the document (for
> >    security reasons).  It's best if they can be installed just by
> >    placing files somewhere;
>
> +1.  The location could be a runtime setting (set by config file or
> command line option) with a reasonable default (what should that be?).
> An environment variable too (DOCUTILS_PLUGINS_PATH)?

I'm not sure that getting the location from a config file is wise from a security point of view. AFAIK there are currently no options settable from the config file that could cause anything more severe than reading a file and including it in the document (e.g. inlined stylesheets). So processing foreign files with foreign config files is pretty safe. An extension path config setting would allow a config file to execute any Python file.

An environment variable would be OK. I'm just not sure we need any variable beyond PYTHONPATH. Plugin ``foo`` could just import ``docutils.plugins.foo`` or ``docutils_plugin_foo`` or something like that (not sure now, need to think out the consequences). If we do have a variable, it should probably work by extending the `__path__` of docutils or some sub-package of it.

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends. |
From: Beni C. <cb...@us...> - 2004-12-25 19:49:00
|
Beni Cherniavsky wrote:
> David Goodger wrote:
>
>> [Beni Cherniavsky]
>>
>> > 2. The current implementation of the ``role::`` directive is broken:
>> >    the new role is registered globally and lives until the python
>> >    process exits.  There is code in DocutilsTestSupport.py that
>> >    masks this bug by flushing `roles._roles` before every document.
>> >    This is not done by real code, only by testing (so
>> >    e.g. buildhtml.py has a problem) and is a dirty approach anyway.
>> >    And the same problem seems to happen with translations.
>>
>> What do you mean about translations?
>>
> `roles.role()` caches the resolved role as _role_registry[role_name]
> where role_name can be a translated name.  Thus after processing a
> document in language X, you can use any X role name that appeared there
> in subsequent documents even if you process them with another
> language-code.  See attached demonstration for both bugs.  I'll add a
> test for the translations bugs when I commit a fix.

Forgot the attachment. Here is the demonstration:

    [beni@gurthang ROLES-BUG]$ find */ -type f -not -name '*html' | xargs head
    ==> 0-undefined/custom.txt <==
    foo :custom:`bar`.

    ==> 0-undefined/non-es.txt <==
    :enfasis:`text with emphasis`

    ==> 1-defines/defines-custom.txt <==
    .. role:: custom(emphasis)

    foo :custom:`bar`.

    ==> 1-defines/es.txt <==
    :enfasis:`texto con enfasis`

    ==> 1-defines/docutils.conf <==
    [general]
    language-code: es

    ==> 2-wrongly-defined/custom.txt <==
    foo :custom:`bar`.

    ==> 2-wrongly-defined/non-es.txt <==
    :enfasis:`text with emphasis`

    [beni@gurthang ROLES-BUG]$ buildhtml.py 0-undefined/ 1-defines/ 2-wrongly-defined/
    /// Processing directory: 0-undefined/
        ::: Processing: custom.txt
    0-undefined/custom.txt:1: (ERROR/3) Unknown interpreted text role "custom".
        ::: Processing: non-es.txt
    0-undefined/non-es.txt:1: (ERROR/3) Unknown interpreted text role "enfasis".
    /// Processing directory: 1-defines/
        ::: Processing: defines-custom.txt
        ::: Processing: es.txt
    /// Processing directory: 2-wrongly-defined/
        ::: Processing: custom.txt
        ::: Processing: non-es.txt

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends. |
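The translation leak shown in the transcript can be modelled in a few lines. This is a toy model of the behaviour described in the message, not the code of `docutils.parsers.rst.roles`: the `resolve` function, the two lookup tables, and the string standing in for a role object are all invented for the illustration.

```python
CANONICAL_ROLES = {"emphasis": "emphasis-role"}
TRANSLATIONS = {"en": {}, "es": {"enfasis": "emphasis"}}


def resolve(role_name, language, cache):
    """Look a role up, caching the result under the name used in the document."""
    if role_name in cache:
        return cache[role_name]
    canonical_name = TRANSLATIONS[language].get(role_name, role_name)
    role = CANONICAL_ROLES.get(canonical_name)
    if role is not None:
        cache[role_name] = role  # stored under the *translated* name
    return role


# Buggy arrangement: one cache shared by every document in the process.
shared_cache = {}
before = resolve("enfasis", "en", shared_cache)  # unknown in an English document
resolve("enfasis", "es", shared_cache)           # a Spanish document is processed
after = resolve("enfasis", "en", shared_cache)   # now wrongly known in English

# Per-document arrangement: a fresh cache for each document.
isolated = resolve("enfasis", "en", {})

print(before, after, isolated)
```

The `before`/`after` pair corresponds to the `0-undefined` and `2-wrongly-defined` directories in the transcript: the same input is rejected first and silently accepted once an intervening document has populated the shared cache.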
From: Felix W. <Fel...@gm...> - 2004-12-25 22:23:37
|
Beni Cherniavsky wrote:

> David Goodger wrote:
>>
>> Beni Cherniavsky wrote:
>>
>>> Plugins should be selected by the user, not the document (for
>>> security reasons).  It's best if they can be installed just by
>>> placing files somewhere;

+1.

>> +1.  The location could be a runtime setting (set by config file or
>> command line option) with a reasonable default (what should that
>> be?).  An environment variable too (DOCUTILS_PLUGINS_PATH)?

Not sure if we need an environment variable. It's always possible to set a plugin path in ~/.docutils.

> I'm not sure that getting the location from a config file is wise from
> a security point of view.  AFAIK there are currently no options
> settable from the config file that could cause anything more severe
> than reading a file and including it in the document (e.g. inlined
> stylesheets).

You can include arbitrary files from a reST document, which is bad when using reST in a wiki. And you can ask Docutils to download a current set of Debian ISO images by using the raw directive's :url: option, which is bad when you have a fast internet connection and pay for traffic.

> So processing foreign files with foreign config files is pretty safe.

Not really.

> An extension path config setting would allow a config file to execute
> any Python file.

... on the local system.

And this possible threat could be mentioned in the announcement mail of the release introducing plugins. What you have to do to protect yourself is just glancing over the config file before running Docutils.

Furthermore, it might be possible to create a one-way option like --disable-plugins, which cannot be overridden and completely disables importing of plugins.

> An environment variable would be OK.

But it's not as easy.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/ |
From: Beni C. <cb...@us...> - 2004-12-25 23:21:34
|
Felix Wiemann wrote:
> Beni Cherniavsky wrote:
>>I'm not sure that getting the location from a config file is wise from
>>a security point of view.  AFAIK there are currently no options
>>settable from the config file that could cause anything more severe
>>than reading a file and including it in the document (e.g. inlined
>>stylesheets).
>
> You can include arbitrary files from a reST document, which is bad when
> using reST in a wiki.

True. Anybody running a reST wiki should take it into account and protect himself. But it's harmless as long as you are processing a document for your own reading. (As long as you don't have a filesystem that launches a nuke as soon as you read from /dev/nuke; most people don't - AFAIK /proc has nothing dangerous and /dev is also quite safe when non-root.)

> And you can ask Docutils to download a current
> set of Debian ISO images by using the raw directive's :url: option,
> which is bad when you have a fast internet connection and pay for
> traffic.

Yes. But downloading Debian images is at best a denial-of-service attack. One could also use this to explore the intranet of a wiki server.

>>So processing foreign files with foreign config files is pretty safe.
>
> Not really.

See above. I meant processing for your own reading.

>>An extension path config setting would allow a config file to execute
>>any Python file.
>
> ... on the local system.

If you download an archive with documents and a config file, it might also contain a Python file with whatever malicious code. Of course, we are smart, so we would *never* run code without auditing it first ;-) but we wouldn't expect that running buildhtml.py to read the documentation first already runs that code without asking :-(.

> And this possible threat could be mentioned in the announcement mail of
> the release introducing plugins.  What you have to do to protect
> yourself is just glancing over the config file before running Docutils.

That's bad because it's easy to use a config file accidentally. Docutils picks up ./docutils.conf and buildhtml.py picks it up in all levels of the directory hierarchy.

> Furthermore, it might be possible to create a one-way option like
> --disable-plugins, which cannot be overridden and completely disables
> importing of plugins.

Not a solution: people *want* plugins. You run it without plugins, it doesn't work => all right, you run it with plugins :-(.

>>An environment variable would be OK.
>
> But it's not as easy.

Sure it is. I'll implement it :-). Or do you mean not as easy to use? Just have a sensible default. E.g. ``~/.docutils-plugins:<docutils_package_dir>/plugins``.

--
Beni Cherniavsky <cb...@us...>, who can only read email on weekends. |
From: Felix W. <Fel...@gm...> - 2004-12-26 13:28:13
|
Beni Cherniavsky wrote:

> Felix Wiemann wrote:
>
>> And you can ask Docutils to download a current set of Debian ISO
>> images by using the raw directive's :url: option, which is bad when
>> you have a fast internet connection and pay for traffic.
>
> Yes.  But downloading debian images is at best a denial-of-service
> attack.

Well, it costs about 5 EUR per image (depending on your ISP's prices it might be more or less, of course). And I can certainly construct some circumstances where you wouldn't notice that some huge files are being downloaded... :-)

> If you download an archive with documents and a config file, it might
> also contain a Python file with whatever malicious code.  [...] it's
> easy to use a config file accidentally.  Docutils picks up
> ./docutils.conf and buildhtml.py picks it up in all levels of the
> directory hierarchy.
>
> Just have a sensible default [for the environment variable].
> E.g. ``~/.docutils-plugins:<docutils_package_dir>/plugins``.

OK, then let's handle it using an environment variable only.

But I don't think ~/.docutils-plugins is a good choice, because it clutters the home directory.

I'd rather like to see a directory ~/.docutils/ which contains the config file (~/.docutils/config) and the plugin directory (~/.docutils/plugins/).

(If Docutils encounters a config file at ~/.docutils instead of ~/.docutils/config, it could warn and say that it has moved to ~/.docutils/config.)

After all, this is the way it's most commonly handled, AFAICS. I find quite some files matching "~/.*/config" and "~/.*/*plug*" on my system. And it seems the cleanest approach, because maybe we want to add more files later (e.g. stylesheets or templates).

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/ |
From: David G. <go...@py...> - 2004-12-31 17:32:39
|
[Beni Cherniavsky]
>> Just have a sensible default [for the environment variable].
>> E.g. ``~/.docutils-plugins:<docutils_package_dir>/plugins``.

[Felix Wiemann]
> OK, then let's handle it using an environment variable only.

I think we should support a command-line option in addition to an environment variable, but not a config file setting. The command-line option takes priority over the envar.

> But I don't think ~/.docutils-plugins is a good choice, because it
> clutters the home directory.
>
> I'd rather like to see a directory ~/.docutils/ which contains the
> config file (~/.docutils/config) and the plugin directory
> (~/.docutils/plugins/).

+1

> (If Docutils encounters a config file at ~/.docutils instead of
> ~/.docutils/config, it could warn and say that it has moved to
> ~/.docutils/config.)

-1. Let's keep backward compatibility. Just stat ~/.docutils. If it's a file, consider it a config file. If it's a directory, look for ~/.docutils/config and ~/.docutils/plugins/ etc.

--
David Goodger <http://python.net/~goodger> |
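The lookup rules proposed in this message can be sketched as below. The function names `plugin_search_path` and `user_locations` and the `DEFAULT_PLUGIN_PATH` constant are hypothetical, no command-line option name is assumed (its value is simply passed in), and only the DOCUTILS_PLUGINS_PATH variable and the ~/.docutils layout come from the thread.

```python
import os

# Assumed default, following the ~/.docutils/plugins/ layout suggested above.
DEFAULT_PLUGIN_PATH = os.path.join("~", ".docutils", "plugins")


def plugin_search_path(cli_value=None, environ=None):
    """Command-line value beats the environment variable, which beats the default.

    Config files are deliberately not consulted, for the security reasons
    raised earlier in the thread.
    """
    environ = os.environ if environ is None else environ
    raw = cli_value or environ.get("DOCUTILS_PLUGINS_PATH") or DEFAULT_PLUGIN_PATH
    return [os.path.expanduser(part) for part in raw.split(os.pathsep) if part]


def user_locations(dot_docutils):
    """Return (config file, plugin directory) for a given ~/.docutils path."""
    if os.path.isfile(dot_docutils):
        # Backward compatible: a plain file is the config file itself.
        return dot_docutils, None
    if os.path.isdir(dot_docutils):
        return (os.path.join(dot_docutils, "config"),
                os.path.join(dot_docutils, "plugins"))
    return None, None


print(plugin_search_path("a" + os.pathsep + "b", {"DOCUTILS_PLUGINS_PATH": "c"}))
print(plugin_search_path(None, {"DOCUTILS_PLUGINS_PATH": "c"}))
```

Passing the environment in as a mapping keeps the precedence rule easy to test without touching the real process environment; `user_locations(os.path.expanduser("~/.docutils"))` would be the call for an actual user.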
From: Aahz <aa...@py...> - 2004-12-26 04:55:12
|
[I've been skimming the docutils list lately due to lack of time, so my apologies if my point has been addressed already]

On Thu, Dec 23, 2004, David Goodger wrote:
> [Beni Cherniavsky]
>>
>> 3. The current state of affairs for sandbox implementations of
>>    directives and roles is that one must wrap them in a special
>>    writer.  That's very problematic because if one wants to use two
>>    sandbox extensions at the same time, he needs to write a special
>>    writer.  We need some kind of plugins mechanism.
>
> +1

I'm not sure that Beni is correct. I've reviewed my implementation of a MIF writer for my book, and it does not appear that a custom writer was needed for custom directives and roles -- provided that the roles and directives do not generate new types of nodes. I'll agree that it's *awkward* without a plugin mechanism.

--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming, is not worth knowing." --Alan Perlis |
From: Felix W. <Fel...@gm...> - 2005-01-01 13:15:28
|
David Goodger wrote:

> Beni Cherniavsky wrote:
>
>> Felix Wiemann wrote:
>>
>>> Just have a sensible default [for the environment variable].
>>> E.g. ``~/.docutils-plugins:<docutils_package_dir>/plugins``.
>>
>> OK, then let's handle it using an environment variable only.
>
> I think we should support a command-line option in addition to an
> environment variable, but not a config file setting.

Generally that'd be OK with me, but are you sure that this is not too difficult to implement (elegantly)? Might not be worth the effort, if it requires much code.

>> (If Docutils encounters a config file at ~/.docutils instead of
>> ~/.docutils/config, it could warn and say that it has moved to
>> ~/.docutils/config.)
>
> -1.  Let's keep backward compatibility.  Just stat ~/.docutils.  If
> it's a file, consider it a config file.  If it's a directory, look for
> ~/.docutils/config and ~/.docutils/plugins/ etc.

Yes, that's good.

--
When replying to my email address, please ensure that the mail header contains 'Felix Wiemann'. http://www.ososo.de/ |