From: Ronny P. <Ron...@gm...> - 2007-05-30 10:18:17
|
Hi,

since a format-specific directive is not an option for a highlighter in
docutils, I decided to try writing a pygments formatter that is able to
create a doctree.

Unfortunately I have no idea how I should generate it.

So, any hints?

greets
Ronny Pfannschmidt
|
From: G. M. <mi...@us...> - 2007-05-30 11:01:30
Attachments:
pygments_with_docutils.py
|
On 30.05.07, Ronny Pfannschmidt wrote:

> since a format-specific directive is no option for a highlighter in
> docutils, I decided to try writing a pygments-formatter thats able to
> create a doctree.

Felix did post a "draft" of this approach (attached). However,

* it makes the doctree very verbose (try running the draft to see this),
* html-output does not work with the stylesheets produced by pygments
  (e.g. with the `-S` option to pygmentize),
* latex-output seems even less usable.

My favourite solution of the syntax-highlight problem is to use classes to
mark a "literal-block" doctree element as source code in a specific
language (Proposal2 in
http://pylit.berlios.de/features/syntax-highlight.html).

Proposal: literal block with configurable classes
""""""""""""""""""""""""""""""""""""""""""""""""""

* Enhance writers to support syntax highlight for "literal-block" doctree
  elements with ``classes="code-block LANGUAGE"``.

The standard html and latex writers could use `pygments`_ as a plug-in: if
the "pygments" module is found, code-blocks will get syntax highlight, if
not, the standard "literal-block" rendering is used.

Alternatively, separate writers and back-end scripts for
syntax-highlighting conversion could be added. A LaTeX writer could also
set the block in an "lstlisting" environment (either as option of the
"standard syntax highlighting latex writer" or as "latex writer with
listing support").

The "class" directive can already be used to specify classes of a literal
block, e.g.::

  colourful Python code

  .. class:: code-block python

  ::

    def hello():
        print "hello world"

which means that a highlight-enhanced writer that works with backwards
compatible documents is possible without changes to the reStructured text
syntax.

Motivation
''''''''''

Classes are used in the `docutils document tree`_ to carry non-vital
information from reader to writer.

* "code-block" and "LANGUAGE" classes provide non-essential information
  about an "is a" relationship that the writer can use for improved output.

* If a writer supports the "code-block" class and the specified language,
  it can parse the literal block's content and highlight it accordingly.

* The fall-back is to render the block in a monospaced font with
  whitespace preserved (as currently done by all writers).

This solves the problem of compatibility with "non-enhanced" writers that
do not provide syntax highlight which arises when a "sourcecode" or
"code-block" directive is used.

.. _`Docutils Document Tree`:
   http://docutils.sf.net/docs/ref/doctree.html#classes

I'd like to implement a writer that supports a "code-block" class, however
I still struggle with the right way to subclass a standard writer and
provide a back-end for the "enhanced" writer (pointers to documentation
welcome).

Guenter
|
From: David G. <go...@py...> - 2007-05-30 13:59:29
|
On 5/30/07, G. Milde <mi...@us...> wrote: > Felix did post a "draft" of this approach (attached). However, This approach is good. > * it makes the doctree very verbose (try runing the draft to see > this), So what? > * html-output does not work with the stylesheets produced by > pygments (e.g. with the `-S` option to pygmentize), > * latex-output seems even less usable. Those are implementation details. The approach is correct. > My favourite solution of the syntax-highligt problem is to use classes to > mark a "literal-block" doctree element as source code in a specific > language (Proposal2 in > http://pylit.berlios.de/features/syntax-highlight.html). > > Proposal: literal block with configurable classes > """"""""""""""""""""""""""""""""""""""""""""""""""" > > * Enhance writers to support syntax highlight for "literal-block" doctree > elements with ``classes="code-block LANGUAGE"``. > > The standard html and latex writers could use `pygments`_ as a plug-in: if > the "pygments" module is found, code-blocks will get syntax highlight, if > not, the standard "literal-block" rendering is used. I strongly oppose this approach. Writers should not be responsible for parsing, in any form. -- David Goodger <http://python.net/~goodger> |
From: G. M. <mi...@us...> - 2007-05-30 15:51:47
|
On 30.05.07, David Goodger wrote: > On 5/30/07, G. Milde <mi...@us...> wrote: > > Felix did post a "draft" of this approach (attached). > This approach is good. My alternative proposal was > > * Enhance writers to support syntax highlight for "literal-block" doctree > > elements with ``classes="code-block LANGUAGE"``. > > ... > I strongly oppose this approach. Writers should not be responsible for > parsing, in any form. Thanks David for your clear decision. Once it is clear that source-code parsing for syntax highlight should not happen inside a writer, the question remains whether it should be implemented * in the parser, or * as a transformation of "literal_block" nodes with "sourcecode" (or "code-block") in the classes list. Is there a way to do this as "plug-in" (Analog to the examples in http://docutils.sf.net/docs/howto/rst-directives.html)? Guenter |
From: David G. <go...@py...> - 2007-05-30 19:59:48
|
On 5/30/07, G. Milde <mi...@us...> wrote: > Once it is clear that source-code parsing for syntax highlight > should not happen inside a writer, the question remains whether it > should be implemented > > * in the parser, or > > * as a transformation of "literal_block" nodes with "sourcecode" (or > "code-block") in the classes list. I think the syntax highlighting should be done by a directive, while being parsed by the parser. There's no need to delay processing (by a transform). The syntax should be something like this: .. code-block:: python while indented: print "Python code here" > Is there a way to do this as "plug-in" (Analog to the examples in > http://docutils.sf.net/docs/howto/rst-directives.html)? I don't know what you mean by this. -- David Goodger <http://python.net/~goodger> |
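The parse-time approach described above can be illustrated without docutils or Pygments. The following is only a rough sketch under stated assumptions: Python's standard `tokenize` module stands in for a real lexer, and plain `(class, text)` tuples stand in for the inline nodes that a directive would nest inside a `literal_block`. The function name `classify_tokens` and the class names are hypothetical, not part of any proposal in this thread.

```python
import io
import keyword
import tokenize

# Token types that carry no visible text of their own and are skipped.
_SKIPPED = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER}

# Map stdlib token types to illustrative class names (hypothetical names).
_CLASSES = {
    tokenize.STRING: "string",
    tokenize.NUMBER: "number",
    tokenize.COMMENT: "comment",
    tokenize.OP: "operator",
}


def classify_tokens(source):
    """Return a list of (class_name, text) pairs for Python source."""
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME:
            # Distinguish reserved words from ordinary identifiers.
            cls = "keyword" if keyword.iskeyword(tok.string) else "name"
        else:
            cls = _CLASSES.get(tok.type, "other")
        pairs.append((cls, tok.string))
    return pairs


if __name__ == "__main__":
    code = 'while indented:\n    print("Python code here")\n'
    for cls, text in classify_tokens(code):
        print(cls, repr(text))
```

The point of the design is that the classification happens once, while the input is parsed, so every writer receives the same classified tokens and only has to decide how to style them.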
From: G. M. <mi...@us...> - 2007-06-01 14:07:24
|
On 30.05.07, David Goodger wrote:
> On 5/30/07, G. Milde <mi...@us...> wrote:

> I think the syntax highlighting should be done by a directive, while
> being parsed by the parser. There's no need to delay processing (by a
> transform). The syntax should be something like this:

>     .. code-block:: python

>         while indented:
>             print "Python code here"

Looks good to me. Clean, clear, and compatible with many existing
extensions.

> > Is there a way to do this as "plug-in" (Analog to the examples in
> > http://docutils.sf.net/docs/howto/rst-directives.html)?

> I don't know what you mean by this.

A working example of the "code-block" directive is implemented in
`rst2html-pygments`_ following a recipe in
http://pygments.org/docs/rstdirective/ which seems to be based on the howto
in http://docutils.sf.net/docs/howto/rst-directives.html.

It uses the function::

  directives.register_directive('code-block', pygments_directive)

to implement the directive where the `pygments_directive` function is,
however, writer-specific. If I understand it right, it injects "raw"
content targeted at a specific writer into the document tree.

Would it be possible to write a front-end script analogous to
rst2html-pygments_ that implements "the right way" of source code parsing
and uses pygments to produce a "rich" literal-block doctree element?

In other words, how could I "plug-in" the code from the
`proof of concept`_ into docutils so that it will process an input file and
not just a sample string?

Thanks

Guenter

.. _rst2html-pygments: http://pylit.berlios.de/features/rst2html-pygments
.. _proof of concept: http://pylit.berlios.de/features/pygments_with_docutils.py
|
From: David G. <go...@py...> - 2007-06-01 15:24:28
|
On 6/1/07, G. Milde <mi...@us...> wrote: > Would it be possible to write a front-end script analog to > rst2html-pygments_ that implements "the right way" of > sourcecode parsing and uses pygments to > produces a "rich" literal-block doctree element. Yes. > In other words, how could I "plug-in" the code from the > `proof of concept`_ into docutils so that it will process > an input file and not just a sample string? Exactly the same way. The only difference is that the new directive will return a "literal_block" node (the other directive returned a "raw" node). > .. _proof of concept: > http://pylit.berlios.de/features/pygments_with_docutils.py I get 403 Forbidden on that. -- David Goodger <http://python.net/~goodger> |
From: Alan G I. <ai...@am...> - 2007-06-01 15:42:10
|
> On 30.05.07, David Goodger wrote: >> I think the syntax highlighting should be done by >> a directive, while being parsed by the parser. There's >> no need to delay processing (by a transform). The syntax >> should be something like this: >> .. code-block:: python >> while indented: >> print "Python code here" On Fri, 1 Jun 2007, "G. Milde" apparently wrote: > Looks good to me. Clean, clear, and compatible with many > existing extensions. Two user requests as you go this direction: 1. support this for the ``include`` directive. E.g., :: .. include:: mycode.py :code-block: python 2. Allow docutils parsing for syntax highlighting to be turned off. (Yes, this is another attempt to get a literal block in a code-block element, but it seems like a very reasonable way given the project here, no?) Maybe something like .. code-block:: python :literal: but of course I care about the functionality, not the chosen semantics. The functionality would follow from writers knowing that they have a code block that has not been parsed for syntax by Docutils. Writers can then handle this as they wish. Additionally, one question. Will Docutils "choke" on unknown languages, or will it allow them but not parse for syntax? I hope the latter. Thank you, Alan Isaac PS Thanks much to Guenter and David for helping me better understand some of the issues. Also, in response to a note I got off-list, please note that "apparently wrote" is in no sense derogatory; it is a simple "post-modern" joke, if you will, turning on the nature of our knowledge of identity in such forums. |
From: David G. <go...@py...> - 2007-06-01 18:16:38
|
On 6/1/07, Alan G Isaac <ai...@am...> wrote: > Two user requests as you go this direction: > > 1. support this for the ``include`` directive. E.g., :: > > .. include:: mycode.py > :code-block: python It would be better for a "code-block" directive to have a "file" option (like the "raw" directive): .. code-block:: python :file: mycode.py > 2. Allow docutils parsing for syntax highlighting to be > turned off. No. There's no reason for it. > (Yes, this is another attempt to get a literal block > in a code-block element, There is no such thing as a code-block element, and there won't be. > but it seems like a very > reasonable way given the project here, no?) No. > The functionality would follow > from writers knowing that they have a code block that > has not been parsed for syntax by Docutils. Writers can > then handle this as they wish. Writers are part of Docutils. Your terminology is confused. > Additionally, one question. Will Docutils "choke" on > unknown languages, or will it allow them but not parse > for syntax? I hope the latter. I agree with Guenter: a warning or info system message should be generated, and an unparsed literal_block should result. > Also, in response to a note > I got off-list, please note that "apparently wrote" is in no > sense derogatory; it is a simple "post-modern" joke, if you > will, Then perhaps add a smiley. Implicit jokes don't work well in email. Small annoyances can easily become offensive. -- David Goodger <http://python.net/~goodger> |
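The fallback agreed on above (a warning or info message plus an unparsed literal block for unknown languages) might look roughly like the sketch below. It is an illustration only: the `KNOWN_LEXERS` registry, the placeholder whitespace lexer, the dict standing in for a `literal_block` node, and the function name `parse_code_block` are all assumptions made for this example, and Python's `warnings` module stands in for a Docutils system message.

```python
import warnings


def _placeholder_lexer(code):
    """Hypothetical stand-in lexer: label each whitespace-separated word."""
    return [("token", word) for word in code.split()]


# Hypothetical registry of available lexers; a real implementation would
# query a highlighting library instead.
KNOWN_LEXERS = {"python": _placeholder_lexer}


def parse_code_block(code, language):
    """Return a dict standing in for a literal_block node.

    Known languages yield classified tokens as children; unknown languages
    trigger a warning and yield the untouched source text as the only child.
    """
    lexer = KNOWN_LEXERS.get(language.lower())
    if lexer is None:
        warnings.warn("no lexer for language %r; block left unparsed"
                      % language)
        return {"element": "literal_block", "children": [code]}
    return {"element": "literal_block", "children": lexer(code)}


if __name__ == "__main__":
    print(parse_code_block("x = 1", "python"))
    print(parse_code_block("x = 1", "klingon"))
```

Either way the result is the same kind of element, so a document that names an unsupported language still renders as an ordinary literal block.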
From: Alan G I. <ai...@am...> - 2007-06-01 20:38:57
|
> On 6/1/07, Alan G Isaac <ai...@am...> wrote: >> .. include:: mycode.py >> :code-block: python On Fri, 1 Jun 2007, David Goodger wrote: > It would be better for a "code-block" directive to have a "file" > option (like the "raw" directive): > .. code-block:: python > :file: mycode.py You can currently ``include`` code fragments, which is very useful. Would this be added to a ``code-block`` directive as well? Cheers, Alan Isaac |
From: Lea W. <lew...@gm...> - 2007-06-03 02:32:07
|
Alan G Isaac wrote:

> You can currently ``include`` code fragments,
> which is very useful. Would this be added
> to a ``code-block`` directive as well?

You mean having a "start-after" and "end-before" option for the
code-block directive? I'm not quite convinced that adding such options
to directives that have a :file: option is a good idea, even though it's
certainly possible. It's an interesting use case though!

(It's probably also necessary to add an option to specify the encoding
of the external file. In fact, the csv-table directive already has such
an option. Option proliferation ahead!)

    .. code-block::
       :file: module.py
       :start-after: TAG
       :encoding: latin1

Crazy alternative (should be possible in principle though [but may
require some work]): nested options.

    .. code-block::
       :file: module.py
          :start-after: TAG
          :encoding: latin1

These are just random ideas... Other suggestions welcome!

// Lea
|
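What "start-after" and "end-before" options would have to do can be sketched in a few lines. This is a hedged illustration, not the behaviour of any existing directive: the function name `extract_fragment` and the choice to raise `ValueError` for a missing marker are assumptions made for this example.

```python
def extract_fragment(text, start_after=None, end_before=None):
    """Return the part of *text* between two optional marker strings.

    The fragment begins right after the first occurrence of *start_after*
    and ends right before the next occurrence of *end_before*.
    """
    if start_after is not None:
        index = text.find(start_after)
        if index < 0:
            raise ValueError("start-after marker %r not found" % start_after)
        text = text[index + len(start_after):]
    if end_before is not None:
        index = text.find(end_before)
        if index < 0:
            raise ValueError("end-before marker %r not found" % end_before)
        text = text[:index]
    return text


if __name__ == "__main__":
    module = "import os\n# TAG\ndef hello():\n    return 'hi'\n# END\n"
    print(extract_fragment(module, start_after="# TAG\n", end_before="# END"))
```

An encoding option would only affect how the external file is read before this step, for example by passing `encoding="latin1"` to Python's built-in `open`.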
From: Alan G I. <ai...@am...> - 2007-05-31 17:59:38
|
On Thu, 31 May 2007, David Goodger apparently wrote: > But that is incorrect. Syntax highlighting requires parsing. Parsers > do the parsing, not writers. The parser takes the input and turns it > into a document tree, and the writer turns the document tree into HTML > or LaTeX or whatever. Please understand that this common internal > data structure (the document tree), and this separation of > responsibilities, are essential to how Docutils works. This will not > change. I do not believe I am asking that writers do any parsing. I believe I am talking about the document tree. I will try to be clearer. 1. Set aside the issue of syntax highlighting. I meant this to be separate. I meant to ask: - might it be sensible to have a code-block body element? (a specific kind of literal block) - if so, can the writer have access to the language option? (This says nothing about what the writer might do with that information.) The example syntax:: .. include:: mycode.py :code-block: python was meant only to convey the need to let the writer know more about the nature of this body element than is currently possible. (As a very rough analogy, there is a generic admonition element, but there are also specific admonition elements.) 2. Returning briefly to syntax highlighting, this might be requested as a parsed code block:: .. include:: mycode.py :parsed-code-block: python But really I am only noting this to emphasize the distinction I am trying to draw. Cheers, Alan Isaac |
From: Alan G I. <ai...@am...> - 2007-05-31 17:12:00
|
On Wed, 30 May 2007, David Goodger apparently wrote:
> If all you want to do is pass a class through to the latex
> writer, just do:

>     .. class:: python-code-block-for-latex

>     ::

>         print "This is Python code;"
>         print "LaTeX, do your magic!"

It seems to me that code blocks are a common use case
and code always has a language, so there is a semantic
function for docutils to play. That is, ::

    .. include:: mycode.py
       :literal:
       :class: my-personal-class-name

is not equivalent to allowing (for example)::

    .. include:: mycode.py
       :code-block: python

since the whole point of the latter would be to provide
a consistent hook for writers.

Anyway, thanks for the feedback.

Cheers,
Alan Isaac
|
From: David G. <go...@py...> - 2007-05-31 17:29:26
|
On 5/31/07, Alan G Isaac <ai...@am...> wrote: > It seems to me that code blocks are a common use case > and code always has a language, so there is a semantic > function for docutils to play. That is, :: > > .. include:: mycode.py > :literal: > :class: my-personal-class-name > > is not equivalent to allowing (for example):: > > .. include:: mycode.py > :code-block: python That is correct. > since the whole point of the latter would be to provide > a consistent hook for writers. But that is incorrect. Syntax highlighting requires parsing. Parsers do the parsing, not writers. The parser takes the input and turns it into a document tree, and the writer turns the document tree into HTML or LaTeX or whatever. Please understand that this common internal data structure (the document tree), and this separation of responsibilities, are essential to how Docutils works. This will not change. -- David Goodger <http://python.net/~goodger> |
From: G. M. <mi...@us...> - 2007-06-01 11:13:25
|
On 31.05.07, David Goodger wrote:
> On 5/31/07, Alan G Isaac <ai...@am...> wrote:
> > ::
> >
> >   .. include:: mycode.py
> >      :literal:
> >      :class: listing python
> >
> > is not equivalent to allowing (for example)::
> >
> >   .. include:: mycode.py
> >      :code-block: python

> That is correct.

If I got it right, the first example corresponds to::

  .. class:: listing python

  ::

    # file: mycode.py
    print "hello world"

The parser will convert it to a "literal_block" doctree element without
parsing and attach the class info.

The current standard latex writers should ignore the class information.
However, it would be possible to create a latex writer that is `aware` of
the "listing" class and would use the class information to put the content
of the doctree node in a "lstlisting" environment with the language
argument set to "python".

The second example is a nice syntax proposal for inserting an external
file as code-block. It would correspond to::

  .. code-block:: python

    # file: mycode.py
    print "hello world"

The parser will parse it and convert it to a "literal_block" doctree
element with nested inline elements carrying class information about the
tokens.

The latex writer will be updated to handle "literal_block" doctree
elements with classified tokens. (The details of this should be discussed
in a separate thread.)

> > since the whole point of the latter would be to provide
> > a consistent hook for writers.

> But that is incorrect.

The first example provides a hook for the writer while the second example
(as well as the "code_block" directive) triggers parsing of the content by
the docutils parser.

Günter
|
From: Alan G I. <ai...@am...> - 2007-05-30 20:36:20
|
On Wed, 30 May 2007, David Goodger apparently wrote:
> .. code-block:: python

>     while indented:
>         print "Python code here"

Is a ``code-block`` directive like this destined soon for
reST? (It doesn't exist yet, right? Not documented anyway.)

Is that better than an explicit ``literal`` directive?
E.g., ::

    .. literal::
       :code-block: python

In either case, it would really be nice to be able to
specify enough information that e.g. a LaTeX writer could
use the ``listings`` package and specify the language,
which would automatically provide sophisticated (and
configurable) syntax highlighting.

Cheers,
Alan Isaac
|
From: David G. <go...@py...> - 2007-05-30 21:53:43
|
On 5/30/07, Alan G Isaac <ai...@am...> wrote: > Is a ``code-block`` directive like this destined soon for > reST? Hopefully, but don't know when. We're discussing its implementation. > (It doesn't exist yet, right? Not documented anyway.) Correct. > Is that better than an explicit ``literal`` directive? > E.g., :: > > .. literal:: > :code-block: python I don't know why you'd want that. It's more verbose, but what's the point? What would it do differently? > In either case, it would really be nice to be able to > specify enough information that e.g. a LaTeX writer could > use the ``listings`` package and specify the language, > which would automatically sophisticated (and configurable) > syntax highlighting. If there are output formats (like LaTeX) which already do syntax highlighting, it's conceivable that a code-block directive could allow it to do the work somehow. That would require that the directive to *know* what writer is being used though, which is not a good thing. We're talking about writer-independent syntax highlighting. The reason why there's no built-in code-block directive now, is because there's no writer-independent implementation. Please don't complicate the issue. -- David Goodger <http://python.net/~goodger> |
From: Alan G I. <ai...@am...> - 2007-05-31 03:15:44
|
On Wed, 30 May 2007, David Goodger apparently wrote: > Please don't complicate the issue. Despite your apparent suspicion, I have no such desire. It seems to me that there are two separate goals, both reasonable. 1. Supply writer independent syntax highlighting. 2. Facilitate writer dependent syntax highlighting, by signalling to the writer that a literal block is code of a certain type. As for the first case---writer independent syntax highlighting---I wonder if the Vim syntax files might prove a major resource? (They look pretty easy to parse and are thoughtfully constructed.) But my immediate need is for the second case: writer dependent syntax highlighting. I do not wish my expression of this need to complicate discussion of the former, but it does seem more readily achievable. As a user, I do not claim to understand docutils internals. It was my hope that a directive and option like :: .. literal:: :code-block: python would offer a natural way to communicate the 2nd case to writers. I imagined it being matched by a similar option for literal inclusions. Bottom line for case 2: Some writers, like LaTeX, can provide access to extensive and highly sophisticated syntax highlighting, and enabling reST users to access such functionality seems a worthy goal, right? Alan Isaac |
From: David G. <go...@py...> - 2007-05-31 03:45:08
Attachments:
signature.asc
|
[Alan G Isaac]
> It seems to me that there are two separate goals,
> both reasonable.
>
> 1. Supply writer independent syntax highlighting.
>
> 2. Facilitate writer dependent syntax highlighting,
>    by signalling to the writer that a literal block
>    is code of a certain type.

#1 has never been done properly. If it had been, we wouldn't be having this
conversation.

#2 has been done several times in different ways. Note that none are
included with Docutils.

> But my immediate need is for the second case: writer
> dependent syntax highlighting. I do not wish my expression
> of this need to complicate discussion of the former, but it
> does seem more readily achievable.

Maybe, if somebody cares to implement it.

> As a user, I do not claim to understand docutils internals.
> It was my hope that a directive and option like ::
>
>     .. literal::
>        :code-block: python
>
> would offer a natural way to communicate the 2nd case to
> writers. I imagined it being matched by a similar option
> for literal inclusions.

If all you want to do is pass a class through to the latex writer, just do:

    .. class:: python-code-block-for-latex

    ::

        print "This is Python code;"
        print "LaTeX, do your magic!"

The hard part is the other end.

> Bottom line for case 2:
> Some writers, like LaTeX, can provide access to
> extensive and highly sophisticated syntax highlighting,
> and enabling reST users to access such functionality seems
> a worthy goal, right?

Not to me. I'm not interested, sorry.

--
David Goodger <http://python.net/~goodger>
|
From: G. M. <mi...@us...> - 2007-06-01 10:33:54
|
On 30.05.07, Alan G Isaac wrote:
> On Wed, 30 May 2007, David Goodger wrote:
> > .. code-block:: python

> >     while indented:
> >         print "Python code here"

> Is a ``code-block`` directive like this destined soon for reST?

If I got David right, this is the reST syntax that will be used for
inclusion of source code that should be parsed and output with syntax
highlight by docutils.

> Is that better than an explicit ``literal`` directive?

Yes. It is shorter (and IMO nicer) and backwards compatible with the
majority of existing `highlighting extensions`_.

> (It doesn't exist yet, right? Not documented anyway.)

It does *not* exist in the core docutils but is already used by many
`highlighting extensions`_.

You can use it in your documents right away and translate it to HTML or
LaTeX with the `pygments enhanced docutils front ends`_. Once syntax
highlight support becomes a standard docutils feature, the unchanged
documents should be translatable with the standard front ends. However,
there might be the need for a different style-sheet than the one provided
at the link above.

cheers

Guenter

.. _highlighting extensions:
   http://pylit.berlios.de/features/syntax-highlight.html#existing-highlighting-additions-to-docutils
.. _pygments enhanced docutils front ends:
   http://pylit.berlios.de/features/syntax-highlight.html#pygments-enhanced-docutils-front-ends
|
From: David G. <go...@py...> - 2007-05-31 19:38:48
|
On 5/31/07, Alan G Isaac <ai...@am...> wrote: > 1. Set aside the issue of syntax highlighting. > I meant this to be separate. > > I meant to ask: > - might it be sensible to have a code-block body > element? (a specific kind of literal block) It might be convenient, but new elements need a lot of support code. We have to draw a line somewhere to limit the number of doctree elements. I don't see the value in having a separate code-block element. What's the practical difference between these? <python-code-block> <code_block language="python"> <literal_block classes="code-block python"> It's a matter of style, how many elements are available, and how much implementation work we want to put in. I'm comfortable with having only <literal-block>. > - if so, can the writer have access to the language > option? (This says nothing about what the > writer might do with that information.) Sounds a lot like you're trying to make an end-run around the separation of responsibilities in Docutils. > The example syntax:: > > .. include:: mycode.py > :code-block: python > > was meant only to convey the need to let the writer > know more about the nature of this body element than > is currently possible. Why does the writer need to know this, if it's not going to do any parsing? I think you may be approaching this from too much of a LaTeX-centric perspective. LaTeX is effectively a full-blown programming language, Turing-complete. Docutils is using LaTeX as a dumb back-end formatter, the same as HTML. If Docutils is to remain output-format-neutral, it *cannot* take full advantage of LaTeX's features. To do so would marginalize the other formats. > (As a very rough analogy, > there is a generic admonition element, but there > are also specific admonition elements.) That's not a great example; it may be a wart in reST. Perhaps there shouldn't have been any specific admonition elements, only <admonition class="danger"> etc. > 2. 
Returning briefly to syntax highlighting, this might > be requested as a parsed code block:: > > .. include:: mycode.py > :parsed-code-block: python > > But really I am only noting this to emphasize the > distinction I am trying to draw. What is a "parsed code block"? How is it different from a regular (unparsed?) code block? -- David Goodger <http://python.net/~goodger> |
From: Alan G I. <ai...@am...> - 2007-05-31 20:35:07
|
On Thu, 31 May 2007, David Goodger apparently wrote:
> What's the practical difference between these?
...
> <code_block language="python">
> <literal_block classes="code-block python">

I'd say the practical difference arises whenever semantics
are relevant. (See below.) The problem, unless
I misunderstand, is that such names of classes are
completely arbitrary and have no semantic content (in the
context of Docutils).

On Thu, 31 May 2007, David Goodger apparently wrote:
> Sounds a lot like you're trying to make an end-run around
> the separation of responsibilities in Docutils.

Isn't there a distinction between the following?

- keeping responsibilities separate *in* Docutils
- allowing (not requiring) some responsibilities to be
  passed outside of Docutils

>> The example syntax::
>>     .. include:: mycode.py
>>        :code-block: python
>> was meant only to convey the need to let the writer
>> know more about the nature of this body element than
>> is currently possible.

On Thu, 31 May 2007, David Goodger apparently wrote:
> Why does the writer need to know this, if it's not going to
> do any parsing?

Because semantic content can be relevant to how a literal
block is written. For example, when writing to LaTeX, it
makes sense for a code-block to be in a code listing
environment, ideally with the language stated, while this
does not make sense if the literal block is a data listing.
It is easy to imagine *consistently* using the same
information in a CSS style. I do not see where the
consistency is to come from without the semantic content.

On Thu, 31 May 2007, David Goodger apparently wrote:
> If Docutils is to remain output-format-neutral, it cannot
> take full advantage of LaTeX's features. To do so would
> marginalize the other formats.

I want to draw a distinction between having Docutils "take
full advantage" of LaTeX and having Docutils provide some
semantics that *any* writer might in principle exploit.

On Thu, 31 May 2007, David Goodger apparently wrote:
> What is a "parsed code block"? How is it different from
> a regular (unparsed?) code block?

I may misunderstand how things work, but my intent was to
use the phrase "parsed code block" for a code block destined
to be parsed by the Docutils parser for syntax highlighting
purposes, while a regular code-block would not be parsed in
this fashion. It was my assumption that however syntax
highlighting was implemented, not all literal blocks would
be subjected to the relevant parsing, so I was simply
pointing to that distinction. (I had in mind a very loose
analogy between a parsed-literal and a literal block.)

Again, thanks for all the feedback,
Alan
|
From: G. M. <mi...@us...> - 2007-06-01 12:07:13
|
On 31.05.07, Alan G Isaac wrote:
> On Thu, 31 May 2007, David Goodger apparently wrote:
> > What's the practical difference between these?
...
> > <code_block language="python">
> > <literal_block classes="code-block python">

> The problem, unless I misunderstand, is that such names of classes are
> completely arbitrary and have no semantic content (in the context of
> Docutils).

This is only partially correct: class names have no semantic content for
the docutils *parser* but might have semantic content for a subset or all
of docutils *writers*.

The practical difference between the above examples is that

* in the first case all docutils writers need to be updated to handle a
  <code_block> doctree element.

* in the second case, writers are "free" to ignore the class information
  (and should already be programmed to do so).

This is why I prefer the second implementation proposal.

> Isn't there a distinction between the following?
> - keeping responsibilities separate *in* Docutils
> - allowing (not requiring) some responsibilities to be
>   passed outside of Docutils

Unfortunately, you did not make this distinction clear.

> On 30.05.07, Alan G Isaac wrote:
> >
> > Bottom line for case 2:
> > Some writers, like LaTeX, can provide access to
> > extensive and highly sophisticated syntax highlighting

However, the parsing and syntax highlighting is in this case not done by
the *writer*, but by a LaTeX package that has nothing to do with docutils.

Enhancing the docutils latex writer to pass on class information to LaTeX
is a different issue from syntax-highlight support in docutils. It is
complicated by the fact that LaTeX does not have a concept of classes
similar to XML, HTML, and reST. (more on this in a separate thread).

> >> The example syntax::
> >>     .. include:: mycode.py
> >>        :code-block: python
> >> was meant only to convey the need to let the writer
> >> know more about the nature of this body element than
> >> is currently possible.

However, this is the wrong syntax. In order to let one writer know more
about a doctree element (without the need of all writers to be able to
handle this information), use the class attribute::

  .. include:: mycode.py
     :literal:
     :class: listing python

This way you ensure that

* the content of mycode.py is not parsed but taken literally,

* an "agilatex" writer could output the content in

    \begin{lstlisting}[language=Python]
    \end{lstlisting}

* standard writers continue to work with documents using this syntax.

> On Thu, 31 May 2007, David Goodger apparently wrote:
> > What is a "parsed code block"? How is it different from
> > a regular (unparsed?) code block?

> I may misunderstand how things work, but my intent was to
> use the phrase "parsed code block" for a code block destined
> to be parsed by the Docutils parser for syntax highlighting
> purposes, while a regular code-block would not be parsed in
> this fashion. It was my assumption that however syntax
> highlighting was implemented, not all literal blocks would
> be subjected to the relevant parasing, so I was simply
> pointing to that distinction. (I had in mind a very loose
> analogy between a parsed-literal and a literal block.)

You need to distinguish between the different stages of rst processing:

  reST source -- reader --> document tree -- writer --> output

There is some overlap of the element names in these stages but no
one-to-one relationship. My picture is something like:

  reST syntax element        document tree element  output element
                                                    (e.g. latex environment)
  -------------------------  ---------------------  ------------------------
  Literal-Block markup       literal-block          {verbatim}
  ``::``
  Parsed-Literal directive   literal-block          {alltt}
  ``.. parsed-literal::``
  Code-Block directive       literal-block          {alltt}
  ``.. code-block::``

Both ``.. parsed-literal::`` and ``.. code-block::`` directives will
trigger parsing of the content by the docutils reader (with different
parsing rules of course).

If you do not want the parsing in docutils, use Literal-Block markup, not
a Code-Block directive.

Guenter
|
From: Alan I. <ai...@am...> - 2007-06-01 13:51:21
|
On Fri, 1 Jun 2007, "G. Milde" wrote:
> This is only partially correct: class names have no semantic
> content for the docutils parser but might have semantic content for
> a subset or all of docutils writers.

But this really obscures the issue I'm discussing, it seems to me. Any
class name I introduce presumably has semantic content for me, and sure I
can create a new writer to recognize that semantic content, but this is
really beside my point. My point is that code blocks are a common use
case, that they have an associated language, and that it makes sense for
Docutils to embody these semantics (*rather* than just saying "oh well,
it's just a literal block, add a class if you wish").

> * in the first case all docutils writers need to be updated to handle a
>   <code_block> doctree element.

Surely writers are written to be robust to newly defined elements??
Otherwise, any time a new element is introduced, this will break all
Docutils writers. (By robust, I simply mean that there is a standard
practice for element substitution in such instances.)

> * in the second case, writers are "free" to ignore the class information
>   (and should already be programmed to do so).
> This is why I prefer the second implementation proposal.

This again raises my previous question. I also do not understand your
*implementation* proposal. Please elaborate. In what sense would Docutils
implement this proposal? It seems to me that you just mean that users of a
particular writer could settle on a conventional class name?

> Enhancing the docutils latex writer to pass on class
> information to LaTeX is a different issue from
> syntax-highlight support in docutils.

Absolutely. See my other messages. (And note the subject line.)

> It is complicated by the fact that LaTeX does not have a concept of
> classes similar to XML, HTML, and reST. (more on this in a separate
> thread).

I agree here too, in a very slightly qualified way.
The slight qualification is that classes can often correspond to an option
for an environment. Anyway, recalling that David has explicitly stated
that he wants Docutils not to favor a particular writer, I see this as a
complication (even from my "user's perspective").

> In order to let one writer know more about a doctree
> element (without the need of all writers to be able to
> handle this information), use the class attribute::

I think I responded to this proposal. Why are you talking about "one
writer"? I have raised the question of whether there is semantic content
here that *Docutils* could usefully recognize. (Here "usefully" means that
*all* writers could potentially exploit it.) I hope that David's
admonitions against attempting a LaTeX-centric approach to Docutils apply
equally against attempting an HTML-centric approach.

> If you do not want the parsing in docutils, use
> Literal-Block markup, not a Code-Block directive.

It seems that you are treating the code-block directive as extant, which
it is not (in Docutils). I was merely drawing a distinction between two
possible *future* directives, one that calls for parsing (for syntax
highlighting purposes) and one that does not. I have only been discussing
the latter. The names do not matter.

Cheers,
Alan Isaac
|
From: G. M. <mi...@us...> - 2007-06-01 15:30:32
|
On 1.06.07, Alan Isaac wrote:
> On Fri, 1 Jun 2007, "G. Milde" wrote:

> My point is that code blocks are a common use case, that
> they have an associated language, and that it makes sense
> for Docutils to embody these semantics

Agreed.

> (*rather* than just saying "oh well, it's just a literal block, add a
> class if you wish").

This was my recipe for bypassing the parsing by the docutils reader that
David proposed for the content of a "code-block" directive.

> I also do not understand your *implementation* proposal.
> Please elaborate.

After finding out that the `proof-of-concept`_ script stores the raw
content of a literal block in the document tree as a 'raw_source'
attribute, I could imagine a custom latex writer that undoes the parsing.
It would

* check "literal-block" doctree elements for "code-block" in the 'class'
  attribute values,

* if found

  + if necessary, replace the parsed content of the element with the
    content of its 'raw_source' attribute,

  + put the content in an lstlisting environment and set the language
    option to the first part of the 'class' attribute that matches a list
    of supported languages.

This algorithm would also work with the already existing syntax::

  .. class:: code-block python

  ::

    python_code()

> In what sense would Docutils implement this proposal?

I doubt that "Docutils" will implement a listings-aware latex writer. But
a user could provide such a writer as an add-on (analogous to the
unofficial Odtwriter). Maybe the algorithm could become an option of the
standard latex writer at some stage.

> It seems to me that you just mean that users of a particular writer
> could settle on a conventional class name?

I see this as a decision of the author of the particular writer. For HTML
output, even an average CSS stylesheet designer can assign a set of layout
rules to a class name and use this class name for semantic markup in reST.
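The selection step of the listings-aware writer outlined earlier in this
message could be sketched as follows. This is a minimal sketch under stated
assumptions, not an actual docutils writer: ``LiteralBlock`` is a
hypothetical stand-in for the doctree element, ``SUPPORTED_LANGUAGES`` is an
invented illustrative list, and returning ``None`` as the language when no
class matches is one possible choice that the description leaves open.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumption: an illustrative subset of languages the writer would accept.
SUPPORTED_LANGUAGES = ("c", "java", "python")


@dataclass
class LiteralBlock:
    """Hypothetical stand-in for a "literal-block" doctree element."""
    text: str                                  # content as left by the reader
    classes: List[str] = field(default_factory=list)
    raw_source: Optional[str] = None           # unparsed content, if stored


def listing_for(block: LiteralBlock) -> Optional[Tuple[Optional[str], str]]:
    """Return (language, content) for a code block, or None for other blocks."""
    # Step 1: only elements marked with the "code-block" class qualify.
    if "code-block" not in block.classes:
        return None
    # Step 2: undo the parsing by preferring the raw source when present.
    content = block.raw_source if block.raw_source is not None else block.text
    # Step 3: the first class naming a supported language selects the option.
    language = next(
        (name for name in block.classes if name in SUPPORTED_LANGUAGES), None
    )
    return language, content


example = LiteralBlock(
    "parsed content", ["code-block", "python"], raw_source="python_code()"
)
print(listing_for(example))
```

A writer would then place the returned content in an lstlisting environment
and pass the language as its option; elements for which the function
returns ``None`` keep their standard "literal-block" rendering.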
> > In order to let one writer know more about a doctree
> > element (without the need of all writers to be able to
> > handle this information), use the class attribute::

> Why are you talking about "one writer"?

Maybe I should have written "a writer" (both translate to "ein Schreiber"
in my native language).

> I have raised the question of whether there is semantic content here
> that *Docutils* could usefully recognize. (Here "usefully" means that
> *all* writers could potentially exploit it.)

The `proof-of-concept`_ script produces a <literal_block> document tree
element where

* the semantic content is preserved as the ``classes="code-block python"``
  attribute and

* the raw content of a code-block directive is preserved as the
  ``raw_source`` attribute.

All writers *could* potentially exploit it, although most writers will
more probably exploit the class info in the parsed content of the
<literal_block> element.

> > If you do not want the parsing in docutils, use
> > Literal-Block markup, not a Code-Block directive.

> It seems that you are treating the code-block directive as
> extant, which it is not (in Docutils). I was merely drawing
> a distinction between two possible *future* directives,
> one that calls for parsing (for syntax highlighting
> purposes) and one that does not. I have only been
> discussing the latter. The names do not matter.

After David's statement that parsing of a code-block should happen in the
docutils reader, I treat this as an established part of the specification
of a future ``.. code-block::`` directive.

I don't see the need for an additional non-parsing directive. And I find
the proposal of a non-parsing directive under the name "code-block"
confusing.

Guenter

.. _proof-of-concept: http://pylit.berlios.de/features/pygments_with_docutils.py

|