From: David G. <go...@py...> - 2008-05-14 17:59:13
On Wed, May 14, 2008 at 1:38 PM, Alan Isaac <ai...@am...> wrote:
> What I am hoping for is a ``math`` directive
> that would be added to docutils.
> The concern I take it is:
> how is a writer to handle unrecognized formats.

I don't think that a Writer should handle unrecognized formats. Recognizing formats and dealing with input is the job of a Parser or Reader. A Writer's job is to convert a standard doctree to the output format.

Yes, I realize that this is dealing with a distinct data format, so the model may break down. But this is something to keep in mind.

> Here is how I was conceiving it.
> A math directive would always pass the math
> literally to the writer, along with a format
> attribute. Writers would handle formats they
> recognize, and implement default behavior for
> formats they do not recognize.

Math processing should not be implemented in every Writer. It shouldn't be implemented in *any* Writer. It should be handled separately, in a Transform. Even if the output depends on the type of Writer, a Transform is the way to go. By the time the Writer sees it, the math should be just a blob to insert into the output stream.

> Would that be an acceptable way for docutils
> to grow a ``math`` directive?

Yes, modulo where the math conversion takes place: in a Transform, before the Writer starts.

--
David Goodger <http://python.net/~goodger>
From: Alan G I. <ai...@am...> - 2008-05-14 20:27:40
On Wed, 14 May 2008, David Goodger apparently wrote:
> Math processing should not be implemented in every Writer.
> It shouldn't be implemented in any Writer. It should be
> handled separately, in a Transform. Even if the output
> depends on the type of Writer, a Transform is the way to
> go. By the time the Writer sees it, the math should be
> just a blob to insert into the output stream.

OK, as a basic user, my primary interaction is with the publisher front ends, and I do not really understand nor mean to comment on things architectural.

If I understand you:

- you would consider adding to docutils a ``math`` directive with a ``format`` option (is that right?)

- this looks to require a default transform of literal inclusion (?) along with Writer specific transforms (e.g., for LaTeX and XHTML writers), all of which is compatible with the docutils architecture

- this would possibly allow Jens's work to move out of the sandbox and into docutils, so that for example his rst2latexmath front end could be ditched and users could just use, e.g., rst2latex (and similarly users wanting xhtml+mathml could just use rst2html)

If that is right, what is the way forward?

Thanks,
Alan
From: David G. <go...@py...> - 2008-05-14 20:42:51
On Wed, May 14, 2008 at 4:31 PM, Alan G Isaac <ai...@am...> wrote:
> OK, as a basic user, my primary interaction is with the
> publisher front ends, and I do not really understand nor
> mean to comment on things architectural.

Then why do you get into such detailed technical discussions? <0.5 wink>

> If I understand you:
>
> - you would consider adding to docutils a ``math`` directive
>   with a ``format`` option (is that right?)

"math" directive: yes. "format" option: only if it's optional. If the format of the math markup must be given, then it's not an option, but an argument (e.g. ".. math:: mathml" etc.).

> - this looks to require a default transform of literal
>   inclusion (?)

I have no idea what you mean by this.

> along with Writer specific transforms (e.g.,
> for LaTeX and XHTML writers), all of which is compatible
> with the docutils architecture

Yes, something like that.

> - this would possibly allow Jens's work to move out of the
>   sandbox and into docutils, so that for example his
>   rst2latexmath front end could be ditched and users could
>   just use, e.g., rst2latex (and similarly users wanting
>   xhtml+mathml could just use rst2html)
>
> If that is right, what is the way forward?

Implementation.

--
David Goodger <http://python.net/~goodger>
From: Alan G I. <ai...@am...> - 2008-05-14 22:28:22
> Alan wrote:
>> - this [math directive] looks to require a default
>> transform of literal inclusion (?)

On Wed, 14 May 2008, David Goodger wrote:
> I have no idea what you mean by this.

Hopefully the following is clearer. It seems that you are open to adding a ``math`` directive to docutils. If it is going to be part of docutils, then I assumed all writers would need to be able to use *some* transform of it. My thought was that the fall back behavior (for no specified format or for an unrecognized format) would be to treat the math block as a literal block.

> Alan wrote:
>> what is the way forward?

On Wed, 14 May 2008, David Goodger wrote:
> Implementation.

That sounds like an invitation. I am willing to help with testing and with bug fixes. Someone who knows the architecture should take the lead. Jeff? Jens?

Cheers,
Alan
From: Jeff A. <jef...@pr...> - 2008-05-15 01:55:40
Alan G Isaac wrote:
>> Alan wrote:
>>> - this [math directive] looks to require a default
>>> transform of literal inclusion (?)
>
> On Wed, 14 May 2008, David Goodger wrote:
>> I have no idea what you mean by this.
>
> Hopefully the following is clearer.
> It seems that you are open to adding a ``math`` directive to docutils.
> If it is going to be part of docutils, then I assumed all
> writers would need to be able to use *some* transform of it.
> My thought was that the fall back behavior (for no specified
> format or for an unrecognized format) would be to treat the
> math block as a literal block.

The idea that I am proposing is that the way to represent math in a math directive in docutils will follow the standard latex syntax.

The writer will be responsible for making sure that this block of math code is represented properly in the output format that it is writing.

Correct me if I am wrong, but the transform would be too early to rewrite the code for the equation.

Here are a few cases:

The latex writer:
I definitely want to be able to pass the straight latex code to the latex writer. If I transform away from this way to represent equations, I'll have to transform back when I write the raw latex code.

The html+png writer:
This writer would pass the raw latex code to latex, and convert and crop. It needs to have the native latex syntax to be able to do this.

The html+mathml writer:
Many tools to display math in other ways expect at least a dialect of latex code. The html writer that wants to display mathml will be responsible to attempt to convert or fake anything that itex doesn't quite do. An example of this would be the align environment. When the latex code is being converted to itex, and ultimately to mathml, it will be up to the writer to decide how to handle this align environment. It could ignore it, and treat it like a regular display block. It could use divs to fake an align-like effect when displaying.

The philosophy is that since latex syntax for math is so common, if something needs something else, it should be responsible for handling those special cases. That way we don't have to re-invent how to represent equations. Let's use the de-facto way to represent equations-- latex code. It will need to be escaped to fit in docutils' xml code blocks, but it's human-readable, human-writable. Almost every way to represent or display math can understand it.

I've not had as much time to be able to work on this as I'd have liked. I'll do a brain dump, and post a link to it here.

Cheers!

Jeff Anderson
From: David G. <go...@py...> - 2008-05-15 03:01:41
[Jeff Anderson - 2008-05-14 21:55]
> The idea that I am proposing is that the way to represent math in a math
> directive in docutils will follow the standard latex syntax.

That's fine, as long as all output formats (via Writers) are supported. .png is fine as a fallback.

> The writer will be responsible for making sure that this block of math
> code is represented properly in the output format that it is writing.

No. The Writer is responsible for converting a document tree into the output format. Writers are not responsible for processing math markup or any other input.

> Correct me if I am wrong, but the transform would be too early to
> rewrite the code for the equation.

No, a Transform is the correct place. The Writer is too late in the process. Here's why: there are multiple Writers, and each would need to implement the math processing. But this can and should be implemented once only, in a single Transform. Writers should not be concerned with the details of math processing; they simply convert a standard document tree to an output format. By implementing the math support in a Transform, we don't have to touch the Writer classes.

Note that a Transform can be associated with a specific Writer, or can query the Writer to see what formats it supports. Querying is probably the best choice here.

This is just a matter of adjusting your perspective a bit. It's not a big change. Your ideas are sound; just move them a bit earlier in the process.

> Here are a few cases:
>
> The latex writer:
> I definitely want to be able to pass the straight latex code to the
> latex writer. If I transform away from this way to represent equations,
> I'll have to transform back when I write the raw latex code.

There's no need to convert excessively. The Transform would query the Writer, see that LaTeX is preferred, and merely tweak the input into raw LaTeX (adding whatever wrapper is necessary) for direct insertion by the Writer. In other words, the contents of the "math" directive are converted into a block of LaTeX as if it came from the "raw" directive. The LaTeX Writer is not affected.

> The html+png writer:
> This writer would pass the raw latex code to latex, and convert and
> crop. It needs to have the native latex syntax to be able to do this.

Just rethink it a bit: the Transform would pass the raw LaTeX code to a converter, get a PNG image back, and insert that into the document tree. The HTML Writer is not affected.

> The html+mathml writer:
> Many tools to display math in other ways expect at least a dialect of
> latex code. The html writer that wants to display mathml will be
> responsible to attempt to convert or fake anything that itex doesn't
> quite do.

I don't know enough about LaTeX or iTeX to know what you mean here. This is just another case of conversion -- to "raw" HTML & MathML in this case.

> An example of this would be the align environment. When the
> latex code is being converted to itex, and ultimately to mathml, it will
> be up to the writer to decide how to handle this align environment. It
> could ignore it, and treat it like a regular display block. It could use
> divs to fake an align-like effect when displaying.

That sounds like the responsibility of a stylesheet.

> The philosophy is that since latex syntax for math is so common, ...
> Let's use the de-facto way to represent equations-- latex code. ...
> That way we don't have to re-invent how to represent equations.

I agree with all this completely.

> if something needs something else, it should be responsible for
> handling those special cases.

All I'm saying is that this part -- the conversion of the LaTeX math input into various types of output -- should be handled in a single place. It should not be distributed into every Writer class. Writers have no business dealing with the input/source. Writers don't do processing of input. Writers deal with a fully-processed document tree only.

With all this in mind, please read PEP 258 to try to understand the Docutils architecture. You'll see that a Transform is the correct place for math processing.

--
David Goodger <http://python.net/~goodger>
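
The "single place" argument in the message above can be sketched in plain Python. Everything here is hypothetical: `CONVERTERS`, `WRITER_MATH_FORMATS` and `transform_math` are illustration names, not Docutils APIs, and the LaTeX converter is a stand-in that adds no wrapper. The sketch only shows that supporting another Writer would mean adding a row of preferences rather than new conversion code.

```python
# Hypothetical sketch; none of these names are Docutils APIs.

def to_latex(source):
    """Stand-in converter: a real one would add whatever wrapper is necessary."""
    return source.strip()


# The one registry of math converters, keyed by math output format.
# Converters for "png" or "mathml" would be registered here as well.
CONVERTERS = {
    "latex": to_latex,
}

# What each Writer might report when queried, in order of preference
# (assumed values, for illustration only).
WRITER_MATH_FORMATS = {
    "latex": ["latex"],
    "html": ["mathml", "png"],
}


def transform_math(source, writer_name):
    """Return (format, blob) for the first format the Writer accepts that
    has a registered converter, or None if no converter is available."""
    for fmt in WRITER_MATH_FORMATS[writer_name]:
        converter = CONVERTERS.get(fmt)
        if converter is not None:
            return fmt, converter(source)
    return None
```

With only the LaTeX converter registered, `transform_math(source, "html")` returns `None`, which is the situation where a fallback would be needed.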
From: Alan G I. <ai...@am...> - 2008-05-15 03:48:39
> [Jeff Anderson - 2008-05-14 21:55]
>> The idea that I am proposing is that the way to represent math in a math
>> directive in docutils will follow the standard latex syntax.

On Wed, 14 May 2008, David Goodger apparently wrote:
> That's fine, as long as all output formats (via Writers) are supported. .png is
> fine as a fallback.

1. What if the user has not installed LaTeX and does not want MathML? Is putting the LaTeX in a literal block fine as a fallback? (Comment: this approach is pretty common in the HTML representation of journal articles.)

2. I will be perfectly happy with only LaTeX math support. The reason I proposed that the format be specified as an option is that my recollection of a conversation from a couple years ago was that the developers did not want to shut out the use of other formats. Here are two already available alternatives.

   - for HTML writers, an ASCIIMathML equation could just be passed unaltered in a DIV with class math. For correct display, the user must have ASCIIMathML.js of course. When this cannot be assumed, display as a literal block is a very good alternative, which could be handled by a stylesheet (ASCIIMathML is designed for readability.)

   - Somewhat related, simple ASCII representation of math is often adequate. Of course a literal block could be used in this case, but that is less informative than explicitly marking the text as math.

Again, support for LaTeX only works fine (!) for me; I never expect to write math any other way. But I felt I should recall the earlier conversation.

Cheers,
Alan

PS Jeff, since Itex != LaTeX, it is probably best to focus on specifying a supported subset of LaTeX. Right? Keep in mind that Jens's work means the LaTeX->MathML transform is already largely developed.
From: David G. <go...@py...> - 2008-05-15 13:04:09
[Alan G Isaac - 2008-05-14 23:51]
>> [Jeff Anderson - 2008-05-14 21:55]
>>> The idea that I am proposing is that the way to represent math in a math
>>> directive in docutils will follow the standard latex syntax.
>
> On Wed, 14 May 2008, David Goodger apparently wrote:
>> That's fine, as long as all output formats (via Writers) are supported. .png is
>> fine as a fallback.
>
> 1. What if the user has not installed LaTeX and does not
> want MathML?
> Is putting the LaTeX in a literal block fine as a fallback?

Yes, if that's the best that can be done. It would be useful to insert a comment into the output, like "unable to render math markup; required packages missing (see [URL])"

> 2. I will be perfectly happy with only LaTeX math support.
> The reason I proposed that the format be specified as an
> option is that my recollection of a conversation from
> a couple years ago was that the developers did not want to
> shut out the use of other formats. Here are two already
> available alternatives.

It's up to the implementer. Working code that's "good enough" is better than perfect vaporware. Since we have nothing in the core now, it's better to concentrate on one concrete implementation than to insist on an ideal generalized design that may never get done.

--
David Goodger <http://python.net/~goodger>
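
The literal-block fallback with an explanatory comment could look roughly like the following. This is a minimal sketch under stated assumptions: `render_math` and `MESSAGE` are hypothetical names, doctree nodes are modelled as plain `(node_type, text)` tuples rather than real Docutils nodes, and a missing external tool is assumed to surface as an `OSError`.

```python
# Hypothetical sketch; (node_type, text) tuples stand in for doctree nodes.

MESSAGE = "unable to render math markup; required packages missing"


def render_math(source, converter=None):
    """Try the converter; if there is none, or it fails because an external
    tool is missing, fall back to a comment plus a literal block that
    carries the untouched math input."""
    if converter is not None:
        try:
            return [("raw", converter(source))]
        except OSError:
            # Assumed failure mode: e.g. the LaTeX executable is not installed.
            pass
    return [("comment", MESSAGE), ("literal_block", source)]
```

The reader still sees the LaTeX source in the output, and the comment records why it was not rendered.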
From: G. M. <mi...@us...> - 2008-05-15 10:00:40
On 14.05.08, David Goodger wrote:
> On Wed, May 14, 2008 at 1:38 PM, Alan Isaac <ai...@am...> wrote:

> I don't think that a Writer should handle unrecognized formats.
> Recognizing formats and dealing with input is the job of a Parser or
> Reader. A Writer's job is to convert a standard doctree to the output
> format.

> Yes, I realize that this is dealing with a distinct data format, so
> the model may break down. But this is something to keep in mind.

We should care to differentiate between *input*, *internal*, and *output* format.

The basic question for inclusion of a math-directive is "How should the standard doctree store the content of a math node?" I. e. which *internal* format should docutils use.

The set of supported input and output formats can be extended later without change to the doctree specification.

IMO, the main candidates for the *internal* data format are:

LaTeX
  best graphical representation, relatively simple to type in directly,
  widely supported and established in the scientific community.

Math-ML
  modern data-exchange format, standardised, "the future",
  hard to type in by hand.

Unfortunately, conversion between them is not always loss-less, so it is desirable to keep input data that is in one of them in this format if the *output* data requires the same.

> Math processing [...] should be handled [...] in a Transform. Even if
> the output depends on the type of Writer, a Transform is the way to go.

IMO, the Transform should convert the *input* format into the *internal* format (the one required by the current writer or both), normalising the doctree.

Jens' latex-math provides the code for a LaTeX->MathML Transform, searching for a suitable MathML-LaTeX converter is the next important step.

> By the time the Writer sees it, the math should be just a blob to
> insert into the output stream.

In the most basic cases, yes. But generally a writer will convert the *internal* format of the standard doctree to the *output* format.

* the html+mathml writer just inserts the Math ML,

* the latex writer inserts LaTeX code.

However, some html writer variants (or options) would care for older browsers not understanding Math-ML

* "html+pngmath" would produce graphical representations of the formulae from the LaTeX data (a la latex2html),

* "html+htmlmath" would convert the Math-ML to a HTML+CSS substitution,

* "html-jsmath" would write HTML+java-script for the jsmath extension...

Other writers are feasible as well, e.g.

* a "unicode" writer could convert the math node content to a textual representation using the Unicode chars for math symbols.

  (Unicode defines "all possible" mathematical symbols, using a fixed-width font, even large symbols (spanning multiple lines) can be constructed.)

Günter
From: Aahz <aa...@py...> - 2008-05-15 15:08:29
On Thu, May 15, 2008, David Goodger wrote:
>
> I want the Transform to take care of these cases, not the Writers. The
> option will affect the Transform. The logical place for the variation is
> in the Transform (singular), NOT in the Writers (plural).
>
> Math output formats do not correspond one-to-one with document output
> formats. It's an M-to-N relationship.

Given that I know only what's been posted to this thread, I probably shouldn't speak up, but when has that ever stopped me? ;-)

It sounds like the "math" directive as currently envisioned should be called "math-latex" to allow for a later "math-ml" option -- I expect people will later likely want to paste MathML into reST documents.

--
Aahz (aa...@py...) <*> http://www.pythoncraft.com/

Help a hearing-impaired person: http://rule6.info/hearing.html
From: Alan G I. <ai...@am...> - 2008-05-15 15:30:58
On Thu, 15 May 2008, Aahz apparently wrote:
> It sounds like the "math" directive as currently envisioned should be
> called "math-latex" to allow for a later "math-ml" option -- I expect
> people will later likely want to paste MathML into reST documents.

I *very* much doubt it. (Have you ever played with MathML? Yech!) But in any case, that is the kind of thing I have suggested should be handled by a ``format`` option.

Implementation help would be great! I can offer to test and bug fix.

Cheers,
Alan Isaac
From: G. M. <mi...@us...> - 2008-05-16 06:34:18
On 15.05.08, David Goodger wrote:
> On Thu, May 15, 2008 at 11:54 AM, G. Milde wrote:

> >>> However, some html writer variants (or options) would care for older
> >>> browsers not understanding Math-ML
> > ...
> >> I want the Transform to take care of these cases, not the Writers. The
> >> option will affect the Transform.
> >
> > So a realisation of this concept could be:
> >
> > * Define a "math" directive (rst syntax ``.. math::``) and role (rst
> >   syntax :math:`2+3`)
> >
> > * The Parser stores the content in a pending "math" doctree node
> >   (without parsing the content).
> >
> > * The Transform converts the content
...
> The default would be as follows. Each Writer specifies a list of
> acceptable formats (initially for math, but it could be extended
> for other objects, like images). These would be in order of
> preference. For example, the HTML Writer might specify ["mathml",
> "png", "jpeg", "ascii"]. If the math output format is not specified
> explicitly (via the --math-output option), the Transform would
> query the Writer for its preferences, and choose the first that
> matches its capability (with a fallback default of a literal block
> containing the math input).

:literal: would explicitly ask for the literal block (or string, in case of a math role), while
:raw: would pass-through the content and insert like a raw node.

...
> I'd have a single --math-output option with arguments
> (auto-converted to lowercase; typing LaTeX is hard!).

> > Options can be set:
> >
> > + in the configuration file (generic or writer-dependent)
> >   with system defaults in the standard conf file.
> >
> > + from command line options
> >
> > * If the Transform cannot convert to the desired format, a warning is
> >   issued and the content is put in a literal block (eventually
> >   preceded by a helpful message).

The warning and message can be suppressed by explicitly asking for --math-output literal (or a supported format, of course).

> > * The writer inserts the content like a "raw" node content.

Should the Transform store the converted math in a raw node?

+ writers will do "the right thing" without changes,
- information is lost. This might matter if

  a) a transformed doctree is stored for later use,
  b) a writer wants to add e.g. some class info to math nodes.

> > Example
> >
> > ::
> >
> >   a) There is no arguing that :math:`1 + 1 = 2`.
> >
> >   b) However it is not clear whether
> >
> >      .. math::
> >
> >         0 * \infty = 3
> >
> > With the math-latex transformation, a) would become ``$1 + 1 = 2$``
> > and b) would become::
> >
> >   \[
> >   0 * \infty = 3
> >   \]
> >
> > (i.e. adding the math switches)

while math-output literal would leave the content unprocessed.

> > To prevent e.g. a Math-ML -> LaTeX -> Math-ML conversion,
> > an option to the "math" directive could specify the input format.

> If multiple math input formats are supported, such an option would be
> necessary. But no double-conversion would ever have to take place.
> Whatever the math input format is, that's what would be stored in the
> doctree for the Transform to process.

> > A ``.. math-input-format::`` directive could be used to set the input
> > format for the whole document
...
> from that point forward, until another such directive.

> > If no input format is specified, it could be guessed by the
> > Transform (e.g. telling MathML from LaTeX should be easy)

> Better to require that the format be specified. Explicit is better
> than implicit.

Then, math-input-format == "auto" could be used to explicitly ask for auto-determination by the Transform.

BTW: how would I specify the math-input-format in case of a role (i.e. for inline math)?

Would it help, if I started writing a specification (in the line of ref/rst/directives.txt)? Should this be a sandbox project (math-directive) or modify the main documentation?

Günter
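
The example quoted above (inline math wrapped in `$...$`, a math block wrapped in `\[ ... \]`, and a literal setting that leaves the content alone) can be written down as a small function. This is only a sketch: `render_latex_math` and its parameters are hypothetical names, and the `"literal"` value follows the `--math-output literal` suggestion in this message rather than any implemented option.

```python
# Hypothetical sketch of the "math switches" described in the example.

def render_latex_math(content, inline, math_output="latex"):
    """Return the text a LaTeX-targeting Transform might hand to the Writer.

    inline=True  corresponds to the :math: role,
    inline=False corresponds to the ``.. math::`` directive.
    """
    content = content.strip()
    if math_output == "literal":
        # Pass the math input through unprocessed.
        return content
    if inline:
        return "$" + content + "$"
    return "\\[\n" + content + "\n\\]"
```

For the two cases in the example, `render_latex_math("1 + 1 = 2", inline=True)` gives the `$`-delimited form and `render_latex_math(r"0 * \infty = 3", inline=False)` gives the display form.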
From: David G. <go...@py...> - 2008-05-15 13:04:10
[G. Milde - 2008-05-15 05:42]
> On 14.05.08, David Goodger wrote:
>> I don't think that a Writer should handle unrecognized formats.
>> Recognizing formats and dealing with input is the job of a Parser or
>> Reader. A Writer's job is to convert a standard doctree to the output
>> format.
>
>> Yes, I realize that this is dealing with a distinct data format, so
>> the model may break down. But this is something to keep in mind.
>
> We should care to differentiate between *input*, *internal*, and *output*
> format.
>
> The basic question for inclusion of a math-directive is "How should the
> standard doctree store the content of a math node?" I. e. which
> *internal* format should docutils use.

That depends on the stage of processing. The reST parser cannot deal with the math input, because it's not reST. The math directive can't deal with the math input, because it has no idea what the output format should be. The Writer should not deal with the math input, because that's not its job and it would cause a duplication of effort. A Transform is the right place to deal with the math input conversion, because it has all the necessary information.

The math input data should be stored in a "pending" node in the original input format. The Transform will convert that to something compatible with the target Writer. The internal format for math in the Docutils doctree is the input format (prior to the Transform running), and the output format (after the Transform runs).

I have no interest in making a generic math-doctree; it would double the size of the doctree spec. As far as Docutils is concerned, math is a blob. And there is no reason to do a double conversion; it's imperfect. Let's just use a standard format internally: the input format, LaTeX.

If you're still not convinced, let me hammer it home. There's one edge case that seals the deal: publishing to and from a doctree. A document can be processed to a pickled doctree, stored in a database, then later the doctree can be retrieved and processed into a concrete target format. In this case, the Transform should leave the "pending" node alone during the first run, since there *is* no target format yet.

> The set of supported input and output formats can be extended later
> without change to the doctree specification.

Yes.

> IMO, the main candidates for the *internal* data format are:
>
> LaTeX
>   best graphical representation, relatively simple to type in directly,
>   widely supported and established in the scientific community.
>
> Math-ML
>   modern data-exchange format, standardised, "the future",
>   hard to type in by hand.
>
> Unfortunately, conversion between them is not always loss-less, so it is
> desirable to keep input data that is in one of them in this format if
> the *output* data requires the same.

No, it's desirable to keep the doctree data in the input format (whatever that is) until it's converted into the target format.

> IMO, the Transform should convert the *input* format into the *internal*
> format (the one required by the current writer or both), normalising the
> doctree.

No, this adds too much complication to the Writers. Math markup is very specialized, and should be processed in one place only. That one place could be a whole module, or even a package, but it should not be distributed over multiple Writers. This is NOT the job of a Writer!

> Jens' latex-math provides the code for a LaTeX->MathML Transform,
> searching for a suitable MathML-LaTeX converter is the next important
> step.

No double conversion, please.

>> By the time the Writer sees it, the math should be just a blob to
>> insert into the output stream.
>
> In the most basic cases, yes. But generally a writer will convert the
> *internal* format of the standard doctree to the *output* format.

The internal format for math is the input format.

> * the html+mathml writer just inserts the Math ML,

There is no html+mathml Writer. There's an HTML Writer, that's all.

> * the latex writer inserts LaTeX code.
>
> However, some html writer variants (or options) would care for older
> browsers not understanding Math-ML
>
> * "html+pngmath" would produce graphical representations of the
>   formulae from the LaTeX data (a la latex2html),
>
> * "html+htmlmath" would convert the Math-ML to a HTML+CSS
>   substitution,
>
> * "html-jsmath" would write HTML+java-script for the jsmath extension...
>
> Other writers are feasible as well, e.g.
>
> * a "unicode" writer could convert the math node content to a textual
>   representation using the Unicode chars for math symbols.
>
>   (Unicode defines "all possible" mathematical symbols, using a
>   fixed-width font, even large symbols (spanning multiple lines) can be
>   constructed.)

I want the Transform to take care of these cases, not the Writers. The option will affect the Transform. The logical place for the variation is in the Transform (singular), NOT in the Writers (plural).

Math output formats do not correspond one-to-one with document output formats. It's an M-to-N relationship.

Again, this specialized processing is NOT the responsibility of Writers!

--
David Goodger <http://python.net/~goodger>
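
The "publishing to and from a doctree" edge case can be illustrated with a toy model. This is a sketch only: `PendingMath`, `RawMath` and `resolve` are hypothetical stand-ins, not the Docutils `pending` node or Transform classes, and the converter is passed in as a plain function.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for doctree nodes; not Docutils classes.


@dataclass
class PendingMath:
    source: str  # math kept in its input format (LaTeX)


@dataclass
class RawMath:
    format: str  # math output format chosen for the target Writer
    text: str    # converted blob, ready for insertion


def resolve(node, target_format, convert):
    """Convert a pending math node once a target format is known.

    With target_format=None (for example when the doctree is only being
    stored for later use) the node is returned untouched, so the original
    input survives until a concrete Writer is chosen.
    """
    if target_format is None:
        return node
    return RawMath(target_format, convert(node.source, target_format))
```

On the first run the node comes back unchanged; on a later run with a concrete target, the same node is converted exactly once, which avoids any double conversion.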
From: David G. <go...@py...> - 2008-05-16 13:20:23
[Günter replied to me off-list by accident. Here's my reply, quoting with permission.] On Thu, May 15, 2008 at 11:54 AM, G. Milde <mi...@us...> wrote: > On 15.05.08, David Goodger wrote: >> [G. Milde - 2008-05-15 05:42] >>> We should care to differentiate between *input*, *internal*, and *output* >>> format. > >>> The basic question is which internal format should docutils >>> use. > >> The math input data should be stored in a "pending" node in the original >> input format. The Transform will convert that to something compatible >> with the target Writer. The internal format for math in the Docutils >> doctree is the input format (prior to the Transform running), and the >> output format (after the Transform runs). > >> I have no interest in making a generic math-doctree; it would double the >> size of the doctree spec. As far as Docutils is concerned, math is a >> blob. And there is no reason to do a double conversion; it's imperfect. >> Let's just use a standard format internally: the input format, LaTeX. > > I see the light. This perspective makes everything so much easier. Great! > (But I would like to see other input formats too (at a later stage).) One challenge at a time. >>> However, some html writer variants (or options) would care for older >>> browsers not understanding Math-ML > ... >> I want the Transform to take care of these cases, not the Writers. The >> option will affect the Transform. > > So a realisation of this concept could be: > > * Define a "math" directive (rst syntax ``.. math::``) and role (rst > syntax :math:`2+3`) > > * The Parser stores the content in a pending "math" doctree node > (without parsing the content). > > * The Transform converts the content depending on configuration > settings (options), e.g. ``math-pass``, ``math-latex``, ``math-ml``, > ``math-png``, ... That's possible, but I'd only do it as an override. The default would be as follows. 
Each Writer specifies a list of acceptable formats (initially for math,
but it could be extended for other objects, like images). These would be
in order of preference. For example, the HTML Writer might specify
["mathml", "png", "jpeg", "ascii"]. If the math output format is not
specified explicitly (via the --math-output option), the Transform would
query the Writer for its preferences, and choose the first that matches
its capability (with a fallback default of a literal block containing
the math input).

> We would have to decide on a set of options (like above) or one keyword
> (say ``math-format``) with a set of string values ("", "LaTeX",
> "Math-ML", "PNG", ...)

I'd have a single --math-output option with arguments (auto-converted to
lowercase; typing LaTeX is hard!).

> Options can be set:
>
>   * in the configuration file (generic or writer-dependent)
>     with system defaults in the standard conf file.
>
>   * from command line options
>
> * If the Transform cannot convert to the desired format, a warning is
>   issued and the content is put in a literal block (possibly
>   preceded by a helpful message).
>
> * The writer inserts the content verbatim (but without wrapping it in a
>   verbatim container).

Yes, that works.

> Example
>
> ::
>
>   a) There is no arguing that :math:`1 + 1 = 2`.
>
>   b) However it is not clear whether
>
>      .. math::
>
>         0 * \infty = 3
>
> With the math-latex transformation, a) would become ``$1 + 1 = 2$``
> and b) would become::
>
>   \[
>   0 * \infty = 3
>   \]
>
> (i.e. adding the math switches) while math-pass would leave the content
> unprocessed.

Not sure what you mean by math-pass. Pass-through, and format as a
literal block / inline literal?

>> No double conversion, please.
>
> To prevent e.g. a Math-ML -> LaTeX -> Math-ML conversion,
> an option to the "math" directive could specify the input format.

If multiple math input formats are supported, such an option would be
necessary. But no double-conversion would ever have to take place.
Whatever the math input format is, that's what would be stored in the
doctree for the Transform to process.

> A ``.. math-input-format::`` directive could be used to set the input
> format for the whole document.

Sure. Or for the whole document, from that point forward, until another
such directive.

> If no input format is specified, it could be guessed by the
> Transform (e.g. telling MathML from LaTeX should be easy)

Better to require that the format be specified. Explicit is better than
implicit.

-- 
David Goodger <http://python.net/~goodger>
|
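The selection rule described in the message above (explicit --math-output
wins, otherwise the first Writer preference the Transform can produce,
otherwise a literal block) could be sketched roughly as follows. This is
only an illustration: ``choose_math_output``, its parameters and the
returned ``warn`` flag are hypothetical names, not Docutils API, and
warning on every fallback is an assumption drawn from the discussion.

```python
def choose_math_output(writer_preferences, supported, requested=None):
    """Pick a math output format (hypothetical helper, not Docutils API).

    writer_preferences: formats the Writer accepts, most preferred first.
    supported: formats the math Transform is able to produce.
    requested: value of a --math-output style option, or None if unset.

    Returns (format, warn). warn is True when the function had to fall
    back to "literal" without the user asking for it.
    """
    supported = {fmt.lower() for fmt in supported}

    if requested is not None:
        requested = requested.lower()  # "LaTeX" and "latex" are equivalent
        if requested == "literal" or requested in supported:
            return requested, False
        return "literal", True  # explicit request cannot be satisfied

    # No explicit option: take the Writer's first preference we can produce.
    for fmt in writer_preferences:
        if fmt.lower() in supported:
            return fmt.lower(), False

    return "literal", True  # nothing matched; fall back to a literal block


html_preferences = ["mathml", "png", "jpeg", "ascii"]
print(choose_math_output(html_preferences, {"png", "latex"}))
```

With the HTML preference list from the message and a Transform that can
only produce PNG and LaTeX, the sketch settles on "png".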
From: David G. <go...@py...> - 2008-05-16 14:38:42
Attachments:
signature.asc
|
[G. Milde - 2008-05-16 02:33]
> On 15.05.08, David Goodger wrote:
>> On Thu, May 15, 2008 at 11:54 AM, G. Milde wrote:
>
>>>>> However, some html writer variants (or options) would care for older
>>>>> browsers not understanding Math-ML
>>> ...
>>>> I want the Transform to take care of these cases, not the Writers. The
>>>> option will affect the Transform.
>>> So a realisation of this concept could be:
>>>
>>> * Define a "math" directive (rst syntax ``.. math::``) and role (rst
>>>   syntax :math:`2+3`)
>>>
>>> * The Parser stores the content in a pending "math" doctree node
>>>   (without parsing the content).
>>>
>>> * The Transform converts the content
> ...
>> The default would be as follows. Each Writer specifies a list of
>> acceptable formats (initially for math, but it could be extended
>> for other objects, like images). These would be in order of
>> preference. For example, the HTML Writer might specify ["mathml",
>> "png", "jpeg", "ascii"]. If the math output format is not specified
>> explicitly (via the --math-output option), the Transform would
>> query the Writer for its preferences, and choose the first that
>> matches its capability (with a fallback default of a literal block
>> containing the math input).
>
> :literal: would explicitly ask for the literal block (or string,
> in case of a math role), while
> :raw: would pass-through the content and insert like a raw node.

Why have "raw" math at all? We already have a "raw" directive. Wouldn't
they do exactly the same thing?

>> I'd have a single --math-output option with arguments
>> (auto-converted to lowercase; typing LaTeX is hard!).
>
>>> Options can be set:
>>>
>>> + in the configuration file (generic or writer-dependent)
>>>   with system defaults in the standard conf file.
>>>
>>> + from command line options
>>>
>>> * If the Transform cannot convert to the desired format, a warning is
>>>   issued and the content is put in a literal block (possibly
>>>   preceded by a helpful message).
>
> The warning and message can be suppressed by explicitly asking
> for --math-output literal (or a supported format, of course).

Yes.

>>> * The writer inserts the content
> like a "raw" node content.
>
>
> Should the Transform store the converted math in a raw node?
>
> + writers will do "the right thing" without changes,
>
> - information is lost. This might matter if
>
>   a) a transformed doctree is stored for later use,

Not a problem. A stored doctree will still have the "pending" node. When
processing to a doctree (using docutils.core.publish_doctree), the
"null" writer is used. The math Transform will not process the pending
node in this case, but leave it for a later processing run (when a
concrete Writer is specified).

>   b) a writer wants to add e.g. some class info to math nodes.

The Transform could add a "math" class to the raw node.

The alternative is to add a "math" node to Docutils. Problem there is
that there are many options for the math output: raw, image, literal
block, maybe others. I think a "math" class is sufficient.

>>> Example
>>>
>>> ::
>>>
>>>   a) There is no arguing that :math:`1 + 1 = 2`.
>>>
>>>   b) However it is not clear whether
>>>
>>>      .. math::
>>>
>>>         0 * \infty = 3
>>>
>>> With the math-latex transformation, a) would become ``$1 + 1 = 2$``
>>> and b) would become::
>>>
>>>   \[
>>>   0 * \infty = 3
>>>   \]
>>>
>>> (i.e. adding the math switches)
> while math-output literal would leave the content unprocessed.
>
>
>>> To prevent e.g. a Math-ML -> LaTeX -> Math-ML conversion,
>>> an option to the "math" directive could specify the input format.
>
>> If multiple math input formats are supported, such an option would be
>> necessary. But no double-conversion would ever have to take place.
>> Whatever the math input format is, that's what would be stored in the
>> doctree for the Transform to process.
>
>>> A ``.. math-input-format::`` directive could be used to set the input
>>> format for the whole document
> ...
>> from that point forward, until another such directive.
>
>>> If no input format is specified, it could be guessed by the
>>> Transform (e.g. telling MathML from LaTeX should be easy)
>
>> Better to require that the format be specified. Explicit is better
>> than implicit.
>
> Then, math-input-format == "auto" could be used to explicitly ask
> for auto-determination by the Transform.

If we ever have more than one supported math input format, we can
revisit this issue. It is premature now, as we have none and someone may
be working on one.

> BTW: how would I specify the math-input-format in case of a role
> (i.e. for inline math)?

Again, if this ever becomes necessary, we'll revisit it. Possibilities:

* :math-latex:`...`, :math-ml:`...`, etc.

* .. math-input-format:: ...
  followed by :math:`...`

> Would it help, if I started writing a specification (in the line of
> ref/rst/directives.txt)?

Yes.

> Should this be a sandbox project (math-directive) or modify the
> main documentation?

I suggest a branch, since it touches the entire codebase.
/branches/mathsupport or /branches/math would be fine.

-- 
David Goodger <http://python.net/~goodger>
|
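The "math-latex" example quoted in the message above (inline math
wrapped in ``$...$``, a math block wrapped in ``\[ ... \]``) amounts to
a very small conversion step. A minimal sketch, assuming a hypothetical
helper ``wrap_latex_math`` that is not part of Docutils:

```python
def wrap_latex_math(content, inline):
    """Add LaTeX math switches around unparsed math content.

    content: the math source exactly as written in the document.
    inline: True for the :math: role, False for the ``.. math::`` directive.
    """
    content = content.strip()
    if inline:
        # Role: inline math mode.
        return "$" + content + "$"
    # Directive: display math on its own lines.
    return "\\[\n" + content + "\n\\]"


print(wrap_latex_math("1 + 1 = 2", inline=True))
print(wrap_latex_math("0 * \\infty = 3", inline=False))
```

In the design discussed here, a step like this would run inside the
Transform, so that the Writer only receives the finished string.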
From: G. M. <mi...@us...> - 2008-05-16 16:11:39
|
On 16.05.08, David Goodger wrote:
> [G. Milde - 2008-05-16 02:33]
>> On 15.05.08, David Goodger wrote:
>>> On Thu, May 15, 2008 at 11:54 AM, G. Milde wrote:

About math-output options:

>> :literal: would explicitly ask for the literal block (or string,
>> in case of a math role), while
>> :raw: would pass-through the content and insert like a raw node.

> Why have "raw" math at all? We already have a "raw" directive. Wouldn't
> they do exactly the same thing?

To mark the content as math in the doctree.

* the pseudoxml writer could have "raw" as default

* without changes to the document, (initially unsupported) Math-ML input
  could be passed as raw to the html-writer (with --math-output raw),
  while inserted as literal block in latex.

...

>> Should the Transform store the converted math in a raw node?
>> + writers will do "the right thing" without changes,
>> - information is lost.
...
> The Transform could add a "math" class to the raw node.
> The alternative is to add a "math" node to Docutils. Problem there is
> that there are many options for the math output: raw, image, literal
> block, maybe others. I think a "math" class is sufficient.

Store the converted math in a raw node then, with class="math".

About multiple input formats:

> If we ever have more than one supported math input format, we can revisit
> this issue. It is premature now, as we have none and someone may be
> working on one.

OK

>> Would it help, if I started writing a specification (in the line of
>> ref/rst/directives.txt)?

> Yes.

>> Should this be a sandbox project (math-directive) or modify the
>> main documentation?

> I suggest a branch, since it touches the entire codebase.
> /branches/mathsupport or /branches/math would be fine.

I have no experience with branches (setup, merging, etc.), so I'll leave
this to others.

Looking at the documentation, I see a lot of ideas with "not yet
implemented" hints. So, maybe the specification can be done in the doc/
trunk?

Have a nice weekend

Günter
|
From: G. M. <mi...@us...> - 2008-05-19 13:34:32
Attachments:
math-support-specification.diff
|
On 16.05.08, David Goodger wrote:
> [G. Milde - 2008-05-16 02:33]
>> On 15.05.08, David Goodger wrote:
>>> On Thu, May 15, 2008 at 11:54 AM, G. Milde wrote:

>> :literal: would explicitly ask for the literal block (or string,
>> in case of a math role), while
>> :raw: would pass-through the content and insert like a raw node.

> Why have "raw" math at all? We already have a "raw" directive. Wouldn't
> they do exactly the same thing?

While the "raw" directive is used to mark a part of the document, the
math-output-format "raw" would not be specified in the document but in
either the Writer configuration or a command line option.

It can be used to pass through content in a specific math format in a
specific case while using the fall-back solution (literal) in all
others, without the need to make this format known to Docutils.

Mind that I do not plan a restriction on the value of the math
``input-format``. However, only a restricted set of input formats
(starting with just "latex") will be handled by the Transform. All other
formats will be included literally, with a message like::

  """math input-format ``%s`` cannot be converted to ``%s``.
  (Specify a supported output format, e.g. ``--output-format=verbatim``,
  to turn off this message).""" % (input_format, output_format)

>>> Better to require that the format be specified. Explicit is better
>>> than implicit.

>> Then, math-input-format == "auto" could be used to explicitly ask
>> for auto-determination by the Transform.

> If we ever have more than one supported math input format, we can revisit
> this issue. It is premature now, as we have none and someone may be
> working on one.

While I agree that it makes sense to start with just one math input
format, I would like to set up the framework for multiple input formats
to facilitate later extension. My proposal should enable the power user
to pass through arbitrary math formats without fear of later
incompatibilities.

>> BTW: how would I specify the math-input-format in case of a role
>> (i.e. for inline math)?

> Again, if this ever becomes necessary, we'll revisit it. Possibilities:
> * :math-latex:`...`, :math-ml:`...`, etc.
> * .. math-input-format:: ...
>   followed by :math:`...`

I prefer the second variant.

>> Would it help, if I started writing a specification (in the line of
>> ref/rst/directives.txt)?

> Yes.

Done.

>> Should this be a sandbox project (math-directive) or modify the
>> main documentation?

> I suggest a branch, since it touches the entire codebase.
> /branches/mathsupport or /branches/math would be fine.

IMO, the specification could be done to the docs/ref/ tree of the trunk
with proper marking of the additions as NOT YET IMPLEMENTED.

I'll attach my additions to the reference documents as a diff between my
working copy and the docutils SVN version. I hope that I did meet the
style and conventions of the docutils documentation and would like to
see comments etc.

Up to now, I only committed to the sandbox and I am not sure whether I
am allowed to commit to trunk (and whether my idea to keep the
specification in the trunk will be agreed upon). Therefore I will not
commit until I receive a go-ahead note, but I am happy with someone else
taking the diff and doing the commit (either as-is or modified).

Günter
|
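The "second variant" preferred above (a ``.. math-input-format::``
directive that applies from that point forward, until another such
directive) can be pictured as a small piece of state carried through the
document. The sketch below is only an illustration under stated
assumptions: the event tuples and ``tag_math_with_input_format`` are
hypothetical stand-ins for the parsed document, lowercasing the format
name is borrowed from the --math-output discussion, and raising an error
when no format was set reflects the "explicit is better than implicit"
position rather than any agreed behaviour.

```python
def tag_math_with_input_format(events):
    """Attach the currently active input format to each math item.

    events: sequence of (kind, value) pairs in document order, where kind
    is "math-input-format" (value: format name) or "math" (value: the
    unparsed math content). Other kinds are ignored.

    Returns a list of (input_format, content) pairs.
    """
    current = None  # no format is active until a directive sets one
    tagged = []
    for kind, value in events:
        if kind == "math-input-format":
            current = value.lower()  # stays active until the next directive
        elif kind == "math":
            if current is None:
                raise ValueError("no math input format specified")
            tagged.append((current, value))
    return tagged


document = [
    ("math-input-format", "LaTeX"),
    ("math", "1 + 1 = 2"),
    ("math-input-format", "mathml"),
    ("math", "<mn>2</mn>"),
]
print(tag_math_with_input_format(document))
```

Since the format travels with each math item, a Transform could later
decide per item whether to convert it or fall back to a literal block.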