Thread: [Docstring-develop] Adding pydps/pysource to CVS
Status: Pre-Alpha
Brought to you by:
goodger
From: David G. <go...@us...> - 2001-09-18 21:37:42
|
Tony, how about adding pydps/pysource to CVS? Have you been able to get CVS access? We could create a "sandbox" directory for playing around, either beside "test" (distributed with the snapshots) or beside "web" (not distributed). Or put the code in a suitable place inside the DPS package itself. -- David Goodger go...@us... Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net |
From: Garth K. <ga...@de...> - 2001-09-19 09:17:17
|
> Tony, how about adding pydps/pysource to CVS? Have you been able to > get CVS access? It isn't there yet. > We could create a "sandbox" directory for playing around, either > beside "test" (distributed with the snapshots) or beside "web" (not > distributed). Or put the code in a suitable place inside the DPS > package itself. I'm -0 to peer of ``test``, +0 to it being a different module under CVS (ie peer of ``web``); pydps *uses* dps, but it's not necessarily *part* of it. Note that I'm not all that worried either way. :) Regards, Garth. |
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-19 09:45:55
|
> Tony, how about adding pydps/pysource to CVS? Have you been able to > get CVS access? In theory I would love to have it in CVS (for a start, I feel somewhat risky about continually updating with no versioning behind me). BUT unfortunately I cannot have CVS running at work, and I often do significant bits of development at my work machine. I guess I should explore getting CVS working at home, but my online connection at home is *seriously* flaky (and the first time I tried to install CVS, for another purpose, I got hopelessly confused - even Debian can't cope with a total lack of understanding!). One possibility (but more work for someone else) might be to "mirror" the files on www.tibsnjoan.co.uk onto sourceforge - but I imagine that would require human intervention. > We could create a "sandbox" directory for playing around, either > beside "test" (distributed with the snapshots) or beside "web" (not > distributed). Or put the code in a suitable place inside the DPS > package itself. I'm not entirely sure I understand the above, so I'll leave it alone... Tibs (singlehandedly holding back the tides of technological progress) -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "How fleeting are all human passions compared with the massive continuity of ducks." - Dorothy L. Sayers, "Gaudy Night" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
From: Garth K. <ga...@de...> - 2001-09-19 10:32:13
|
> BUT unfortunately I cannot have CVS running at work, and I often do > significant bits of development at my work machine. If you can ssh to SourceForge, install the CygWin toolkit -- it doesn't need administrator rights -- and use the SourceForge repository directly. It's pretty easy. > I guess I should explore getting CVS working at home, but my online > connection at home is *seriously* flaky (and the first time I tried to > install CVS, for another purpose, I got hopelessly confused - even > Debian can't cope with a total lack of understanding!). Anyone on the list speak Debian? I'm stuck in RPM-land (thankfully, SuSE). > One possibility (but more work for someone else) might be to "mirror" > the files on www.tibsnjoan.co.uk onto sourceforge - but I imagine that > would require human intervention. Yep, but not that hard. David or I can put it up, and when you release a new tarball we can extract it in-place and commit the updates for you. Regards, Garth. |
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-19 12:09:07
|
Garth Kidd wrote: > If you can ssh to SourceForge, install the CygWin toolkit -- > it doesn't need administrator rights -- and use the > SourceForge repository directly. It's pretty easy. Ah. The problem at work isn't technical but "political". For reasons (which I happen to agree with) of company policy, we can't just install software on our machines without getting permission (or doing evaluation, etc.). (we *do* have ssh available, though - via PuTTY on NT (see below)) We do *have* cygwin available on our NT machines (I believe - we certainly have enough Unix commands available!). Of course, whilst I'm running [my private copy of Python 2.1 - evaluation, see above] on NT, the *directories* containing the pydps stuff are on Unix, and I tend to use rftp on Unix to upload them to ntlworld. Humph. I may talk to Owen (our systems admin guy - who is a friend of the guy who wrote PuTTY, interestingly enough). > Anyone on the list speak Debian? I'm stuck in RPM-land > (thankfully, SuSE). Oh, part of the problem is just that it seems to be very easy to take initial wrong steps with CVS, and once you've done so to get thoroughly confused. It's also that in the hour or two it takes to sort out what one is doing, I could be doing *useful* things, like coding pydps or bathing the kids or ironing or sleeping... > > One possibility (but more work for someone else) might be > > to "mirror" the files on www.tibsnjoan.co.uk onto sourceforge > > - but I imagine that would require human intervention. > > Yep, but not that hard. David or I can put it up, and when > you release a new tarball we can extract it in-place and > commit the updates for you. Hmm - still more work than I'd like someone else to have to do. 
Leave me to think about it for the moment, and I'll get back to you (for instance, it *may* be that Owen actually *does* have CVS around for his own uses, and is just not making it "visible" generally, to save confusion - internally we use RCS and some decade-or-more old stuff written in, erm, not very good Python (not by me!)). Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "How fleeting are all human passions compared with the massive continuity of ducks." - Dorothy L. Sayers, "Gaudy Night" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-21 13:00:38
|
Hi - two problem reports and an upload.

First, I've uploaded a new pydps - it fixes some command line anomalies
(so one can now pretty print a text file!), and the HTML output now
makes a first pass at coping with references/footnotes/etc. (it doesn't
yet fold in targets, because that's plainly a job for the transformer
stage, and I don't have one of those yet!). Anyway, when run over
reStructuredText.txt, the result is now something approaching useful.

Now for the problems:

1. When trying to process reStructuredText.txt, the document produced
   starts off like (output in "pretty mode")::

       <document name="restructuredtext markup specification">
           <title>
               reStructuredText Markup Specification

   This seems wrong to me - surely by the law of least surprise,
   it should actually be:

       <document name="restructuredtext markup specification">
           <section>
               <title>
                   reStructuredText Markup Specification

   After all, the document starts with a title, and everywhere else
   in the document, a title is a signal that one is starting a section.
   (this isn't pure pedantry - it makes it a lot easier for me to
   determine what is going on - I don't *particularly* want to special
   case "document", and I *do* want to be able to cope well with
   documents that *don't* start with a title...)

2. When processing the string module (yes, I know it isn't marked as
   containing reST texts!), the text::

       [1:2]

   (or similar) is incorrectly identified as a link. Not a Good Thing.

   Hmm - taking a text file containing::

       This is a document with no title.

       What happens if we have a Python range, like [1:2] or [a:b]
       (or even [http:fred])?

   and outputting it in "pretty mode" gives us::

       <document>
           <paragraph>
               This is a document with no title.
           <paragraph>
               What happens if we have a Python range, like [1:2] or [
               <link refuri="a:b">
                   a:b
               ] (or even [
               <link refuri="http:fred">
                   http:fred
               ])?

   which is interesting.
Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
From: Ueli S. <u_s...@bl...> - 2001-09-21 16:16:52
|
While trying to get some LaTeX output from text files [#]_, I've
stumbled over your first bug/feature, too. This is what I found after
playing around a bit:

- Say you've got a document that starts with a title that uses a style
  unique to the document. (David uses titles over- and underlined with
  ``=`` in this case.) This title will be the document title::

      ---------------------------
      ================
       Document title
      ================

      The first paragraph
      ---------------------------

  results in::

      <document name="document title">
          <title>
              Document title
          <paragraph>
              The first paragraph

- Now add a title in the same style. You've got two sections, and no
  document title anymore. With ::

      ---------------------------
      ===============
       First section
      ===============

      The first paragraph

      ================
       Second section
      ================

      Where's the document title?
      ---------------------------

  you get this::

      <document>
          <section name="first section">
              <title>
                  First section
              <paragraph>
                  The first paragraph
          <section name="second section">
              <title>
                  Second section
              <paragraph>
                  Where's the document title?

- OR add some text before this paragraph. Same result, no document
  title any more, just section titles. Again an example::

      ---------------------------
      Where's the document title?

      ===============
       First section
      ===============

      ...
      ---------------------------

  produces::

      <document>
          <paragraph>
              Where's the document title?
          <section name="first section">
              <title>
                  First section
              <paragraph>
                  ...

This makes sense to me, so I consider it a feature. I'm actually not
sure how I'd be able to give a title to the document as a whole if the
parser worked as you expected (unless you special-cased the first
section level!) (Explicitly discriminating between a document title
and regular section titles doesn't count here.)

Now, it seems to me that the structures of documents and sections are
close relatives:

- A document may or may not have a title, a section always has one.
- A document may have a subtitle, bibliographic elements, and an abstract. A section has none of these. - The rest of the content follows the same model. Can thus sections be treated as simpler cases of documents (instead of the other way round, which is how I understand your post)? I'm not sure how I would exploit this, though... Hope I'm making sense here... Ueli .. [#] Not sure whether *that* is a good idea (me coding, not LaTeX ;-) Anyway, when I have something useful, I may post it here -- but only if everybody promises not to laugh out loud... "Tony J Ibbs (Tibs)" <to...@ls...> writes: > Hi - two problem reports and an upload. > > First, I've uploaded a new pydps - it fixes some command line anomalies > (so one can now pretty print a text file!), and the HTML output now > makes a first pass at coping with references/footnotes/etc. (it doesn't > yet fold in targets, because that's plainly a job for the transformer > stage, and I don't have one of those yet!). Anyway, when run over > reStructuredText.txt, the result is now something approaching useful. > > Now for the problems: > > 1. When trying to process reStructuredText.txt, the document produced > starts off like (output in "pretty mode"):: > > <document name="restructuredtext markup specification"> > <title> > reStructuredText Markup Specification > > This seems wrong to me - surely by the law of least surprise, > it should actually be: > > <document name="restructuredtext markup specification"> > <section> > <title> > reStructuredText Markup Specification > > After all, the document starts with a title, and everywhere else > in the document, a title is a signal that one is starting a section. > (this isn't pure pedantry - it makes it a lot easier for me to > determine what is going on - I don't *particularly* want to special > case "document", and I *do* want to be able to cope well with > documents that *don't* start with a title...) > [...] |
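Ueli's comparison of documents and sections can be written down as a small data model. This is only a sketch: the class and field names (`Section`, `Document`, `body`, `bibliographic`, `section_as_document`) are hypothetical and are not taken from the DPS code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Section:
    # A section always has a title.
    title: str
    # Shared content model: paragraphs (plain strings here) and nested sections.
    body: List[Union[str, "Section"]] = field(default_factory=list)


@dataclass
class Document:
    # A document may or may not have a title.
    title: Optional[str] = None
    # Only documents carry these extra elements.
    subtitle: Optional[str] = None
    bibliographic: List[str] = field(default_factory=list)
    abstract: Optional[str] = None
    # The rest of the content follows the same model as a section body.
    body: List[Union[str, Section]] = field(default_factory=list)


def section_as_document(section: Section) -> Document:
    """View a section as the simpler case of a document (hypothetical helper)."""
    return Document(title=section.title, body=list(section.body))
```

Seen this way, a section is a document with the optional document-only fields left empty, which is the direction Ueli proposes.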
From: David G. <go...@us...> - 2001-09-22 00:02:43
|
[Tony] > First, I've uploaded a new pydps Great; I'll have to take a look at it. Where should it be installed on the source tree? (Perhaps a short README.txt file?) Tony's problem #1 is regarding a transformation that takes place when a document contains a single section and nothing else (with the possible exception of comments and a bibliographic field list). The section is 'promoted' to the document level. Ueli's analysis is very good, spot-on. This promotion was a conscious decision, for exactly the reasons he states. How else would you specify a document title, especially in a standalone .rtxt file? But I can see why it would be cumbersome when dealing with Python source (PySource). Normally a docstring doesn't have an explicit title; the title gets assigned from the docstring's parent object's name. There will inevitably be odd cases where there is a leading title, however. Perhaps this parser-specific transformation should be made optional [#]_. Then you can treat everything as generic sections until integration is complete. .. [#] Along with bibliographic field list interpretation, RCS keyword filtering, and whatever other conveniences we dream up. [Ueli] > Now, it seems to me that the structures of documents and sections > are close relatives: > > - A document may or may not have a title, a section always has one. It is actually intended that by the time the document tree gets to the writer, it must have a title. The parser can't always determine the title by itself, such as in PySource mode. The PySource reader is expected to supply all the titles as appropriate. > - A document may have a subtitle, bibliographic elements, and an > abstract. A section has none of these. > - The rest of the content follows the same model. Correct. > Can thus sections be treated as simpler cases of documents (instead of > the other way round, which is how I understand your post)? I'm not > sure how I would exploit this, though... 
Basically, yes, sections are simple sub-documents. The top-level document does need to be special-cased in the end however. HTML pages need their titles! [Tony] > 2. When processing the string module (yes, I know it isn't marked as > containing reST texts!), the text:: > > [1:2] > > (or similar) is incorrectly identified as a link. Not a Good Thing. ``1:2`` won't be a link (as your example confirms), but ``a:b`` will be. Why? Because according to RFC2396, 'a' could be a URI scheme (as in 'http' or 'mailto'). The solution? Use inline literals. [Ueli] > Anyway, when I have something useful, I may post it here Yes, please! Your explanation of the situation was quite eloquent. > -- but only if everybody promises not to laugh out loud... Never! <aghast> -- David Goodger go...@us... Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net |
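David's explanation of why ``a:b`` is taken for a link while ``1:2`` is not can be illustrated with the scheme grammar from RFC 2396 (a scheme starts with a letter, followed by letters, digits, "+", "-" or "."). This is a minimal sketch, not the pattern the reStructuredText parser actually uses; the names `SCHEME` and `could_be_uri` are made up for the example.

```python
import re

# RFC 2396, section 3.1: scheme = alpha *( alpha | digit | "+" | "-" | "." )
# Hypothetical pattern for illustration; the real parser may differ.
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def could_be_uri(text: str) -> bool:
    """Return True if text starts with something shaped like a URI scheme."""
    return SCHEME.match(text) is not None


for candidate in ("1:2", "a:b", "http:fred"):
    print(candidate, could_be_uri(candidate))
```

A leading digit cannot start a scheme, so ``1:2`` is rejected, while a single letter such as ``a`` is a syntactically valid scheme. Writing the range as an inline literal, as David suggests, keeps it away from URI recognition altogether.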
From: Ueli S. <u_s...@bl...> - 2001-09-22 07:25:55
|
[David]
> [Ueli]
> > Now, it seems to me that the structures of documents and sections
> > are close relatives:
> >
> > - A document may or may not have a title, a section always has one.
>
> It is actually intended that by the time the document tree gets to the
> writer, it must have a title. The parser can't always determine the
> title by itself, such as in PySource mode. The PySource reader is
> expected to supply all the titles as appropriate.

For PySource mode, this is certainly how it should work. However, I'm
not sure whether it is always the desired behaviour. I believe that
standalone rtxt files will often not have a formal document title, just
a few sections. The reader/linker can't make up a sensible guess in
this case, and IIRC one important goal is to not try to outsmart the
user. What would be a sensible default title, anyway?

Which leads me to believe that the document title should be left
optional (but see filename_). BTW, that's what dps/spec/gpdi.dtd says,
too::

    <!-- Optional elements may be generated by internal processing. -->
    <!ELEMENT document
        ((title, subtitle?)?, (%bibliographic.elements;)*, abstract?,
         %structure.model;)>
    <!ATTLIST document %basic.atts;>

[David]
> Basically, yes, sections are simple sub-documents. The top-level
> document does need to be special-cased in the end however. HTML pages
> need their titles!

... and other types of output can merrily do without them. LaTeX
vs. HTML seems to be a good example here! In LaTeX, I'd definitely use
the document title as an argument to ``\title{...}`` and leave the
``\title{...}`` out if the document had none. In HTML, though,
<title>...</title> elements aren't displayed (AFAIK, I'm not fluent in
HTML). Not knowing what they're meant for, I'd be perfectly
comfortable with something generated in this case, like the
[filename]_ or something along the lines of "Document generated by
pydps" (which is what my pydps/html.py does).

..
[filename] The source filename isn't known to the writer, is it? Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I definitely don't want ``\title{filename}`` in LaTeX. How about giving the title a "generated" attribute? Then it's left to the writer to use (or ignore) it, but any document could be required to have a title. (Which would mean to update the DTD.) (BTW, my first idea was to add a "sourcefile=filename.rtxt" attribute to the document. I like the "generated" much better, though!) Ueli |
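Ueli's "generated" attribute could work roughly as follows. This is a sketch under stated assumptions: the `Title` class, its `generated` flag, and the two writer functions are hypothetical names for illustration, and LaTeX escaping of the title text is deliberately left out.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Title:
    text: str
    # Hypothetical flag: True when the title was made up by a reader
    # (for example from the filename) rather than written by the author.
    generated: bool = False


def html_title(title: Title) -> str:
    # HTML always wants a <title>, so a generated one is acceptable.
    return "<title>%s</title>" % escape(title.text)


def latex_title(title: Title) -> str:
    # LaTeX leaves \title{...} out unless the author supplied the title.
    # (No LaTeX escaping here; this is only a sketch.)
    if title.generated:
        return ""
    return "\\title{%s}" % title.text
```

The tree can then always carry a title, and each writer decides for itself whether a generated one is worth showing.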
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-24 09:36:31
|
David Goodger wrote: > [Tony] > > First, I've uploaded a new pydps > > Great; I'll have to take a look at it. Where should it be installed on > the source tree? (Perhaps a short README.txt file?) At the moment, it doesn't need to be anywhere particular - it just imports things from DPS and restructuredtext (easy if they've been installed). I'll try to coin a README.txt file sometime soon, since this thing begins to approach usability (hah, until I restructure it under the new scheme!) > [Tony] > > 2. When processing the string module (yes, I know it isn't marked as > > containing reST texts!), the text:: > > > > [1:2] > > > > (or similar) is incorrectly identified as a link. Not a > Good Thing. > > ``1:2`` won't be a link (as your example confirms), but ``a:b`` will > be. Why? Because according to RFC2396, 'a' could be a URI scheme (as > in 'http' or 'mailto'). The solution? Use inline literals. Hmm. Will `a:b` be treated as a URI? (I haven't tested it). Is ``a:b`` *really* likely to be a sensible URI, given that ``a`` is entirely "local"? Should we be treating with the whole possible gamut of URIs, or restricting ourselves to those most likely? And other exciting questions I shall try not to think too hard about for the moment... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-24 09:30:20
|
Ueli Schläpfer gave a good explanation of why David Goodger had chosen
to do something I didn't understand - namely make a document with a
title at the start have no section within it (except it was subtler
than that).

By the way, I did check the documentation, and it seemed to me that the
current documentation indicated that a title would cause a section to
be started - so David, if you want to perform this promotion, then it
needs to be documented (unless I've missed it *again*).

On the other hand, I don't actually see why the DPS system should have
to do the promotion for the user, when it's not clear that it is always
wanted (in other words, why is it up to the DPS system to decide that
the case of a single section is special, and then have a sudden
disjunction in its behaviour for two sections - that's my inner pedant
objecting!).

Regardless, an immediate solution to resolve the docstring case (and
possibly a useful thing to do anyway) would be to have an argument to
the Parser that states upfront that we are working on a document
*fragment* - that is, something that is going to be "stitched in" by
hand, later on, to an existing DPS tree. I would imagine that we may
come across other actions in the future that are sensible in the
context of a full document, but not in the case of a fragment.

As to HTML. The normal convention in HTML is that the document title
(that is, the thing in <title> ) be the same as the <h1> title, and that
there be one (and only one) <h1> in a document. The <title> is then
displayed on the window as decoration (e.g., as the text on the top of
the browser window). This is a strong convention, but is (of course)
not part of the standard. A relevant cutting from HTML4 is:

    Every HTML document must have a TITLE element in the HEAD section.
    Authors should use the TITLE element to identify the contents of a
    document. Since users often consult documents out of context,
    authors should provide context-rich titles. Thus, instead of a
    title such as "Introduction", which doesn't provide much contextual
    background, authors should supply a title such as "Introduction to
    Medieval Bee-Keeping" instead. For reasons of accessibility, user
    agents must always make the content of the TITLE element available
    to users (including TITLE elements that occur in frames). The
    mechanism for doing so depends on the user agent (e.g., as a
    caption, spoken).

Broadly, HTML common practice treats a document as having a single title
at the top, which is used for both <title> and <h1>, and the "section
hierarchy" (if any) starts with an <h2>.

That makes HTML easy - it doesn't, of course, address any of Ueli's
other points.

Maybe (horrors) we should reserve one specific markup form to mean
"overall title"::

    ==============
    Document title
    ==============

    ===================
    This is not allowed
    ===================

(because it is a form reserved for the document title, of which we may
only have one). Or perhaps we'll have to resort to::

    :Title: Document title

Somehow, I don't see David liking either of those...

Anyway, to some specific comments:

Ueli wrote:
> This makes sense to me, so I consider it a feature. I'm actually not
> sure how I'd be able to give a title to the document as a whole if the
> parser worked as you expected (unless you special-cased the first
> section level!) (Explicitly discriminating between a document title
> and regular section titles doesn't count here.)

Hmm. Having concentrated on the HTML case (sorry, it's what I've been
working on) I hadn't seen the distinction, of course.

My problem is that I'm trying to write formatters for *any* document
that might come in (yes, I know I'm writing pydps/pysource, but I want
the Writer to work for any document), so we have to be able to cope
with:

1. Document with no titles at all
2. Document with one title (OK - David does that)
3. Document with more than one title (at the same level)
   - which in essence *really* resolves back to case 1.

I'm afraid that the only "perfect" solution I can see for that (in the sense of *predictable*) is to require the user to indicate that they *do* have a document title, and that it is *this* thing, here. That then makes them aware of the problem, also, which I think is a necessary thing (otherwise, surprise will eventuate). Ueli wrote: > Now, it seems to me that the structures of documents and sections are > close relatives: > > - A document may or may not have a title, a section always has one. > - A document may have a subtitle, bibliographic elements, and an > abstract. A section has none of these. > - The rest of the content follows the same model. > > Can thus sections be treated as simpler cases of documents (instead of > the other way round, which is how I understand your post)? I'm not > sure how I would exploit this, though... I'm not sure either (nor if one should), but your analysis is clearly correct. Again, I've been too focused on the HTML case. David Goodger wrote: > It is actually intended that by the time the document tree > gets to the writer, it must have a title. The parser can't > always determine the title by itself, such as in PySource > mode. The PySource reader is expected to supply all the > titles as appropriate. Hmm. In PySource mode, the parser should not be trying to introduce titles - it is, after all, handling arbitrary document fragments, and can't know anything about their global scope (unless it is told!). *If* the final tree shall always have a title, where does it come from if the document author didn't provide one? Surely in that case it is not up to the *parser* to decide on what a title should be - that is up to the application. So one has three options: 1. The parser makes one up (yuck) 2. The application makes one up (yuck) 3. 
An error is generated (yuck) I'd vote for 3, not least for ease of explanation (I cite "there are no complex rules about making up document titles" versus "well, if the document is one section, we'll use the section title as a title, and not have any sections, but if the document is zero sections we'll do XXX and if the document is two or more sections we'll do YYY" - sorry to harp on the point!). Ueli seems, to me, to be partially arguing for the case I want in his next message. He also writes: > The source filename isn't known to the writer, is it? Well, it is in the pysource/pydps case (and I don't see why it shouldn't be elsewhere) - I attach a "filename" attribute to the Package and Module elements (which later on will become appropriate DPS structures, of course). > Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I > definitely don't want ``\title{filename}`` in LaTeX. How about > giving the title a "generated" attribute? Then it's left to the > writer to use (or ignore) it, but any document could be required to > have a title. (Which would mean to update the DTD.) *If* David still really wants to produce a title of his own, then yes, that's a good distinction to make. > (BTW, my first idea was to add a "sourcefile=filename.rtxt" > attribute to the document. I like the "generated" much better, > though!) I think that the sourcefile as an optional attribute on the document is probably a useful thing, as well. Anyway, this is difficult stuff (in terms of folding it in to something easy to use and remember, not in terms of implementing!), so I await David's responses with interest. Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) |
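The HTML convention Tony describes (one document title used for both <title> and <h1>, with the section hierarchy starting at <h2>) might be sketched like this. The `(title, subsections)` tuples are a hypothetical stand-in for a DPS tree, and `render_headings` is a made-up name, not part of pydps.

```python
from html import escape
from typing import List


def render_headings(doc_title: str, sections: list) -> List[str]:
    """Emit <title> and <h1> from the document title, sections from <h2> down.

    `sections` is a list of (title, subsections) tuples, a hypothetical
    stand-in for a tree of <section> elements.
    """
    out = ["<title>%s</title>" % escape(doc_title),
           "<h1>%s</h1>" % escape(doc_title)]

    def walk(items: list, level: int) -> None:
        for title, children in items:
            # HTML only defines h1..h6, so clamp deeper nesting at h6.
            tag = "h%d" % min(level, 6)
            out.append("<%s>%s</%s>" % (tag, escape(title), tag))
            walk(children, level + 1)

    walk(sections, 2)
    return out
```

This only works when the writer has been handed a document title, which is the point of contention in the rest of the thread.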
From: Ueli S. <u_s...@bl...> - 2001-09-25 17:47:00
|
(Warning: Some of what follows may not be sorted out well enough -- I
don't have the time to write and rewrite as much as I should, but I
don't want to miss the opportunity to add my 2 cents worth before
going off line for a couple days...)

[Tony]
> As to HTML. The normal convention in HTML is that the document title
> (that is, the thing in <title> ) be the same as the <h1> title, and that
> there be one (and only one) <h1> in a document. The <title> is then
[...]
> at the top, which is used for both <title> and <h1>, and the "section
> hierarchy" (if any) starts with an <h2>.

Thanks for the explanations! I don't see how it would make HTML
easier, though, as the final output is [...]

>
> My problem is that I'm trying to write formatters for *any* document
> that might come in (yes, I know I'm writing pydps/pysource, but I want
> the Writer to work for any document), so we have to be able to cope
> with:
>
> 1. Document with no titles at all
> 2. Document with one title (OK - David does that)
> 3. Document with more than one title (at the same level)
>    - which in essence *really* resolves back to case 1.
>
> I'm afraid that the only "perfect" solution I can see for that (in the
> sense of *predictable*) is to require the user to indicate that they
> *do* have a document title, and that it is *this* thing, here. That then
> makes them aware of the problem, also, which I think is a necessary
> thing (otherwise, surprise will eventuate).
>
[...]
> Hmm. In PySource mode, the parser should not be trying to introduce
> titles - it is, after all, handling arbitrary document fragments, and
> can't know anything about their global scope (unless it is told!).
>
> *If* the final tree shall always have a title, where does it come from
> if the document author didn't provide one? Surely in that case it is not
> up to the *parser* to decide on what a title should be - that is up to
> the application. So one has three options:
>
> 1. The parser makes one up (yuck)
> 2. The application makes one up (yuck)
> 3. An error is generated (yuck)
>
> I'd vote for 3, not least for ease of explanation (I cite "there are no
> complex rules about making up document titles" versus "well, if the
> document is one section, we'll use the section title as a title, and not
> have any sections, but if the document is zero sections we'll do XXX and
> if the document is two or more sections we'll do YYY" - sorry to harp on
> the point!).

You've a point here... Still, I believe the promotion is a Good Thing
at least for standalone documents. Let me try to reformulate:

The first title becomes the document title if

- it's at the very beginning of the document and
- it's the only title at this level.

Does that sound too complicated? Once I realized how it worked, I
went "oh, cool!" and wouldn't miss this feature now.

> Ueli seems, to me, to be partially arguing for the case I want in his
> next message. He also writes:

I don't have a firm point of view yet, and I second your point that
it's not a trivial issue. My current opinion is that it really
depends on what input we're processing -- what is good for docstring
processing may be wrong for standalone documents and vice versa.

Hence I believe that the promotion of a lone first section title
should be optional. Similarly, specifying and enforcing whether a
document needs a title or not should be left to the writer, as this
depends on the output format and possibly on the context the system is
used in, and the application, as it may also depend on context
entirely *outside* the DPS. Which obviously opens up a whole new
discussion, namely on how to pass such context information down to the
DPS...

> > The source filename isn't known to the writer, is it?
>
> Well, it is in the pysource/pydps case (and I don't see why it shouldn't
> be elsewhere) - I attach a "filename" attribute to the Package and
> Module elements (which later on will become appropriate DPS structures,
> of course).
> > > Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I > > definitely don't want ``\title{filename}`` in LaTeX. How about > > giving the title a "generated" attribute? Then it's left to the > > writer to use (or ignore) it, but any document could be required to > > have a title. (Which would mean to update the DTD.) > > *If* David still really wants to produce a title of his own, then yes, > that's a good distinction to make. It's not enough, though -- the writer still needs to be told by the application whether it may use it or not. Again, a docstring vs. standalone-document issue (the document title for a docstring will probably be generated by the reader, but should be respected). But I really have to go now... Hope I was clear enough! Ueli |
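Ueli's reformulated rule (the first title becomes the document title only if it opens the document and is the only title at its level) can be sketched as an optional tree transformation. Everything here is an assumption for illustration: the `Section` and `Document` classes, the `promote_lone_section` name, and the `enabled` switch are hypothetical, and comments or bibliographic field lists, which the real promotion tolerates, are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Section:
    title: str
    children: List[Union[str, "Section"]] = field(default_factory=list)


@dataclass
class Document:
    title: Optional[str] = None
    children: List[Union[str, Section]] = field(default_factory=list)


def promote_lone_section(doc: Document, enabled: bool = True) -> Document:
    """Promote a lone top-level section's title to the document title.

    With enabled=False (say, for docstring fragments) the tree is
    returned unchanged, so everything stays a generic section.
    """
    if not enabled or doc.title is not None:
        return doc
    # The section must be the only top-level child: no text before it
    # and no sibling section at the same level.
    if len(doc.children) == 1 and isinstance(doc.children[0], Section):
        only = doc.children[0]
        return Document(title=only.title, children=list(only.children))
    return doc
```

A document with two same-level sections, or with a paragraph before the first section, passes through untouched, which matches the three cases Ueli showed earlier in the thread.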
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-26 09:54:27
|
I started this to reply to Ueli, and then realised that my response was essentially just: I think that David's approach (as I understand it) of providing optional methods to produce the effect wanted is probably the best way forward. but that I wanted to diverge sideways... I'm aiming to refactor pydps (bouncy fun - and maybe produce pysource out of it, who knows) to follow David's Reader/Transformer/Writer model in the next week or so (I've not got very far yet, though, apart from thinking about it lots). I *actually* expect to have a flowline that's something like (sorry, no ASCII art): 1. Reader - uses compiler [1]_ to extract information from the Python source code - calls the reST parser on any docstrings that need it - combs_ any of the resultant docstrings 2. Transformer - (actually, the last two steps from the Reader might arguably go here - we need to do the docstring parsing after we *know* we've found any ``__docformat__`` value, after all) - produces a DPS tree from the Python information - combs_ it as required [2]_ 3. Writer - outputs the DPS tree as (in the first instance) HTML Since the HTML Writer would be handling a pure DPS tree, it would then be a good candidate for moving out of pysource into the main docutils tree (if we have one by then). I rather hope that at least the FootnoteComb would also be such a candidate. Combs ----- The metaphor of combing through hair to remove tangles is a bit iffy, but I like the term (I think David calls them Filters, which is less obvious to me). I'm expecting to have a series of combs which can be run on a DPS tree or subtree. Obvious ones are: TitleComb This does what David wants to produce a title. I suspect that it raises an exception if it can't do so, at which point it is up to the caller to do something sensible (either ignore the problem, or provide a default title). 
FootnoteComb
    This runs over a subtree (clearly for Python code, we don't want to
    run it over anything bigger than a docstring, lest we confuse
    footnotes!) and sorts out the numbering (in my development version
    of pydps (not on the web yet) this is actually done as part of the
    HTML output phase, which is clearly the Wrong Place for it).

    David - a question or two on this. Each autonumbered
    footnote/footnote reference has the attribute 'auto' set to "1". I
    want to *insert* actual footnote numbers into the tree. I can just
    add a new attribute 'auto-number' into elements as required, *or* I
    could ask that you set 'auto' to be "-1" for the "no number yet"
    case, and use 'auto' to store the *actual* number calculated. Which
    to do is a style issue, so I'd prefer to leave it up to you (but
    using the same attribute would make things a bit neater in the
    code - I'm not sure if it would generate as elegant XML, though -
    I'd need to look up the detailed attribute present/absent rules).

ContentsComb
    This runs over the entire tree, and locates <section> elements. It
    produces a <contents> subtree, which can be inserted at the
    appropriate place, with links to the <section>s. It needs to make
    sure that the links it uses are *real*, so ideally it will use the
    "implicit" link for a section when it exists, and it will have to
    invent one when the implicit link isn't there (presumably because
    the section is the twelfth "Introduction" in the document...).

LinksComb
    This handles the indirect hyperlinks. It probably comes in two
    phases, because in a Python context we need to *resolve* them on a
    per-docstring basis, but if the user is trying to do the callout
    form of presentation, they would then want to group them all at the
    end of the document.

.. [1] I note that in the CVS for Python, the compiler module is now in
   the standard library - hurrah!
   And it looks like the bug that stopped the 1.2aX compiler module
   from *working* has also been fixed (just as I was about to find out
   how to submit a bug report - even more hurrah!)

.. [2] does that count as verbing a noun?

Tibs

(So much to do, so little etc. And I've just been given news that I'm
to start on a *really exciting* project for paid work, as well.)

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
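
[Editor's sketch, not part of the original message.] The FootnoteComb
idea in the message above could look roughly like the following. This
is only a minimal illustration under several assumptions: the standard
library's ``xml.etree.ElementTree`` stands in for a DPS tree, the
element names ``footnote`` and ``footnote_reference`` are guesses, and
the function name ``footnote_comb`` is hypothetical. It takes the first
of the two options floated in the message (a separate 'auto-number'
attribute, leaving 'auto' untouched), and it assumes that the Nth
autonumbered reference pairs with the Nth autonumbered footnote within
one docstring.

```python
import xml.etree.ElementTree as ET


def footnote_comb(docstring_tree):
    """Number autonumbered footnotes and references in one subtree.

    Hypothetical sketch: the element names and the 'auto-number'
    attribute are assumptions, not the real DPS API.
    """
    # Footnotes and references are counted separately, in document
    # order, so the Nth reference ends up with the same number as the
    # Nth footnote.
    for tag in ("footnote", "footnote_reference"):
        number = 0
        for element in docstring_tree.iter(tag):
            if element.get("auto") == "1":
                number += 1
                element.set("auto-number", str(number))
    return docstring_tree


# An invented docstring subtree with two autonumbered footnotes.
SAMPLE = """
<docstring>
  <paragraph>Uses compiler <footnote_reference auto="1"/> and
  combs <footnote_reference auto="1"/> the result.</paragraph>
  <footnote auto="1"><paragraph>Now in the standard library.</paragraph></footnote>
  <footnote auto="1"><paragraph>Verbing a noun.</paragraph></footnote>
</docstring>
"""

tree = footnote_comb(ET.fromstring(SAMPLE.strip()))
numbers = [(e.tag, e.get("auto-number")) for e in tree.iter() if e.get("auto")]
print(numbers)
```

Running the comb once per docstring subtree, rather than over the whole
module tree, keeps the numbering local to each docstring, which is the
concern raised in the message about not confusing footnotes.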