Thread: [Docstring-develop] Adding pydps/pysource to CVS
|
From: David G. <go...@us...> - 2001-09-18 21:37:42
|
Tony, how about adding pydps/pysource to CVS? Have you been able to
get CVS access?

We could create a "sandbox" directory for playing around, either
beside "test" (distributed with the snapshots) or beside "web" (not
distributed). Or put the code in a suitable place inside the DPS
package itself.

--
David Goodger go...@us...
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net
|
|
From: Garth K. <ga...@de...> - 2001-09-19 09:17:17
|
> Tony, how about adding pydps/pysource to CVS? Have you been able to
> get CVS access?

It isn't there yet.

> We could create a "sandbox" directory for playing around, either
> beside "test" (distributed with the snapshots) or beside "web" (not
> distributed). Or put the code in a suitable place inside the DPS
> package itself.

I'm -0 to peer of ``test``, +0 to it being a different module under
CVS (ie peer of ``web``); pydps *uses* dps, but it's not necessarily
*part* of it.

Note that I'm not all that worried either way. :)

Regards,
Garth.
|
|
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-19 09:45:55
|
> Tony, how about adding pydps/pysource to CVS? Have you been able to
> get CVS access?

In theory I would love to have it in CVS (for a start, I feel somewhat
risky about continually updating with no versioning behind me).

BUT unfortunately I cannot have CVS running at work, and I often do
significant bits of development at my work machine.

I guess I should explore getting CVS working at home, but my online
connection at home is *seriously* flaky (and the first time I tried to
install CVS, for another purpose, I got hopelessly confused - even
Debian can't cope with a total lack of understanding!).

One possibility (but more work for someone else) might be to "mirror"
the files on www.tibsnjoan.co.uk onto sourceforge - but I imagine that
would require human intervention.

> We could create a "sandbox" directory for playing around, either
> beside "test" (distributed with the snapshots) or beside "web" (not
> distributed). Or put the code in a suitable place inside the DPS
> package itself.

I'm not entirely sure I understand the above, so I'll leave it alone...

Tibs (singlehandedly holding back the tides of technological progress)

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
|
From: Garth K. <ga...@de...> - 2001-09-19 10:32:13
|
> BUT unfortunately I cannot have CVS running at work, and I often do
> significant bits of development at my work machine.

If you can ssh to SourceForge, install the CygWin toolkit -- it
doesn't need administrator rights -- and use the SourceForge
repository directly. It's pretty easy.

> I guess I should explore getting CVS working at home, but my online
> connection at home is *seriously* flaky (and the first time I tried to
> install CVS, for another purpose, I got hopelessly confused - even
> Debian can't cope with a total lack of understanding!).

Anyone on the list speak Debian? I'm stuck in RPM-land (thankfully,
SuSE).

> One possibility (but more work for someone else) might be to "mirror"
> the files on www.tibsnjoan.co.uk onto sourceforge - but I imagine that
> would require human intervention.

Yep, but not that hard. David or I can put it up, and when you release
a new tarball we can extract it in-place and commit the updates for
you.

Regards,
Garth.
|
|
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-19 12:09:07
|
Garth Kidd wrote:
> If you can ssh to SourceForge, install the CygWin toolkit --
> it doesn't need administrator rights -- and use the
> SourceForge repository directly. It's pretty easy.

Ah. The problem at work isn't technical but "political". For reasons
(which I happen to agree with) of company policy, we can't just install
software on our machines without getting permission (or doing
evaluation, etc.).

(we *do* have ssh available, though - via PuTTY on NT (see below))

We do *have* cygwin available on our NT machines (I believe - we
certainly have enough Unix commands available!).

Of course, whilst I'm running [my private copy of Python 2.1 -
evaluation, see above] on NT, the *directories* containing the pydps
stuff are on Unix, and I tend to use rftp on Unix to upload them to
ntlworld.

Humph. I may talk to Owen (our systems admin guy - who is a friend of
the guy who wrote PuTTY, interestingly enough).

> Anyone on the list speak Debian? I'm stuck in RPM-land
> (thankfully, SuSE).

Oh, part of the problem is just that it seems to be very easy to take
initial wrong steps with CVS, and once you've done so to get thoroughly
confused. It's also that in the hour or two it takes to sort out what
one is doing, I could be doing *useful* things, like coding pydps or
bathing the kids or ironing or sleeping...

> > One possibility (but more work for someone else) might be
> > to "mirror" the files on www.tibsnjoan.co.uk onto sourceforge
> > - but I imagine that would require human intervention.
>
> Yep, but not that hard. David or I can put it up, and when
> you release a new tarball we can extract it in-place and
> commit the updates for you.

Hmm - still more work than I'd like someone else to have to do.

Leave me to think about it for the moment, and I'll get back to you
(for instance, it *may* be that Owen actually *does* have CVS around
for his own uses, and is just not making it "visible" generally, to
save confusion - internally we use RCS and some decade-or-more old
stuff written in, erm, not very good Python (not by me!)).

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
|
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-21 13:00:38
|
Hi - two problem reports and an upload.
First, I've uploaded a new pydps - it fixes some command line anomalies
(so one can now pretty print a text file!), and the HTML output now
makes a first pass at coping with references/footnotes/etc. (it doesn't
yet fold in targets, because that's plainly a job for the transformer
stage, and I don't have one of those yet!). Anyway, when run over
reStructuredText.txt, the result is now something approaching useful.
Now for the problems:
1. When trying to process reStructuredText.txt, the document produced
starts off like (output in "pretty mode")::
    <document name="restructuredtext markup specification">
        <title>
            reStructuredText Markup Specification
This seems wrong to me - surely by the law of least surprise,
it should actually be:
    <document name="restructuredtext markup specification">
        <section>
            <title>
                reStructuredText Markup Specification
After all, the document starts with a title, and everywhere else
in the document, a title is a signal that one is starting a section.
(this isn't pure pedantry - it makes it a lot easier for me to
determine what is going on - I don't *particularly* want to special
case "document", and I *do* want to be able to cope well with
documents that *don't* start with a title...)
2. When processing the string module (yes, I know it isn't marked as
containing reST texts!), the text::
[1:2]
(or similar) is incorrectly identified as a link. Not a Good Thing.
Hmm - taking a text file containing::
This is a document with no title.
What happens if we have a Python range, like [1:2] or [a:b]
(or even [http:fred])?
and outputting it in "pretty mode" gives us::
    <document>
        <paragraph>
            This is a document with no title.
        <paragraph>
            What happens if we have a Python range, like [1:2] or [
            <link refuri="a:b">
                a:b
            ]
            (or even [
            <link refuri="http:fred">
                http:fred
            ])?
which is interesting.
Tibs
--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
|
From: Ueli S. <u_s...@bl...> - 2001-09-21 16:16:52
|
While trying to get some LaTeX output from text files [#]_, I've
stumbled over your first bug/feature, too. This is what I found after
playing around a bit:
- Say you've got a document that starts with a title that uses a style
unique to the document. (David uses titles over- and underlined
with ``=`` in this case.) This title will be the document title::
---------------------------
================
Document title
================
The first paragraph
---------------------------
results in::
    <document name="document title">
        <title>
            Document title
        <paragraph>
            The first paragraph
- Now add a title in the same style. You've got two sections, and no
document title anymore. With ::
---------------------------
===============
First section
===============
The first paragraph
================
Second section
================
Where's the document title?
---------------------------
you get this::
    <document>
        <section name="first section">
            <title>
                First section
            <paragraph>
                The first paragraph
        <section name="second section">
            <title>
                Second section
            <paragraph>
                Where's the document title?
- OR add some text before this paragraph. Same result, no document
title any more, just section titles. Again an example::
---------------------------
Where's the document title?
===============
First section
===============
...
---------------------------
produces::
    <document>
        <paragraph>
            Where's the document title?
        <section name="first section">
            <title>
                First section
            <paragraph>
                ...
This makes sense to me, so I consider it a feature. I'm actually not
sure how I'd be able to give a title to the document as a whole if the
parser worked as you expected (unless you special-cased the first
section level!) (Explicitly discriminating between a document title
and regular section titles doesn't count here.)
Now, it seems to me that the structures of documents and sections are
close relatives:
- A document may or may not have a title, a section always has one.
- A document may have a subtitle, bibliographic elements, and an
abstract. A section has none of these.
- The rest of the content follows the same model.
Can thus sections be treated as simpler cases of documents (instead of
the other way round, which is how I understand your post)? I'm not
sure how I would exploit this, though...
Hope I'm making sense here...
Ueli
.. [#] Not sure whether *that* is a good idea (me coding, not
LaTeX ;-) Anyway, when I have something useful, I may post it
here -- but only if everybody promises not to laugh out loud...
"Tony J Ibbs (Tibs)" <to...@ls...> writes:
> Hi - two problem reports and an upload.
>
> First, I've uploaded a new pydps - it fixes some command line anomalies
> (so one can now pretty print a text file!), and the HTML output now
> makes a first pass at coping with references/footnotes/etc. (it doesn't
> yet fold in targets, because that's plainly a job for the transformer
> stage, and I don't have one of those yet!). Anyway, when run over
> reStructuredText.txt, the result is now something approaching useful.
>
> Now for the problems:
>
> 1. When trying to process reStructuredText.txt, the document produced
> starts off like (output in "pretty mode")::
>
>     <document name="restructuredtext markup specification">
>         <title>
>             reStructuredText Markup Specification
>
> This seems wrong to me - surely by the law of least surprise,
> it should actually be:
>
>     <document name="restructuredtext markup specification">
>         <section>
>             <title>
>                 reStructuredText Markup Specification
>
> After all, the document starts with a title, and everywhere else
> in the document, a title is a signal that one is starting a section.
> (this isn't pure pedantry - it makes it a lot easier for me to
> determine what is going on - I don't *particularly* want to special
> case "document", and I *do* want to be able to cope well with
> documents that *don't* start with a title...)
>
[...]
|
|
From: David G. <go...@us...> - 2001-09-22 00:02:43
|
[Tony]
> First, I've uploaded a new pydps

Great; I'll have to take a look at it. Where should it be installed on
the source tree? (Perhaps a short README.txt file?)

Tony's problem #1 is regarding a transformation that takes place when
a document contains a single section and nothing else (with the
possible exception of comments and a bibliographic field list). The
section is 'promoted' to the document level.

Ueli's analysis is very good, spot-on. This promotion was a conscious
decision, for exactly the reasons he states. How else would you
specify a document title, especially in a standalone .rtxt file?

But I can see why it would be cumbersome when dealing with Python
source (PySource). Normally a docstring doesn't have an explicit
title; the title gets assigned from the docstring's parent object's
name. There will inevitably be odd cases where there is a leading
title, however. Perhaps this parser-specific transformation should be
made optional [#]_. Then you can treat everything as generic sections
until integration is complete.

.. [#] Along with bibliographic field list interpretation, RCS keyword
   filtering, and whatever other conveniences we dream up.

[Ueli]
> Now, it seems to me that the structures of documents and sections
> are close relatives:
>
> - A document may or may not have a title, a section always has one.

It is actually intended that by the time the document tree gets to the
writer, it must have a title. The parser can't always determine the
title by itself, such as in PySource mode. The PySource reader is
expected to supply all the titles as appropriate.

> - A document may have a subtitle, bibliographic elements, and an
>   abstract. A section has none of these.
> - The rest of the content follows the same model.

Correct.

> Can thus sections be treated as simpler cases of documents (instead of
> the other way round, which is how I understand your post)? I'm not
> sure how I would exploit this, though...

Basically, yes, sections are simple sub-documents. The top-level
document does need to be special-cased in the end however. HTML pages
need their titles!

[Tony]
> 2. When processing the string module (yes, I know it isn't marked as
>    containing reST texts!), the text::
>
>        [1:2]
>
>    (or similar) is incorrectly identified as a link. Not a Good Thing.

``1:2`` won't be a link (as your example confirms), but ``a:b`` will
be. Why? Because according to RFC2396, 'a' could be a URI scheme (as
in 'http' or 'mailto'). The solution? Use inline literals.

[Ueli]
> Anyway, when I have something useful, I may post it here

Yes, please! Your explanation of the situation was quite eloquent.

> -- but only if everybody promises not to laugh out loud...

Never! <aghast>

--
David Goodger go...@us...
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net
|
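[Editorial sketch] The ``1:2`` versus ``a:b`` behaviour follows from the
scheme grammar in RFC 2396, which requires a scheme to begin with a
letter. The snippet below is only an illustration of that rule, not the
reStructuredText parser's own code; the name ``looks_like_uri`` is
hypothetical.

```python
import re

# RFC 2396, section 3.1:
#   scheme = alpha *( alpha | digit | "+" | "-" | "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def looks_like_uri(text):
    """Return True if `text` has the shape <scheme>:<non-empty rest>.

    Hypothetical helper: it checks only the scheme grammar, which is
    why something as short as "a:b" qualifies.
    """
    scheme, sep, rest = text.partition(":")
    return bool(sep and rest and SCHEME_RE.fullmatch(scheme))


for candidate in ("1:2", "a:b", "http:fred"):
    print(candidate, looks_like_uri(candidate))
```

A scheme cannot start with a digit, so ``1:2`` is rejected, while ``a:b``
and ``http:fred`` both pass the purely syntactic test.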
|
From: Ueli S. <u_s...@bl...> - 2001-09-22 07:25:55
|
[David]
> [Ueli]
> > Now, it seems to me that the structures of documents and sections
> > are close relatives:
> >
> > - A document may or may not have a title, a section always has one.
>
> It is actually intended that by the time the document tree gets to the
> writer, it must have a title. The parser can't always determine the
> title by itself, such as in PySource mode. The PySource reader is
> expected to supply all the titles as appropriate.
For PySource mode, this is certainly how it should work. However, I'm
not sure whether it is always the desired behaviour. I believe that
standalone rtxt files will often not have a formal document title,
just a few sections. The reader/linker can't make up a sensible guess
in this case, and IIRC one important goal is to not try to outsmart
the user. What would be a sensible default title, anyway? Which
leads me to believe that the document title should be left optional
(but see filename_). BTW, that's what dps/spec/gpdi.dtd says, too::
<!-- Optional elements may be generated by internal processing. -->
<!ELEMENT document
((title, subtitle?)?, (%bibliographic.elements;)*, abstract?,
%structure.model;)>
<!ATTLIST document %basic.atts;>
[David]
> Basically, yes, sections are simple sub-documents. The top-level
> document does need to be special-cased in the end however. HTML pages
> need their titles!
... and other types of output can merrily do without them. LaTeX
vs. HTML seems to be a good example here!
In LaTeX, I'd definitely use the document title as an argument to
``\title{...}`` and leave the ``\title{...}`` out if the document had
none. In HTML, though, <title>...</title> elements aren't displayed
(AFAIK, I'm not fluent in HTML). Not knowing what they're meant for,
I'd be perfectly comfortable with something generated in this case,
like the [filename]_ or something along the lines of "Document
generated by pydps" (which is what my pydps/html.py does).
.. [filename] The source filename isn't known to the writer, is it?
Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I
definitely don't want ``\title{filename}`` in LaTeX. How about
giving the title a "generated" attribute? Then it's left to the
writer to use (or ignore) it, but any document could be required to
have a title. (Which would mean to update the DTD.)
(BTW, my first idea was to add a "sourcefile=filename.rtxt"
attribute to the document. I like the "generated" much better,
though!)
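
[Editorial sketch] One way to read the "generated" proposal: the title
node carries a flag, and each writer decides for itself whether to honour
a title the author did not write. The ``Title`` class and both writer
functions are hypothetical stand-ins, not part of DPS.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Title:
    text: str
    generated: bool = False  # assumed flag: True if not written by the author


def html_title(title):
    # HTML wants a <title> regardless, so a generated one is acceptable.
    return "<title>%s</title>" % escape(title.text)


def latex_title(title):
    # LaTeX can do without a title; skip ones the author did not write.
    # (No LaTeX escaping is attempted in this sketch.)
    if title.generated:
        return ""
    return "\\title{%s}" % title.text


fallback = Title("filename.rtxt", generated=True)
print(html_title(fallback))
print(repr(latex_title(fallback)))
```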
Ueli
|
|
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-24 09:36:31
|
David Goodger wrote:
> [Tony]
> > First, I've uploaded a new pydps
>
> Great; I'll have to take a look at it. Where should it be installed on
> the source tree? (Perhaps a short README.txt file?)

At the moment, it doesn't need to be anywhere particular - it just
imports things from DPS and restructuredtext (easy if they've been
installed). I'll try to coin a README.txt file sometime soon, since
this thing begins to approach usability (hah, until I restructure it
under the new scheme!)

> [Tony]
> > 2. When processing the string module (yes, I know it isn't marked as
> >    containing reST texts!), the text::
> >
> >        [1:2]
> >
> >    (or similar) is incorrectly identified as a link. Not a
> >    Good Thing.
>
> ``1:2`` won't be a link (as your example confirms), but ``a:b`` will
> be. Why? Because according to RFC2396, 'a' could be a URI scheme (as
> in 'http' or 'mailto'). The solution? Use inline literals.

Hmm. Will `a:b` be treated as a URI? (I haven't tested it).

Is ``a:b`` *really* likely to be a sensible URI, given that ``a`` is
entirely "local"? Should we be treating with the whole possible gamut
of URIs, or restricting ourselves to those most likely?

And other exciting questions I shall try not to think too hard about
for the moment...

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
|
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-24 09:30:20
|
Ueli Schläpfer gave a good explanation of why David Goodger had chosen
to do something I didn't understand - namely make a document with a
title at the start have no section within it (except it was subtler than
that).
By the way, I did check the documentation, and it seemed to me that the
current documentation indicated that a title would cause a section to be
started - so David, if you want to perform this promotion, then it needs
to be documented (unless I've missed it *again*).
On the other hand, I don't actually see why the DPS system should have
to do the promotion for the user, when it's not clear that it is always
wanted (in other words, why is it up to the DPS system to decide that
the case of a single section is special, and then have a sudden
disjunction in its behaviour for two sections - that's my inner pedant
objecting!).
Regardless, an immediate solution to resolve the docstring case (and
possibly a useful thing to do anyway) would be to have an argument to
the Parser that states upfront that we are working on a document
*fragment* - that is, something that is going to be "stitched in" by
hand, later on, to an existing DPS tree. I would imagine that we may
come across other actions in the future that are sensible in the context
of a full document, but not in the case of a fragment.
As to HTML. The normal convention in HTML is that the document title
(that is, the thing in <title> ) be the same as the <h1> title, and that
there be one (and only one) <h1> in a document. The <title> is then
displayed on the window as decoration (e.g., as the text on the top of
the browser window). This is a strong convention, but is (of course) not
part of the standard.
A relevant cutting from HTML4 is:
Every HTML document must have a TITLE element in the HEAD
section.
Authors should use the TITLE element to identify the contents
of a document. Since users often consult documents out of
context, authors should provide context-rich titles. Thus,
instead of a title such as "Introduction", which doesn't
provide much contextual background, authors should supply
a title such as "Introduction to Medieval Bee-Keeping" instead.
For reasons of accessibility, user agents must always make the
content of the TITLE element available to users (including
TITLE elements that occur in frames). The mechanism for doing
so depends on the user agent (e.g., as a caption, spoken).
Broadly, HTML common practice treats a document as having a single title
at the top, which is used for both <title> and <h1>, and the "section
hierarchy" (if any) starts with an <h2>.
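
[Editorial sketch] The convention just described (one title used for both
<title> and <h1>, sections starting at <h2>) might look like this in a
writer. The function name and the ``(title, paragraphs)`` section shape
are assumptions made for the example, not pydps code.

```python
from html import escape


def render_html(title, sections):
    """Render a titled document as HTML.

    `sections` is a list of (section_title, [paragraph, ...]) pairs.
    """
    safe = escape(title)
    lines = [
        "<html><head><title>%s</title></head>" % safe,
        "<body>",
        "<h1>%s</h1>" % safe,  # same text as <title>, and the only <h1>
    ]
    for section_title, paragraphs in sections:
        # The section hierarchy starts one level down, at <h2>.
        lines.append("<h2>%s</h2>" % escape(section_title))
        for paragraph in paragraphs:
            lines.append("<p>%s</p>" % escape(paragraph))
    lines.append("</body></html>")
    return "\n".join(lines)


print(render_html("Document title",
                  [("First section", ["The first paragraph"])]))
```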
That makes HTML easy - it doesn't, of course, address any of Ueli's
other points.
Maybe (horrors) we should reserve one specific markup form to mean
"overall title"::
==============
Document title
==============
===================
This is not allowed
===================
(because it is a form reserved for the document
title, of which we may only have one).
Or perhaps we'll have to resort to::
:Title: Document title
Somehow, I don't see David liking either of those...
Anyway, to some specific comments:
Ueli wrote:
> This makes sense to me, so I consider it a feature. I'm actually not
> sure how I'd be able to give a title to the document as a whole if the
> parser worked as you expected (unless you special-cased the first
> section level!) (Explicitly discriminating between a document title
> and regular section titles doesn't count here.)
Hmm. Having concentrated on the HTML case (sorry, it's what I've been
working on) I hadn't seen the distinction, of course.
My problem is that I'm trying to write formatters for *any* document
that might come in (yes, I know I'm writing pydps/pysource, but I want
the Writer to work for any document), so we have to be able to cope
with:
1. Document with no titles at all
2. Document with one title (OK - David does that)
3. Document with more than one title (at the same level)
- which in essence *really* resolves back to case 1.
I'm afraid that the only "perfect" solution I can see for that (in the
sense of *predictable*) is to require the user to indicate that they
*do* have a document title, and that it is *this* thing, here. That then
makes them aware of the problem, also, which I think is a necessary
thing (otherwise, surprise will eventuate).
Ueli wrote:
> Now, it seems to me that the structures of documents and sections are
> close relatives:
>
> - A document may or may not have a title, a section always has one.
> - A document may have a subtitle, bibliographic elements, and an
> abstract. A section has none of these.
> - The rest of the content follows the same model.
>
> Can thus sections be treated as simpler cases of documents (instead of
> the other way round, which is how I understand your post)? I'm not
> sure how I would exploit this, though...
I'm not sure either (nor if one should), but your analysis is clearly
correct. Again, I've been too focused on the HTML case.
David Goodger wrote:
> It is actually intended that by the time the document tree
> gets to the writer, it must have a title. The parser can't
> always determine the title by itself, such as in PySource
> mode. The PySource reader is expected to supply all the
> titles as appropriate.
Hmm. In PySource mode, the parser should not be trying to introduce
titles - it is, after all, handling arbitrary document fragments, and
can't know anything about their global scope (unless it is told!).
*If* the final tree shall always have a title, where does it come from
if the document author didn't provide one? Surely in that case it is not
up to the *parser* to decide on what a title should be - that is up to
the application. So one has three options:
1. The parser makes one up (yuck)
2. The application makes one up (yuck)
3. An error is generated (yuck)
I'd vote for 3, not least for ease of explanation (I cite "there are no
complex rules about making up document titles" versus "well, if the
document is one section, we'll use the section title as a title, and not
have any sections, but if the document is zero sections we'll do XXX and
if the document is two or more sections we'll do YYY" - sorry to harp on
the point!).
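
[Editorial sketch] Option 3 could be as small as the check below. The
exception name and the dict standing in for a document node are
hypothetical; the point is only that the check reports the problem and
leaves any policy to the caller.

```python
class MissingTitleError(Exception):
    """Raised when a finished document tree has no title (hypothetical)."""


def require_title(document):
    # `document` is a plain dict standing in for a DPS document node.
    title = document.get("title")
    if not title:
        raise MissingTitleError("document has no title")
    return title


# No rules about inventing titles live in the check itself; what to do
# about a missing title is decided by whoever calls it.
doc = {"title": None, "children": []}
try:
    title = require_title(doc)
except MissingTitleError:
    title = "Untitled"  # caller's choice, picked here only for illustration

print(title)
```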
Ueli seems, to me, to be partially arguing for the case I want in his
next message. He also writes:
> The source filename isn't known to the writer, is it?
Well, it is in the pysource/pydps case (and I don't see why it shouldn't
be elsewhere) - I attach a "filename" attribute to the Package and
Module elements (which later on will become appropriate DPS structures,
of course).
> Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I
> definitely don't want ``\title{filename}`` in LaTeX. How about
> giving the title a "generated" attribute? Then it's left to the
> writer to use (or ignore) it, but any document could be required to
> have a title. (Which would mean to update the DTD.)
*If* David still really wants to produce a title of his own, then yes,
that's a good distinction to make.
> (BTW, my first idea was to add a "sourcefile=filename.rtxt"
> attribute to the document. I like the "generated" much better,
> though!)
I think that the sourcefile as an optional attribute on the document is
probably a useful thing, as well.
Anyway, this is difficult stuff (in terms of folding it in to something
easy to use and remember, not in terms of implementing!), so I await
David's responses with interest.
Tibs
--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|
|
From: Ueli S. <u_s...@bl...> - 2001-09-25 17:47:00
|
(Warning: Some of what follows may not be sorted out well enough -- I
don't have the time to write and rewrite as much as I
should, but I don't want to miss the opportunity to add my 2
cents worth before going off line for a couple days...)
[Tony]
> As to HTML. The normal convention in HTML is that the document title
> (that is, the thing in <title> ) be the same as the <h1> title, and that
> there be one (and only one) <h1> in a document. The <title> is then
[...]
> at the top, which is used for both <title> and <h1>, and the "section
> hierarchy" (if any) starts with an <h2>.
Thanks for the explanations! I don't see how it would make HTML
easier, though, as the final output is
[...]
>
> My problem is that I'm trying to write formatters for *any* document
> that might come in (yes, I know I'm writing pydps/pysource, but I want
> the Writer to work for any document), so we have to be able to cope
> with:
>
> 1. Document with no titles at all
> 2. Document with one title (OK - David does that)
> 3. Document with more than one title (at the same level)
> - which in essence *really* resolves back to case 1.
>
> I'm afraid that the only "perfect" solution I can see for that (in the
> sense of *predictable*) is to require the user to indicate that they
> *do* have a document title, and that it is *this* thing, here. That then
> makes them aware of the problem, also, which I think is a necessary
> thing (otherwise, surprise will eventuate).
>
[...]
> Hmm. In PySource mode, the parser should not be trying to introduce
> titles - it is, after all, handling arbitrary document fragments, and
> can't know anything about their global scope (unless it is told!).
>
> *If* the final tree shall always have a title, where does it come from
> if the document author didn't provide one? Surely in that case it is not
> up to the *parser* to decide on what a title should be - that is up to
> the application. So one has three options:
>
> 1. The parser makes one up (yuck)
> 2. The application makes one up (yuck)
> 3. An error is generated (yuck)
>
> I'd vote for 3, not least for ease of explanation (I cite "there are no
> complex rules about making up document titles" versus "well, if the
> document is one section, we'll use the section title as a title, and not
> have any sections, but if the document is zero sections we'll do XXX and
> if the document is two or more sections we'll do YYY" - sorry to harp on
> the point!).
You've a point here... Still, I believe the promotion is a Good Thing
at least for standalone documents. Let me try to reformulate:
The first title becomes the document title if
- it's at the very beginning of the document and
- it's the only title at this level.
Does that sound too complicated? Once I realized how it worked, I
went "oh, cool!" and wouldn't miss this feature now.
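
[Editorial sketch] The two conditions above can be written down in a few
lines. This is an illustration over plain tuples, not the DPS
implementation; ``promote_title`` and the node shapes are assumptions.

```python
def promote_title(children):
    """Apply the promotion rule to a document's top-level nodes.

    A section is ("section", title, body); any other node is, for
    example, ("paragraph", text). Returns (document_title, children).
    """
    sections = [node for node in children if node[0] == "section"]
    starts_with_title = bool(children) and children[0][0] == "section"
    if starts_with_title and len(sections) == 1:
        _, title, body = children[0]
        # The lone section dissolves: its title becomes the document
        # title and its body becomes the document body.
        return title, list(body) + list(children[1:])
    return None, children


one = [("section", "Document title", [("paragraph", "The first paragraph")])]
two = [("section", "First section", []), ("section", "Second section", [])]
late = [("paragraph", "Where's the document title?"),
        ("section", "First section", [])]

for doc in (one, two, late):
    print(promote_title(doc)[0])
```

Only the first of the three example documents ends up with a document
title; the other two are returned unchanged.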
> Ueli seems, to me, to be partially arguing for the case I want in his
> next message. He also writes:
I don't have a firm point of view yet, and I second your point that
it's not a trivial issue.
My current opinion is that it really depends on what input we're
processing -- what is good for docstring processing may be wrong for
standalone documents and vice versa. Hence I believe that the
promotion of a lone first section title should be optional.
Similarly, specifying and enforcing whether a document needs a title
or not should be left to the writer, as this depends on the output
format and possibly on the context the system is used in, and the
application, as it may also depend on context entirely *outside* the
DPS. Which obviously opens up a whole new discussion, namely on how
to pass such context information down to the DPS...
> > The source filename isn't known to the writer, is it?
>
> Well, it is in the pysource/pydps case (and I don't see why it shouldn't
> be elsewhere) - I attach a "filename" attribute to the Package and
> Module elements (which later on will become appropriate DPS structures,
> of course).
>
> > Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I
> > definitely don't want ``\title{filename}`` in LaTeX. How about
> > giving the title a "generated" attribute? Then it's left to the
> > writer to use (or ignore) it, but any document could be required to
> > have a title. (Which would mean to update the DTD.)
>
> *If* David still really wants to produce a title of his own, then yes,
> that's a good distinction to make.
It's not enough, though -- the writer still needs to be told by the
application whether it may use it or not. Again, a docstring
vs. standalone-document issue (the document title for a docstring
will probably be generated by the reader, but should be respected).
But I really have to go now... Hope I was clear enough!
Ueli
|
|
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-26 09:54:27
|
I started this to reply to Ueli, and then realised that my response was
essentially just:
    I think that David's approach (as I understand it)
    of providing optional methods to produce the effect
    wanted is probably the best way forward.
but that I wanted to diverge sideways...
I'm aiming to refactor pydps (bouncy fun - and maybe produce pysource
out of it, who knows) to follow David's Reader/Transformer/Writer model
in the next week or so (I've not got very far yet, though, apart from
thinking about it lots).
I *actually* expect to have a flowline that's something like (sorry, no
ASCII art):
1. Reader

   - uses compiler [1]_ to extract information
     from the Python source code
   - calls the reST parser on any docstrings
     that need it
   - combs_ any of the resultant docstrings

2. Transformer

   - (actually, the last two steps from the Reader
     might arguably go here - we need to do the
     docstring parsing after we *know* we've found
     any ``__docformat__`` value, after all)
   - produces a DPS tree from the Python information
   - combs_ it as required [2]_

3. Writer

   - outputs the DPS tree as (in the first instance)
     HTML
Since the HTML Writer would be handling a pure DPS tree, it would then
be a good candidate for moving out of pysource into the main docutils
tree (if we have one by then). I rather hope that at least the
FootnoteComb would also be such a candidate.
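For what it's worth, the flowline above can be sketched roughly as
follows. This is only an illustration under loud assumptions: it uses
the modern standard-library ``ast`` module in place of the compiler
module mentioned above, and ``Node``, ``read``, ``transform`` and
``write_html`` are hypothetical names, not pydps or DPS code. It also
skips the reST parsing and combing steps entirely.

```python
import ast
import html
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a DPS tree node (not the real DPS class)."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def read(source):
    """Reader: collect (name, docstring) pairs from Python source text."""
    module = ast.parse(source)
    found = [("module", ast.get_docstring(module) or "")]
    for item in module.body:
        if isinstance(item, ast.FunctionDef):
            found.append((item.name, ast.get_docstring(item) or ""))
    return found


def transform(found):
    """Transformer: build a tree with one section per documented object."""
    root = Node("document")
    for name, doc in found:
        section = Node(
            "section",
            children=[Node("title", name), Node("paragraph", doc)],
        )
        root.children.append(section)
    return root


# Assumed mapping from tree tags to HTML tags, chosen for illustration only.
HTML_TAGS = {"document": "body", "section": "div", "title": "h2", "paragraph": "p"}


def write_html(node):
    """Writer: serialise the tree as very naive HTML."""
    tag = HTML_TAGS[node.tag]
    inner = html.escape(node.text) + "".join(write_html(c) for c in node.children)
    return f"<{tag}>{inner}</{tag}>"


SOURCE = '"""Demo module."""\n\ndef greet():\n    """Say <hello>."""\n'
print(write_html(transform(read(SOURCE))))
```

The point of the sketch is only that the Writer sees nothing but the
tree, which is what would let it move out of pysource later.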
Combs
-----
The metaphor of combing through hair to remove tangles is a bit iffy,
but I like the term (I think David calls them Filters, which is less
obvious to me). I'm expecting to have a series of combs which can be run
on a DPS tree or subtree. Obvious ones are:
TitleComb
This does what David wants to produce a title.
I suspect that it raises an exception if it
can't do so, at which point it is up to the
caller to do something sensible (either ignore
the problem, or provide a default title).
FootnoteComb
This runs over a subtree (clearly for Python code,
we don't want to run it over anything bigger than
a docstring, lest we confuse footnotes!) and sorts
out the numbering. (In my development version of
pydps, not on the web yet, this is actually done
as part of the HTML output phase, which is clearly
the Wrong Place for it.)
David - a question or two on this. Each autonumbered
footnote/footnote reference has the attribute 'auto'
set to "1". I want to *insert* actual footnote numbers
into the tree. I can just add a new attribute 'auto-number'
into elements as required, *or* I could ask that you set
'auto' to be "-1" for the "no number yet" case, and use
'auto' to store the *actual* number calculated. Which to
do is a style issue, so I'd prefer to leave it up to you
(but using the same attribute would make things a bit
neater in the code - I'm not sure if it would generate
as elegant XML, though - I'd need to look up the detailed
attribute present/absent rules).
ContentsComb
This runs over the entire tree, and locates <section>
elements. It produces a <contents> subtree, which can
be inserted at the appropriate place, with links to
the <section>s. It needs to make sure that the links
it uses are *real*, so ideally it will use the "implicit"
link for a section when it exists, and it will have to
invent one when the implicit link isn't there (presumably
because the section is the twelfth "Introduction" in the
document...).
LinksComb
This handles the indirect hyperlinks. It probably comes
in two phases, because in a Python context we need to
*resolve* them on a per-docstring basis, but if the
user is trying to do the callout form of presentation,
they would then want to group them all at the end of the
document.
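To make the FootnoteComb question above a little more concrete, here
is a minimal sketch of the "separate attribute" option. It is an
assumption-laden illustration, not DPS code: the tree is a plain
``xml.etree.ElementTree`` tree, the element names ``footnote`` and
``footnote_reference`` and the function ``number_footnotes`` are
hypothetical, and it only handles anonymous autonumbered footnotes,
pairing the n-th reference with the n-th footnote.

```python
import xml.etree.ElementTree as ET


def number_footnotes(subtree):
    """Add an 'auto-number' attribute to every element with auto="1".

    Footnotes and footnote references are counted separately, in
    document order, so the n-th reference gets the same number as the
    n-th footnote. The 'auto' attribute itself is left untouched.
    """
    counters = {"footnote": 0, "footnote_reference": 0}
    for element in subtree.iter():
        if element.tag in counters and element.get("auto") == "1":
            counters[element.tag] += 1
            element.set("auto-number", str(counters[element.tag]))
    return subtree


# A stand-in for the tree of a single docstring (hypothetical markup).
docstring_tree = ET.fromstring(
    "<document>"
    "<paragraph>See <footnote_reference auto='1'/> and "
    "<footnote_reference auto='1'/>.</paragraph>"
    "<footnote auto='1'>First note.</footnote>"
    "<footnote auto='1'>Second note.</footnote>"
    "</document>"
)
number_footnotes(docstring_tree)
```

Running it once per docstring subtree, rather than over the whole
tree, keeps the numbering from leaking between docstrings. Reusing
'auto' for the calculated number would only change the ``set`` call.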
.. [1] I note that in the CVS for Python, the compiler module
is now in the standard library - hurrah! And it also looks
like the bug that stopped the 1.2aX compiler module from
*working* has also been fixed (just as I was about to find
out how to submit a bug report - even more hurrah!)
.. [2] does that count as verbing a noun?
Tibs
(So much to do, so little etc. And I've just been given news that I'm to
start on a *really exciting* project for paid work, as well.)
--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
|