Document titles (was RE: [Docstring-develop] DPS - possible bugs/features)
From: Tony J I. (Tibs) <to...@ls...> - 2001-09-24 09:30:20
Ueli Schläpfer gave a good explanation of why David Goodger had chosen to do something I didn't understand - namely make a document with a title at the start have no section within it (except it was subtler than that).

By the way, I did check the documentation, and it seemed to me that the current documentation indicated that a title would cause a section to be started - so David, if you want to perform this promotion, then it needs to be documented (unless I've missed it *again*).

On the other hand, I don't actually see why the DPS system should have to do the promotion for the user, when it's not clear that it is always wanted (in other words, why is it up to the DPS system to decide that the case of a single section is special, and then have a sudden disjunction in its behaviour for two sections - that's my inner pedant objecting!).

Regardless, an immediate solution to resolve the docstring case (and possibly a useful thing to do anyway) would be to have an argument to the Parser that states upfront that we are working on a document *fragment* - that is, something that is going to be "stitched in" by hand, later on, to an existing DPS tree. I would imagine that we may come across other actions in the future that are sensible in the context of a full document, but not in the case of a fragment.

As to HTML. The normal convention in HTML is that the document title (that is, the thing in <title>) be the same as the <h1> title, and that there be one (and only one) <h1> in a document. The <title> is then displayed on the window as decoration (e.g., as the text on the top of the browser window). This is a strong convention, but is (of course) not part of the standard.

A relevant cutting from HTML4 is:

    Every HTML document must have a TITLE element in the HEAD section.

    Authors should use the TITLE element to identify the contents of a document. Since users often consult documents out of context, authors should provide context-rich titles.
    Thus, instead of a title such as "Introduction", which doesn't provide much contextual background, authors should supply a title such as "Introduction to Medieval Bee-Keeping" instead.

    For reasons of accessibility, user agents must always make the content of the TITLE element available to users (including TITLE elements that occur in frames). The mechanism for doing so depends on the user agent (e.g., as a caption, spoken).

Broadly, HTML common practice treats a document as having a single title at the top, which is used for both <title> and <h1>, and the "section hierarchy" (if any) starts with an <h2>. That makes HTML easy - it doesn't, of course, address any of Ueli's other points.

Maybe (horrors) we should reserve one specific markup form to mean "overall title"::

    ==============
    Document title
    ==============

    ===================
    This is not allowed
    ===================

(because it is a form reserved for the document title, of which we may only have one). Or perhaps we'll have to resort to::

    :Title: Document title

Somehow, I don't see David liking either of those...

Anyway, to some specific comments:

Ueli wrote:
> This makes sense to me, so I consider it a feature. I'm actually not
> sure how I'd be able to give a title to the document as a whole if the
> parser worked as you expected (unless you special-cased the first
> section level!) (Explicitly discriminating between a document title
> and regular section titles doesn't count here.)

Hmm. Having concentrated on the HTML case (sorry, it's what I've been working on) I hadn't seen the distinction, of course.

My problem is that I'm trying to write formatters for *any* document that might come in (yes, I know I'm writing pydps/pysource, but I want the Writer to work for any document), so we have to be able to cope with:

1. Document with no titles at all
2. Document with one title (OK - David does that)
3. Document with more than one title (at the same level) - which in essence *really* resolves back to case 1.
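
To make the three cases above concrete, here is a minimal sketch of how a Writer might tell them apart and apply the HTML convention (one title used for both <title> and <h1>, sections starting at <h2>). Everything in it is an assumption for illustration: the ``Section`` class, ``classify``, ``render_nodes`` and ``render_html`` are hypothetical names, not part of DPS, and the tree is just a list of strings (paragraphs) and sections. Refusing to guess when no single title exists is only one possible policy.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Optional, Union


@dataclass
class Section:
    """Hypothetical stand-in for a titled section node (not a DPS class)."""
    title: str
    children: List[Union["Section", str]] = field(default_factory=list)


Node = Union[Section, str]  # a bare str stands for a paragraph


def classify(body: List[Node]) -> str:
    """Say which of the three cases a list of top-level nodes falls into."""
    count = sum(isinstance(node, Section) for node in body)
    if count == 0:
        return "no-title"
    if count == 1:
        return "one-title"
    return "many-titles"


def render_nodes(nodes: List[Node], level: int) -> List[str]:
    """Render paragraphs and sections, with section titles at <h{level}>."""
    out = []
    for node in nodes:
        if isinstance(node, Section):
            h = min(level, 6)  # HTML has no heading below <h6>
            out.append(f"<h{h}>{escape(node.title)}</h{h}>")
            out.extend(render_nodes(node.children, level + 1))
        else:
            out.append(f"<p>{escape(node)}</p>")
    return out


def render_html(body: List[Node], title: Optional[str] = None) -> str:
    """Render a document; pass `title` when none can be inferred."""
    if title is None:
        case = classify(body)
        if case != "one-title":
            # Assumed policy: do not make a title up, report the problem.
            raise ValueError(f"cannot infer a document title ({case})")
        (only,) = [node for node in body if isinstance(node, Section)]
        title = only.title
        # Promote the lone section: its children replace it at top level.
        spliced: List[Node] = []
        for node in body:
            if node is only:
                spliced.extend(only.children)
            else:
                spliced.append(node)
        body = spliced
    lines = [
        f"<html><head><title>{escape(title)}</title></head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
        *render_nodes(body, level=2),
        "</body></html>",
    ]
    return "\n".join(lines)


doc = [Section("Guide", ["Intro.", Section("Usage", ["Run it."])])]
print(render_html(doc))
```

With an explicit ``title`` argument nothing is promoted, so cases 1 and 3 both render with every section starting at <h2>, which is the sense in which case 3 "resolves back to case 1".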

I'm afraid that the only "perfect" solution I can see for that (in the sense of *predictable*) is to require the user to indicate that they *do* have a document title, and that it is *this* thing, here. That then makes them aware of the problem, also, which I think is a necessary thing (otherwise, surprise will eventuate).

Ueli wrote:
> Now, it seems to me that the structures of documents and sections are
> close relatives:
>
> - A document may or may not have a title, a section always has one.
> - A document may have a subtitle, bibliographic elements, and an
>   abstract. A section has none of these.
> - The rest of the content follows the same model.
>
> Can thus sections be treated as simpler cases of documents (instead of
> the other way round, which is how I understand your post)? I'm not
> sure how I would exploit this, though...

I'm not sure either (nor if one should), but your analysis is clearly correct. Again, I've been too focused on the HTML case.

David Goodger wrote:
> It is actually intended that by the time the document tree
> gets to the writer, it must have a title. The parser can't
> always determine the title by itself, such as in PySource
> mode. The PySource reader is expected to supply all the
> titles as appropriate.

Hmm. In PySource mode, the parser should not be trying to introduce titles - it is, after all, handling arbitrary document fragments, and can't know anything about their global scope (unless it is told!).

*If* the final tree shall always have a title, where does it come from if the document author didn't provide one? Surely in that case it is not up to the *parser* to decide on what a title should be - that is up to the application. So one has three options:

1. The parser makes one up (yuck)
2. The application makes one up (yuck)
3. An error is generated (yuck)

I'd vote for 3, not least for ease of explanation (I cite "there are no complex rules about making up document titles" versus "well, if the document is one section, we'll use the section title as a title, and not have any sections, but if the document is zero sections we'll do XXX and if the document is two or more sections we'll do YYY" - sorry to harp on the point!).

Ueli seems, to me, to be partially arguing for the case I want in his next message. He also writes:

> The source filename isn't known to the writer, is it?

Well, it is in the pysource/pydps case (and I don't see why it shouldn't be elsewhere) - I attach a "filename" attribute to the Package and Module elements (which later on will become appropriate DPS structures, of course).

> Still, say I want ``<title>filename.rtxt</title>`` in HTML, but I
> definitely don't want ``\title{filename}`` in LaTeX. How about
> giving the title a "generated" attribute? Then it's left to the
> writer to use (or ignore) it, but any document could be required to
> have a title. (Which would mean to update the DTD.)

*If* David still really wants to produce a title of his own, then yes, that's a good distinction to make.

> (BTW, my first idea was to add a "sourcefile=filename.rtxt"
> attribute to the document. I like the "generated" much better,
> though!)

I think that the sourcefile as an optional attribute on the document is probably a useful thing, as well.

Anyway, this is difficult stuff (in terms of folding it in to something easy to use and remember, not in terms of implementing!), so I await David's responses with interest.

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)