On 2007/05/09, at 00:17, Lea Wiemann wrote:
> Since you mentioned multiple PDF files, I'd be curious if
> you currently have an actual use case for multiple output files in any
> format other than HTML, or if that's just something you were
> expecting/hoping to fall out for free.
In the real world, I work as the director of a small team of
technical writers documenting a modular software product that is over
10 years old.
In such a long-lived documentation effort, you accumulate a lot of
content over time, and you develop different strategies for delivering
that content to different people.
I have struggled with the following issues over the years:
1) There is no simple way to just divide that content into a set of
books. What works today will not work tomorrow or two years from now.
2) Different user profiles may require different books, often sharing
some of the content. Either you plan for maintenance, or you will be
killed by maintenance later on.
3) Different output formats work better with different amounts of
information. For example, support prefers a large help file with
everything, but printable books work better with 100 pages or less.
4) The different help files and books still need links between them.
For example, links should just work in the "everything included"
help, but something sensible should happen between books as well.
In summary, your "books":
- Are always part of something larger
- Will evolve over time
When you add the requirement of having flexible output capabilities
in multiple formats, you quickly stumble on the following issues:
- Having the ability to publish a subset of a "book" in a sensible way
- Having the ability to publish several "books" as an integrated
output (for a large help or for a large web site)
- Having sensible links between contents that work across the different
output formats and different partitions. For example, a hyperlink in
HTML, but a link with a page reference in PDF (possibly to a different
document!)
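As a rough illustration of that last point, here is a minimal Python
sketch of a cross-reference that renders differently per output format.
Everything in it (the Target record, render_link, the "admin-guide" and
"user-guide" names, the page numbers) is hypothetical and made up for
the example; it is not how docutils or any existing tool does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A link target somewhere in the documentation set (hypothetical model)."""
    doc: str      # document ("book") that contains the target
    anchor: str   # identifier of the target inside that document
    title: str    # human-readable title of the target
    page: int     # page number in the printable output


def render_link(target: Target, current_doc: str, fmt: str) -> str:
    """Render one cross-reference for the given output format."""
    if fmt == "html":
        # Same document: fragment-only link. Other document: file plus fragment.
        prefix = "" if target.doc == current_doc else f"{target.doc}.html"
        return f'<a href="{prefix}#{target.anchor}">{target.title}</a>'
    if fmt == "pdf":
        # Printable output cannot rely on hyperlinks, so cite the page,
        # and name the other book when the target lives elsewhere.
        if target.doc == current_doc:
            return f"{target.title} (page {target.page})"
        return f'{target.title} (page {target.page} of "{target.doc}")'
    raise ValueError(f"unknown output format: {fmt}")


install = Target(doc="admin-guide", anchor="install", title="Installation", page=12)
print(render_link(install, "admin-guide", "html"))
# <a href="#install">Installation</a>
print(render_link(install, "user-guide", "pdf"))
# Installation (page 12 of "admin-guide")
```

The point is only that the source carries one logical link, and the
writer for each format decides what "sensible" means for it.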
Right now docutils fails in a very basic way: unless you resort to
"raw" hacks, the images and the links do not adapt to the output
format at all.
I used a custom build system based on XML and heavily inspired by the
old linuxdoc that mostly solved these issues:
- Images were specified without extension. When publishing, each
output format would try different extensions in turn. This allowed
the HTML output to prefer .gif while the printable output
preferred .wmf over .gif et al.
- Documents in a directory were considered peers. Assuming that
documents were output to another directory, links between them worked
in whatever output format. Links had the form "id@...", and worked in
HTML, PDF, and Windows 95 help (PDF and Help used M$ Word as an
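The extension-less image scheme is easy to sketch in Python. This is
not the original build system (which was XML based), just my guess at
the smallest thing that behaves the same way. The function name
resolve_image and the exact preference lists are assumptions; only
".gif for HTML" and ".wmf before .gif for print" come from the
description above.

```python
from pathlib import Path
from typing import Optional
import tempfile

# Preferred extensions per output format, tried in order.
# ".png" for HTML is an assumption added for illustration.
PREFERRED_EXTENSIONS = {
    "html": [".gif", ".png"],
    "print": [".wmf", ".gif"],
}


def resolve_image(basename: str, fmt: str, image_dir: Path) -> Optional[Path]:
    """Return the first existing image file for this output format, or None."""
    for ext in PREFERRED_EXTENSIONS[fmt]:
        candidate = image_dir / (basename + ext)
        if candidate.is_file():
            return candidate
    return None


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp)
    (images / "screen.gif").touch()
    (images / "screen.wmf").touch()
    print(resolve_image("screen", "html", images).name)   # screen.gif
    print(resolve_image("screen", "print", images).name)  # screen.wmf
```

The document source only ever says "screen"; which file is actually
used is decided at publishing time.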
We moved to DITA after outgrowing the custom build system. DITA uses
"topics" instead of documents as a base. DITA has obvious advantages
in a multi-writer environment (shorter pieces mean less fighting to
edit the same piece), but it also lacks any notion of a "documentation
set".
> Lea Wiemann's project at the Google Summer of Code:
> Time Line
> present - May 27
> Have some preliminary discussion on the Docutils mailing list
> about how each step should be implemented. I expect that much
> design discussion will still take place during the implementation
> phase as issues arise.
Sorry for replying so late... the real world got to me, and I
actually missed this part of your message until today :-(
> May 28 - June 10
> Add support for multiple input documents. This may involve adding
> a new format for a top-level "master" document which references
> all files in the documentation.
As I see it, you cannot have sensible links between different
documents unless you have some notion of a "documentation set". Such a
set allows you to generate a book that is part of a set of books, and
thus make links work across documents.
The master document is beautifully implemented in DITA, where the
similar concept of "map" is basically a tree of references to
included topics. Maps also feature clever ways to add "related links"
An important part of the DITA map concept is that you can have maps of
maps (useful to tame complexity) as well as alternative maps.
That is, a single topic can belong to different maps, and thus you
can easily organize the same contents in small and large maps as
needed, while varying the order and nesting of the topics.
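To make the "maps of maps" idea concrete, here is a minimal Python
sketch. It is a simplified model of my own, not DITA's actual schema:
the Topic and Map classes, the walk function, and the sample titles are
all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, Union


@dataclass
class Topic:
    """A single piece of content (simplified stand-in for a topic)."""
    title: str


@dataclass
class Map:
    """An ordered tree of references to topics or to other maps."""
    title: str
    children: List[Union["Topic", "Map"]] = field(default_factory=list)


def walk(node: Union[Topic, Map], depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (nesting depth, title) for every node, in reading order."""
    yield depth, node.title
    if isinstance(node, Map):
        for child in node.children:
            yield from walk(child, depth + 1)


install = Topic("Installing")
configure = Topic("Configuring")
troubleshoot = Topic("Troubleshooting")

# The same topic object is referenced from two different maps.
quick_start = Map("Quick Start", [install, configure])
admin_guide = Map("Admin Guide", [configure, troubleshoot])

# A map of maps: the "everything included" help.
complete_help = Map("Complete Help", [quick_start, admin_guide])

for depth, title in walk(complete_help):
    print("  " * depth + title)
```

Each small map can be published on its own as a book, while the
top-level map produces the large integrated output from the very same
topics.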
DITA is also planning to have a "chunking" attribute that aims at
decoupling the inputs from the outputs, notably for HTML output. The
idea is that you can have multiple input files generate a single HTML
file and vice versa.
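A minimal Python sketch of what such a decoupling could look like
follows. To be clear, the policy names ("single", "per-input",
"per-section"), the plan_outputs function, and the file names are all
invented for the example; they are not DITA attribute values.

```python
from typing import Dict, List, Tuple

Section = Tuple[str, str]  # (input file, section id)


def plan_outputs(inputs: Dict[str, List[str]], policy: str) -> Dict[str, List[Section]]:
    """Decide which sections of which input files end up in which HTML file."""
    plan: Dict[str, List[Section]] = {}
    for source, sections in inputs.items():
        for section in sections:
            if policy == "single":
                out = "index.html"                        # many inputs, one output
            elif policy == "per-input":
                out = source.rsplit(".", 1)[0] + ".html"  # one output per input
            elif policy == "per-section":
                out = section + ".html"                   # one input, many outputs
            else:
                raise ValueError(f"unknown chunking policy: {policy}")
            plan.setdefault(out, []).append((source, section))
    return plan


inputs = {
    "install.txt": ["requirements", "setup"],
    "usage.txt": ["basics"],
}
for policy in ("single", "per-input", "per-section"):
    print(policy, sorted(plan_outputs(inputs, policy)))
```

The input files stay the same in all three cases; only the plan for the
output changes.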