Looking at your examples and the OpenOffice.org XML DTD and
specification docs [*]_, I see that OpenOffice XML requires footnotes
themselves to be embedded inside the paragraph at the point of
reference, as do DocBook and, I believe, TeX. This makes sense for
processing (easier), but not for reading since the whole point of a
footnote is to remove the extra text from the main flow.
.. [*] Available from http://xml.openoffice.org/. DTD files are at
http://xml.openoffice.org/source/browse/xml/xmloff/dtd/ (text.mod
has the most significance here) and the specification is
http://xml.openoffice.org/xml-specification.pdf.
[Aahz]
>>> What I really ought to do is call for a walkabout on the footnote
>>> node, but I can't quite figure out how to do that.
[David]
>> I assume you're already *doing* a walkabout. Just let it continue.
>> You're short-circuiting the process artificially. The internal
>> document tree is well-formed: an end-tag (depart_tag) for every
>> start-tag (visit_tag), and all elements arranged as the DTD
>> describes. (If they're not, it's a bug.) Trust in the doctree.
[Aahz]
> Well, I suppose what I *really* ought to do is transform the doctree
> to replace the footnote_reference with the actual footnote node.
> Then everything would work magically (modulo the issue of the
> <text:footnote-body> tag).
I don't think you should transform the tree at this point, since
you're traversing the tree. It's like modifying a list while looping
over it: dangerous. Having seen the context, I now think your first
idea was correct, to force a traversal of the footnote when you reach
the first footnote reference. However, note that there may be
multiple references to the same footnote, so only the *first*
reference should have its footnote traversed; others should use the
<text:footnote-ref> element, I believe.
In general, doing a traversal on a subtree is simple. Say we want to
do a traversal starting at the "footnote" node::
    # Use our own class to get a clone of ourselves:
    visitor = self.__class__(self.document)
    # Traverse the subtree rooted at "footnote":
    footnote.walkabout(visitor)
    # Collect the results (assumes uniform treatment of output):
    self.body.extend(visitor.body)
But there's a tricky issue in the OpenOfficeTranslator class::
    def visit_footnote(self, node):
        raise nodes.SkipNode
This is fine for the outer traversal, but it clobbers the inner
traversal. Perhaps change this to::
    def visit_footnote(self, node):
        if self.handle_footnotes:
            ...  # handle footnotes
        else:
            raise nodes.SkipNode
How to set ``self.handle_footnotes``?
* It could be a parameter to the ``__init__`` method, but we'd still have
to watch for nested footnotes (a footnote reference within a
footnote body).
* Start out with ``self.handle_footnotes`` true, and set it false in
``visit_document``? If nested traversals need to be done for any
other elements, this could backfire.
* Check for an empty ``self.body``? That might be the simplest and
best way::
    def visit_footnote(self, node):
        if self.body:
            raise nodes.SkipNode
        else:
            ...  # handle footnotes
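To make the empty-``self.body`` idea concrete, here is a rough, self-contained sketch that does not use Docutils at all. ``Node``, ``SkipNode``, and ``Translator`` are hypothetical stand-ins that only mimic the shape of ``walkabout``, ``nodes.SkipNode``, and a translator's ``body`` list; the real classes differ in detail.

```python
class SkipNode(Exception):
    """Hypothetical stand-in for nodes.SkipNode."""


class Node:
    """Hypothetical stand-in for a doctree node (not the Docutils class)."""

    def __init__(self, tag, text="", children=()):
        self.tag = tag
        self.text = text
        self.children = list(children)

    def walkabout(self, visitor):
        try:
            visitor.visit(self)
        except SkipNode:
            return  # skip the children and the depart call
        for child in self.children:
            child.walkabout(visitor)
        visitor.depart(self)


class Translator:
    """Hypothetical visitor that collects output fragments in ``body``."""

    def __init__(self, footnotes):
        self.footnotes = footnotes  # maps footnote ID -> footnote Node
        self.body = []

    def visit(self, node):
        if node.tag == "footnote" and self.body:
            # Outer traversal: output already exists, so skip it here.
            raise SkipNode
        if node.tag == "footnote_reference":
            # A fresh clone starts with an empty body, so the footnote
            # is handled rather than skipped in the nested traversal.
            inner = self.__class__(self.footnotes)
            self.footnotes[node.text].walkabout(inner)
            self.body.extend(inner.body)
            raise SkipNode
        self.body.append("<%s>%s" % (node.tag, node.text))

    def depart(self, node):
        self.body.append("</%s>" % node.tag)


footnote = Node("footnote", children=[Node("p", "Note.")])
reference = Node("footnote_reference", "fn1")
document = Node("document", children=[Node("p", "Body", [reference]), footnote])

translator = Translator({"fn1": footnote})
document.walkabout(translator)
output = "".join(translator.body)
```

Note that the check only works if some output precedes every top-level footnote; a footnote visited while the outer ``body`` is still empty would be rendered in place.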
Update: the DTD says "text:footnote and text:endnote elements may not
contain other text:footnote or text:endnote elements". Docutils
documents *can* have nested footnotes though, which complicates
matters. Either a workaround has to be devised, or we place a
*documented* restriction on Docutils wrt the OpenOffice writer. The
latter would be acceptable for now.
Let's take a look at the ``visit_footnote_reference`` method::
    def visit_footnote_reference(self, node):
        name = node['refid']
Why are you calling this the "name"? I find it misleading, since
there are "name" attributes on many elements. I'd use "refid" or
"footnote_id" instead.
Continuing::
        id = node['id']
        number = node['auto']
        for footnote in self.document.autofootnotes:
            if name == footnote['name']:
                break
The ``if name == footnote['name']:`` test relies on an accident and
may not always work. IDs are derived from names, and simple names are
equal to their IDs, but more complicated names are not. For example,
the name "a name" turns into the ID "a-name". IDs can't have spaces
or anything apart from alphanumerics and "-" (see
docutils.nodes.make_id; 32 lines of docstring for a 3-line function!).
In any case, there's a much easier way to get the footnote node::
        footnote = self.document.ids[name]
(Although it shouldn't be "name" but "refid".)
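The difference between names and IDs can be shown with a small sketch. ``make_id_sketch`` below is a simplified, hypothetical approximation written for this example, not the actual ``docutils.nodes.make_id``, and the ``ids`` dictionary is only a stand-in for ``document.ids``.

```python
import re


def make_id_sketch(name):
    """Simplified, hypothetical approximation of an ID-normalizing function."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    ident = re.sub(r"[^a-z0-9]+", "-", name.lower())
    # Drop hyphens at either end.
    return ident.strip("-")


# Hypothetical stand-in for ``document.ids``: maps each ID to its node.
ids = {}
for name in ("note", "a name"):
    ids[make_id_sketch(name)] = {"name": name}

# A reference carries the ID, so a direct lookup finds either footnote.
found = ids["a-name"]

# Comparing IDs against names only matches the simple name.
matches = [node["name"] for node in ids.values() if node["name"] in ids]
```

Here the lookup by ID succeeds for both footnotes, while the name comparison finds only "note", which is the accident the loop above depends on.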
Since a footnote should only be rendered once, you should check if
it's already happened here. Something like::
        if hasattr(footnote, 'rendered'):
            self.body.append('<text:footnote-ref text:ref-name="%s"'
                             ' text:reference-format="text">'
                             % ???)
            ...
        else:                           # proceed as before
            ...
            footnote.rendered = 1
I don't know what should replace the "???" above. The OpenOffice XML
spec says that "Footnotes, endnotes, and sequences are assigned names
by the application used to create the OpenOffice.org XML file format
when the document is exported." However, I can't find a "name"
attribute on <text:footnote> elements or subelements in the DTD. Does
it mean the "id" attribute? You should verify with some actual
OpenOffice output.
The text of the footnote reference can be traversed normally, then the
end-tag inserted by ``depart_footnote_reference``. But since there's
a conditional here, there are two ways to proceed:
1. Store the end-tag on an internal stack (like the "context" stack of
the HTML writer's HTMLTranslator class), and pop it off in
``depart_footnote_reference``. This approach is recommended.
2. Process the entire <text:footnote-ref> tag in the ``visit_...``
method, using ``node.astext()`` to get the label text. Insert the
end-tag, and finish with a ``raise nodes.SkipNode``. I don't
recommend this, because it complicates processing (makes the flow
hard to understand with a special case; uniform is better) and it
will break if the contents of a footnote_reference element ever
get more complicated. This technique *cannot* be used on any
element with a content model more complicated than "(#PCDATA)", so
it's best not to use it at all.
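The first approach might look roughly like the sketch below. Everything in it is an assumption made for illustration: ``RefTranslator`` is a hypothetical class, the method signatures are simplified (real visitor methods take a node), and the ``<footnote>`` and ``<ref>`` tags are placeholders rather than actual OpenOffice.org XML.

```python
class RefTranslator:
    """Hypothetical visitor fragment; not the actual OpenOfficeTranslator."""

    def __init__(self):
        self.body = []
        self.context = []      # stack of pending end-tags
        self.rendered = set()  # IDs of footnotes already written out

    def visit_footnote_reference(self, refid, label):
        if refid in self.rendered:
            # Later reference: emit only a pointer to the existing footnote.
            self.body.append('<ref to="%s">' % refid)
            self.context.append('</ref>')
        else:
            # First reference: open the footnote itself.
            self.body.append('<footnote id="%s">' % refid)
            self.context.append('</footnote>')
            self.rendered.add(refid)
        self.body.append(label)

    def depart_footnote_reference(self):
        # Whatever the visit method pushed is the matching end-tag.
        self.body.append(self.context.pop())


translator = RefTranslator()
for _ in range(2):
    translator.visit_footnote_reference("id1", "1")
    translator.depart_footnote_reference()
output = "".join(translator.body)
```

The depart method stays unconditional: the branch taken in the visit method is recorded on the stack, so the two methods cannot fall out of step.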
Continuing with ``visit_footnote_reference``::
        self.body.append('<text:footnote text:id="%s">\n' % id)
        self.body.append('<text:footnote-citation text:string-value='
                         '"%s"/>\n' % number)
I don't see a "string-value" attribute in the DTD. I do see a "label"
attribute though. Either the DTD is wrong or out of date, or you have
the wrong attribute name. Also, I don't understand how you're using
the Docutils <footnote_reference> "auto" attribute here (in variable
"number").
Continuing::
        self.body.append('<text:footnote-body>\n')
        self.body.append(self.start_para % '.body')
        for child in footnote.children:
            if isinstance(child, nodes.paragraph):
                self.body.append(child.astext())
        self.body.append(self.end_para)
I'd replace most of the above with a nested tree traversal. Finally::
        self.body.append('</text:footnote-body>\n')
        self.body.append('</text:footnote>')
        raise nodes.SkipNode
--
David Goodger <goodger@...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/