Re: [Docutils-develop] Parsing oddness

Richard Jones wrote:
> I've got some problems in parsing a document. The following biblio
> element results in errors:
> 
> >>>>> snip
> :Organization: Computer and Information Sciences::
> 
>                  New Jersey Institute of Technology
>                  University Heights
>                  Newark, NJ, 07102
>                  (address obsolescent).
> <<<<< snip

I'll take the warnings in reverse order.

> Reporter: WARNING (2) Cannot extract compound bibliographic field
> "Organization".

This is by design of the document model.  The "organization" element
currently may only contain text, not body elements.  It was intended
to only contain the organization's name.  You present a good example
of why the status quo perhaps ought to change.  Requires thought.

However, a literal block isn't really the ideal way to represent an
address block, is it?  I've been mulling over an idea for a "verse"
directive which seems to apply here.  See
http://docutils.sf.net/spec/notes.html#body-verse [#]_. What do you
think?  How about that ';;' syntax?

.. [#] Just updated.  Especially note the examples; I love having
   *Monty Python's Flying Circus: Just The Words* and *The Fairly
   Incomplete & Rather Badly Illustrated Monty Python Songbook* on my
   reference shelf, and I refer to them frequently (can you tell?).
   Makes programming fun!  Thank Guido for Python!

> Reporter: WARNING (2) Literal block expected at line 2; none found.

This is a tricky one.  Because field list field names can be quite
long, the spec allows for arbitrary minimum indentation of the second
and subsequent lines [#]_, without inferring any significance.  In
other words, the address block is not seen as being indented at all,
but just as a second paragraph.  So as far as the parser is concerned,
there *is* no literal block to be found!

I think this is correct behavior, though perhaps the docs need some
clarification.  I invite counter-arguments to either of the preceding
statements.  ;-)

.. [#] See
   http://docutils.sf.net/spec/rst/reStructuredText.html#field-lists,
   end of third paragraph, and
   http://docutils.sf.net/spec/rst/reStructuredText.html#indentation,
   last paragraph before the literal block.

To rectify this problem, just rewrite that field list item as
follows::

    :Organization:
        Computer and Information Sciences::

            New Jersey Institute of Technology
            University Heights
            Newark, NJ, 07102
            (address obsolescent).
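
A rough way to picture the difference between the two forms is the
sketch below.  It is only a simplified model, not the parser's actual
code: the helper name ``field_body`` is made up for illustration, and
it assumes that the text after the field marker is taken as-is while
the smallest indentation found among the following non-blank lines is
stripped from all of them.

```python
def field_body(first_line_text, continuation_lines):
    """Simplified model of how a field body is dedented.

    Assumption: the text after the field marker is taken as-is, and the
    smallest indentation among the following non-blank lines is removed
    from all of them.  Illustration only, not Docutils' parser code.
    """
    indents = [len(line) - len(line.lstrip(" "))
               for line in continuation_lines if line.strip()]
    common = min(indents, default=0)
    body = [first_line_text.strip()]
    for line in continuation_lines:
        body.append(line[common:].rstrip() if line.strip() else "")
    return body


# Original form: the paragraph starts on the marker line.
original = field_body(
    "Computer and Information Sciences::",
    ["",
     "                 New Jersey Institute of Technology",
     "                 University Heights"],
)

# Rewritten form: the paragraph starts on the line after the marker.
rewritten = field_body(
    "",
    ["    Computer and Information Sciences::",
     "",
     "        New Jersey Institute of Technology",
     "        University Heights"],
)

for name, body in (("original", original), ("rewritten", rewritten)):
    print(name)
    for line in body:
        print("  |" + line)
```

In the original form the address lines end up flush with the
paragraph that ends in '::', so nothing is left to mark them as a
literal block.  In the rewritten form they keep four spaces of
indentation relative to that paragraph.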

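Now the indentation of the literal block *is* significant (it's
different from the indentation of the second line).  That takes care
of the "Literal block expected" warning, but the field body is still
compound, so the "Cannot extract" warning remains.  To get rid of
that one, you'll either have to wait for a change to the document
model (not guaranteed!), or rewrite the field list item, either as a
single paragraph, or something like this::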

    :Organization: Computer and Information Sciences,
                   New Jersey Institute of Technology [#]_

    .. [#]
       ::

           University Heights
           Newark, NJ, 07102
           (address obsolescent).

Note that the '::' has to be on the second line of the footnote, for
the same reason as given above for the field list item.  This behavior
may change, however (there's a to-do list entry that begins with "Fix
the parser's indentation handling" in
http://docutils.sf.net/spec/notes.html#restructuredtext-parser).

> I've hit another error in a definition list... note the line number
> the error is reported at, and the line that is actually erroneous
> (the last one).

I see it.  I'll see about fixing the error reporting.  It may take a
bit of thought though, so don't hold your breath.

As always, thanks for the feedback.  It will make its way into the
code and/or the docs, anon.

-- 
David Goodger  <go...@us...>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/