#303 epydoc fails to parse #: style doc strings

v3.0
open
Edward Loper
9
2009-02-20
2009-02-10
Ross Collins
No

I'm using the latest epydoc 3.0 on a system with Python 2.5.1 and a system with Python 2.6.1. Parsing the same code on both systems, epydoc works without any warnings on the Python 2.5.1 system but produces many warnings on the Python 2.6.1 system. The most worrying one appears when parsing source code with variable docstrings, e.g. (with line numbers):

90
91 #: X is very important.
92 x = 10

This produces the warning message:

Warning: Ignoring docstring comment block followed by a blank line in u'/myFile.py' on line 90

Where the line number refers to the blank line _above_ the docstring line.
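
For anyone reproducing this, one way to see what epydoc receives is to dump the token stream for the three lines above. The following is only a sketch using the standard `tokenize` module on Python 3 (an assumption; the report itself concerns Python 2.5.1 and 2.6.1), and `token_names` is a hypothetical helper, not part of epydoc:

```python
import io
import token
import tokenize

# The three lines from the report: a blank line, a '#:' comment, an assignment.
SOURCE = "\n#: X is very important.\nx = 10\n"

def token_names(source):
    """Return the names of the token types emitted for `source`."""
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]

if __name__ == "__main__":
    for name in token_names(SOURCE):
        print(name)
```

On a current Python 3 interpreter the comment line shows up as a COMMENT token followed by a separate NL token, so a parser that treats every NL as a blank line will conclude that the comment block is followed by one.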

Discussion

  • Ross Collins
    2009-02-20

    I've changed the summary title because I'm no longer convinced this is a Python 2.6 related issue since I now realise that my Python 2.5.1 system had epydoc 3.0beta1 installed, whereas my Python 2.6.1 system has epydoc 3.0.1 installed.

    Can someone please confirm that epydoc 3.0.1 is still capable of parsing #: style doc strings, when they are used to document class-scope variables and located on the previous line, e.g.:

    class MyClass(object):
        """ MyClass """

        #: This docstring doesn't parse.
        aVariable = 10
        bVariable = 20 #: This docstring does parse.
        cVariable = 30
        """ This docstring also parses. """

     
  • Ross Collins
    2009-02-20

    • priority: 5 --> 9
    • assigned_to: nobody --> edloper
    • summary: epydoc fails with Python 2.6? --> epydoc fails to parse #: style doc strings
     
  • Ross Collins
    2009-02-20

    I've just tried epydoc 3.0beta1 on the Python 2.6.1 system and had the same problem... so it is a lack of Python 2.6 support that is to blame.

     
  • André Malo
    2009-09-29

    It's 2.6 related. The tokenizer works differently. For comment lines it emits a comment and a newline token now instead of a single comment token.

     
  • Ross Collins
    2009-09-29

    Does Ed Loper still support epydoc? It's beginning to look like a dying project, which isn't good news when Python develops so fast.

    I've taken to hacking epydoc to get it to parse my source code without warnings. The change below both removes the warning message and ensures the documentation is correct (though it also stops epydoc from warning about real blank lines between comments and the objects they document, but that's not so important):

    docparser.py : line 572

    elif toktype == tokenize.NL:
        pass
        # if comments and not line_toks:
        #     log.warning('Ignoring docstring comment block followed by '
        #                 'a blank line in %r on line %r' %
        #                 (module_doc.filename, srow-1))
        #     comments = []

     
  • André Malo
    2009-10-04

    I've created a more complete patch against 3.0.1 tackling that problem. Works for me at least :)

     
  • André Malo
    2009-10-04

    Can't find a way to add patch files, so here it comes:

    diff -Nur epydoc-3.0.1/epydoc/docparser.py epydoc-3.0.1/epydoc/docparser.py
    --- epydoc-3.0.1/epydoc/docparser.py
    +++ epydoc-3.0.1/epydoc/docparser.py
    @@ -72,6 +72,26 @@
     from epydoc.compat import *

     ######################################################################
    +## Tokenizer change in 2.6
    +######################################################################
    +
    +def comment_includes_nl():
    +    """ Determine whether comments are parsed as one or two tokens... """
    +    readline = iter(u'\n#\n\n'.splitlines(True)).next
    +    tokens = [
    +        token.tok_name[tup[0]] for tup in tokenize.generate_tokens(readline)
    +    ]
    +    if tokens == ['NL', 'COMMENT', 'NL', 'ENDMARKER']:
    +        return True
    +    elif tokens == ['NL', 'COMMENT', 'NL', 'NL', 'ENDMARKER']:
    +        return False
    +    raise AssertionError(
    +        "Tokenizer returns unexpected tokens: %r" % tokens
    +    )
    +
    +comment_includes_nl = comment_includes_nl()
    +
    +######################################################################
     ## Doc Parser
     ######################################################################

    @@ -520,6 +540,10 @@
         # inside that block, not outside it.
         start_group = None

    +    # If the comment tokens do not include the NL, every comment token
    +    # sets this to True in order to swallow the next NL token unprocessed.
    +    comment_nl_waiting = False
    +
         # Check if the source file declares an encoding.
         encoding = get_module_encoding(module_doc.filename)

    @@ -570,7 +594,9 @@
             # then discard them: blank lines are not allowed between a
             # comment block and the thing it describes.
             elif toktype == tokenize.NL:
    -            if comments and not line_toks:
    +            if comment_nl_waiting:
    +                comment_nl_waiting = False
    +            elif comments and not line_toks:
                     log.warning('Ignoring docstring comment block followed by '
                                 'a blank line in %r on line %r' %
                                 (module_doc.filename, srow-1))
    @@ -578,6 +604,7 @@

             # Comment token: add to comments if appropriate.
             elif toktype == tokenize.COMMENT:
    +            comment_nl_waiting = not comment_includes_nl
                 if toktext.startswith(COMMENT_DOCSTRING_MARKER):
                     comment_line = toktext[len(COMMENT_DOCSTRING_MARKER):].rstrip()
                     if comment_line.startswith(" "):

     
  • The related patch has been applied to the Debian version of Epydoc, in Debian version 3.0.1-7. This was requested in Debian bug #590112 (http://bugs.debian.org/590112).
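
The idea behind the patch in this thread (skip the NL token that the tokenizer emits right after a comment, and only treat any further NL as a real blank line) can be tried outside epydoc. The following is a minimal standalone sketch, assuming Python 3 and the standard `tokenize` module. It is not the patch itself: `orphaned_comment_lines` is a hypothetical function, and it only reports line numbers instead of logging a warning.

```python
import io
import tokenize

COMMENT_DOCSTRING_MARKER = "#:"

def orphaned_comment_lines(source):
    """Return line numbers of blank lines that directly follow a '#:' block."""
    orphaned = []
    comments = []               # pending '#:' comment lines
    line_toks = False           # True once code tokens appear on the current line
    comment_nl_waiting = False  # True right after a COMMENT token
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            # The NL that ends this comment line is not a blank line.
            comment_nl_waiting = True
            if tok.string.startswith(COMMENT_DOCSTRING_MARKER):
                comments.append(tok.string)
        elif tok.type == tokenize.NL:
            if comment_nl_waiting:
                comment_nl_waiting = False   # swallow the comment's own NL
            elif comments and not line_toks:
                orphaned.append(tok.start[0])  # a real blank line
                comments = []
        elif tok.type == tokenize.NEWLINE:
            # End of a logical line: the pending comments were consumed.
            comments = []
            line_toks = False
            comment_nl_waiting = False
        elif tok.type not in (tokenize.INDENT, tokenize.DEDENT,
                              tokenize.ENDMARKER):
            line_toks = True
    return orphaned
```

With this logic the three-line example from the original report yields no orphaned blocks, while a `#:` comment that really is followed by an empty line is still reported.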