epydoc fails to parse #: style doc strings
I'm using the latest epydoc 3.0 on a system with Python 2.5.1 and a system with Python 2.6.1. Parsing the same code on both systems, epydoc works without any warnings on the Python 2.5.1 system but produces many warnings on the Python 2.6.1 system. The most worrying one occurs when parsing source code with variable docstrings, e.g. (with line numbers):
90
91 #: X is very important.
92 x = 10
This produces the warning message:
Warning: Ignoring docstring comment block followed by a blank line in u'/myFile.py' on line 90
The line number refers to the blank line _above_ the docstring line.
I've changed the summary title because I'm no longer convinced this is a Python 2.6 related issue since I now realise that my Python 2.5.1 system had epydoc 3.0beta1 installed, whereas my Python 2.6.1 system has epydoc 3.0.1 installed.
Can someone please confirm that epydoc 3.0.1 is still capable of parsing #: style doc strings, when they are used to document class-scope variables and located on the previous line, e.g.:
class MyClass(object):
    """ MyClass """
    #: This docstring doesn't parse.
    aVariable = 10
    bVariable = 20 #: This docstring does parse.
    cVariable = 30
    """ This docstring also parses. """
I've just tried epydoc 3.0beta1 on the Python 2.6.1 system and had the same problem... so it is a lack of Python 2.6 support that is to blame.
It's 2.6 related. The tokenizer works differently. For comment lines it emits a comment and a newline token now instead of a single comment token.
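The token change described above can be checked directly with the standard `tokenize` module. This is a minimal sketch, assuming a Python 3 interpreter (which keeps the 2.6-style behaviour); the helper name `token_names` is hypothetical and not part of epydoc:

```python
import io
import tokenize


def token_names(source):
    """Return the token type names that tokenize produces for source."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


# A blank line, a docstring comment, then the assignment it documents.
names = token_names("\n#: X is very important.\nx = 10\n")
print(names)

# On Python 2.6 and later the COMMENT token is followed by its own NL token,
# which a parser can mistake for a blank line after the comment block.
comment_index = names.index("COMMENT")
print(names[comment_index + 1])
```

Running the same source through Python 2.5's tokenizer should show the comment line as a single COMMENT token, which is why epydoc's blank-line check only misfires on newer interpreters.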
Does Ed Loper still support epydoc? It's beginning to look like a dying project, which isn't good news when Python develops so fast.
I've resorted to hacking epydoc to get it to parse my source code without warnings. The change below removes the warning message and ensures the documentation is generated correctly. The downside is that epydoc no longer warns about genuine blank lines between comments and the objects they document, but that's not so important:
docparser.py : line 572
elif toktype == tokenize.NL:
    pass
    # if comments and not line_toks:
    #     log.warning('Ignoring docstring comment block followed by '
    #                 'a blank line in %r on line %r' %
    #                 (module_doc.filename, srow-1))
    #     comments = []
I've created a more complete patch against 3.0.1 tackling that problem. Works for me at least :)
Can't find a way to add patch files, so here it comes:
diff -Nur epydoc-3.0.1/epydoc/docparser.py epydoc-3.0.1/epydoc/docparser.py
--- epydoc-3.0.1/epydoc/docparser.py
+++ epydoc-3.0.1/epydoc/docparser.py
@@ -72,6 +72,26 @@
 from epydoc.compat import *
 ######################################################################
+## Tokenizer change in 2.6
+######################################################################
+
+def comment_includes_nl():
+    """ Determine whether comments are parsed as one or two tokens... """
+    readline = iter(u'\n#\n\n'.splitlines(True)).next
+    tokens = [
+        token.tok_name[tup[0]] for tup in tokenize.generate_tokens(readline)
+    ]
+    if tokens == ['NL', 'COMMENT', 'NL', 'ENDMARKER']:
+        return True
+    elif tokens == ['NL', 'COMMENT', 'NL', 'NL', 'ENDMARKER']:
+        return False
+    raise AssertionError(
+        "Tokenizer returns unexpected tokens: %r" % tokens
+    )
+
+comment_includes_nl = comment_includes_nl()
+
+######################################################################
 ## Doc Parser
 ######################################################################
@@ -520,6 +540,10 @@
     # inside that block, not outside it.
     start_group = None
+    # If the comment tokens do not include the NL, every comment token
+    # sets this to True in order to swallow the next NL token unprocessed.
+    comment_nl_waiting = False
+
     # Check if the source file declares an encoding.
     encoding = get_module_encoding(module_doc.filename)
@@ -570,7 +594,9 @@
         # then discard them: blank lines are not allowed between a
         # comment block and the thing it describes.
         elif toktype == tokenize.NL:
-            if comments and not line_toks:
+            if comment_nl_waiting:
+                comment_nl_waiting = False
+            elif comments and not line_toks:
                 log.warning('Ignoring docstring comment block followed by '
                             'a blank line in %r on line %r' %
                             (module_doc.filename, srow-1))
@@ -578,6 +604,7 @@
         # Comment token: add to comments if appropriate.
         elif toktype == tokenize.COMMENT:
+            comment_nl_waiting = not comment_includes_nl
             if toktext.startswith(COMMENT_DOCSTRING_MARKER):
                 comment_line = toktext[len(COMMENT_DOCSTRING_MARKER):].rstrip()
                 if comment_line.startswith(" "):
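The idea behind the patch (swallow the NL token that ends a comment line, and treat only a further NL as a real blank line) can be tried outside epydoc. The following is a simplified sketch, assuming Python 3 and handling only `#:` comments placed on the lines directly above a statement; `collect_comment_docs` and `DOC_MARKER` are hypothetical names, not epydoc's actual API:

```python
import io
import tokenize

DOC_MARKER = "#:"  # assumed marker for comment docstrings in this sketch


def collect_comment_docs(source):
    """Map the line number of each documented statement to its '#:' text."""
    docs = {}
    pending = []                # '#:' lines waiting for a statement
    line_toks = 0               # tokens seen on the current logical line
    comment_nl_waiting = False  # True right after a comment token
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.COMMENT:
            # The next NL only terminates this comment line.
            comment_nl_waiting = True
            if tok.string.startswith(DOC_MARKER) and line_toks == 0:
                pending.append(tok.string[len(DOC_MARKER):].strip())
        elif tok.type == tokenize.NL:
            if comment_nl_waiting:
                comment_nl_waiting = False  # swallow it unprocessed
            elif pending and line_toks == 0:
                pending = []  # genuine blank line: discard the block
        elif tok.type == tokenize.NEWLINE:
            # End of a logical line: attach any pending comment block.
            if pending:
                docs[tok.start[0]] = " ".join(pending)
            pending = []
            line_toks = 0
            comment_nl_waiting = False
        elif tok.type in (tokenize.INDENT, tokenize.DEDENT,
                          tokenize.ENDMARKER):
            pass
        else:
            line_toks += 1
    return docs


example = "\n#: X is very important.\nx = 10\n\n#: Dropped.\n\ny = 20\n"
print(collect_comment_docs(example))
```

In this example the comment above `x = 10` is kept, while the comment separated from `y = 20` by a blank line is discarded, which is the behaviour the original warning was meant to enforce.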
patch eventually uploaded: https://sourceforge.net/tracker/?func=detail&aid=2872545&group_id=32455&atid=405620
The related patch has been applied to the Debian version of Epydoc, in Debian version 3.0.1-7. This was requested in Debian bug #590112 (http://bugs.debian.org/590112).
This is also fixed in http://code.google.com/p/epycsdoc/source/detail?r=d45a54d320429d9aeb37824f6c238ecd3a04c910 (a fork)