#65 Optionally have '#' act like '#:'

open
Edward Loper
None
5
2007-08-02
2007-08-01
Judah De Paula
No

It would be nice to have an option where '#' would act like the '#:' assignment doc strings, so a comment above an assignment would be automatically considered a doc string. I might be able to help with it?

Discussion

  • Edward Loper
    2007-08-01

    Logged In: YES
    user_id=195958
    Originator: NO

    But then *all* comments would be treated as docstrings, which seems like overkill in most cases -- I'd like to avoid getting a bunch of false positives, where epydoc describes a variable using text from a comment that doesn't really have anything to do with that variable.

    What's the motivation behind this request? (a) there is some pre-existing code, that has comments before assignments, and you'd like to extract those comments; or (b) you find it inconvenient to type "#:" instead of "#" (e.g., because emacs is smarter about wrapping "#" than "#:").

    If (a), then can you tell by looking at the pre-existing code if this change would give you many 'false positives'? If (b), perhaps we could discuss alternative markings that are still explicit, but that get around whatever issues you're having with using "#:"?

     
  • Judah De Paula
    2007-08-02

    Logged In: YES
    user_id=695839
    Originator: YES

    I'm in your case 'a'. There's a large body of code that has been using '#' above assignments in the files and in class attribute definitions. An existing tool is currently being used to pull the comments from assignments in the file body and class definitions, while ignoring comments within function definitions. Function definitions are where most of the false positives would happen.

    To avoid the false positives, it seems like the implicit assignment-comment operation would have to behave differently from the existing '#:' operator. Sadly, this makes for a more invasive modification. I've already looked at register_markup_language() and unless I'm mistaken that interface will not provide the functionality I'm describing since previously ignored doc-strings need to be detected using certain rules. Is there an interface that I didn't see where a small plug-in could be added to do what I'm describing?

     
  • Edward Loper
    2007-08-02

    • assigned_to: nobody --> edloper
     
  • Edward Loper
    2007-08-02

    Logged In: YES
    user_id=195958
    Originator: NO

    To make the change globally, all you would need to do is change COMMENT_DOCSTRING_MARKER's value from "#:" to "#" in epydoc/docparser.py. Making the change on a per-file basis wouldn't be too difficult either -- that constant is only used in a single function, process_file(), so that function would need to be adapted.
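
To illustrate why a single marker constant controls the behavior, here is a small stand-alone sketch. It is not epydoc's process_file(); the helper `docstring_from_comments` is hypothetical and only mimics the idea of keeping comment lines that start with the marker.

```python
COMMENT_DOCSTRING_MARKER = "#:"  # the value named in the comment above


def docstring_from_comments(comment_lines, marker=COMMENT_DOCSTRING_MARKER):
    """Return the text of the lines that begin with `marker`, or None."""
    kept = [
        line.strip()[len(marker):].strip()
        for line in comment_lines
        if line.strip().startswith(marker)
    ]
    return "\n".join(kept) if kept else None
```

With the default marker only the "#:" lines survive. With `marker="#"` every comment line matches, and a "#:" line keeps its leading colon, which hints at why the two styles would not mix cleanly under one global setting.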

    Comments before assignment statements inside most functions would already get ignored by epydoc. The exception is __init__ methods for classes, where epydoc does check for pseudo-docstrings, and uses them as descriptions of instance variables. E.g.:

    class A:
        def __init__(self):
            #: A counter used to keep track of ...
            self.counter = 0

    Is more-or-less equivalent to having a "@instancevar counter: A counter used to keep track of ..." field in A's docstring. It would be possible to disable this behavior without too much trouble, though -- the relevant code is in get_lhs_parent(), again in epydoc/docparser.py. E.g., a new module-constant FIND_DOCSTRINGS_IN_CONSTRUCTORS could be defined, and get_lhs_parent() could check that constant.

    Comments *before* functions or classes, or at the top of modules, might be another issue. It wouldn't be too difficult to turn these off globally -- e.g., process_funcdef includes the following statement:

    add_docstring_from_comments(func_doc, comments)

    which could simply be wrapped in a conditional, turning it off if desired. The case is similar for classes, etc.
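
A sketch of that conditional, under stated assumptions: `add_docstring_from_comments` below is a stand-in that only stores the joined comment text on a dictionary, and `attach_function_comments` is a hypothetical wrapper name. Only the shape of the guard is the point.

```python
USE_COMMENTS_FOR_FUNCTIONS = True  # proposed module constant, not an existing one


def add_docstring_from_comments(api_doc, comments):
    """Stand-in for the epydoc helper: store the joined comment text."""
    api_doc["docstring"] = "\n".join(comments)


def attach_function_comments(func_doc, comments):
    # The statement quoted above, guarded by the module constant.
    if USE_COMMENTS_FOR_FUNCTIONS and comments:
        add_docstring_from_comments(func_doc, comments)
    return func_doc
```

Setting the constant to False before parsing would leave function documentation untouched by preceding comments.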

    If this is to be done on a global basis, then it wouldn't be too hard:

    - epydoc.cli checks for some flag
    - if present, it sets epydoc.docparser.COMMENT_DOCSTRING_MARKER='# '
    - and sets epydoc.docparser.FIND_DOCSTRINGS_IN_CONSTRUCTORS=False
    - and sets epydoc.docparser.USE_COMMENTS_FOR_FUNCTIONS=False
    - etc.

    (This would mean that "#:" and "#" couldn't really be mixed)
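
The steps above can be sketched as follows. The flag name `--plain-comment-docstrings` is hypothetical (the list only says "some flag"), `argparse` is used for brevity rather than whatever epydoc.cli uses, and a `SimpleNamespace` stands in for the epydoc.docparser module so the example runs without epydoc installed.

```python
import argparse
import types

# Stand-in for the epydoc.docparser module and the constants discussed above.
docparser = types.SimpleNamespace(
    COMMENT_DOCSTRING_MARKER="#:",
    FIND_DOCSTRINGS_IN_CONSTRUCTORS=True,
    USE_COMMENTS_FOR_FUNCTIONS=True,
)


def apply_cli_options(argv, parser_module=docparser):
    """Parse argv and, if the flag is present, switch the parser constants."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag name, chosen only for this illustration.
    parser.add_argument("--plain-comment-docstrings", action="store_true")
    options = parser.parse_args(argv)
    if options.plain_comment_docstrings:
        parser_module.COMMENT_DOCSTRING_MARKER = "# "
        parser_module.FIND_DOCSTRINGS_IN_CONSTRUCTORS = False
        parser_module.USE_COMMENTS_FOR_FUNCTIONS = False
    return parser_module
```

Because the constants are process-wide, the setting applies to every file parsed in that run.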

    If this is to be done on a per-file basis, it's slightly trickier, but not too hard. One question would be how epydoc would know what to do for a given file. __docformat__ already has a pre-defined format (specified by a PEP), which doesn't really leave any room to specify this. And it seems like you'd want to know how you should be processing *before* you parse the contents of the file, which is a problem if, e.g., __docformat__ comes after the module-initial comment. Also, since this is a large body of pre-existing code, modifying each file might not be desirable. Ideas?

     
  • Judah De Paula
    2007-08-03

    Logged In: YES
    user_id=695839
    Originator: YES

    I've just made the substitution of '#:' with '#', and have not modified anything else. Granted, epydoc now spits out dozens of warnings per file about doc strings with no code after them; but it also does the right thing and discards those spurious comments while generally picking out the good ones.

    Comments in constructors generally should be detected as doc strings since they usually comment on instance variables, like in your example. I like the USE_COMMENTS_FOR_FUNCTIONS flag idea, but don't know what the right thing to do is with regard to dealing with different files.