Thread: [Epydoc-devel] Epytext markup enhancement suggestions

Brought to you by: edloper

[Epydoc-devel] Epytext markup enhancement suggestions

From: Mikko O. <mi...@re...> - 2008-02-15 03:01:30

Hi all,

Epydoc is the best thing since sliced bread to happen to Python, which
otherwise lacks naming conventions and such. The goal of documentation is to
make the process as simple as possible, so that it doesn't "hinder" the work
of development, and that's what Epydoc does well.

I have been wondering if it were possible to make epytext markup even
simpler. Based on my experiences in PHP and Java formal comment blocks, here
are some suggestions:

def f(a, b):
    """
    @param int|str a parameter description here
    @param module.Class b parameter description here
    @return c float return value description here
    """

Versus (old style)

def f(a, b):
   """
   @param a: parameter description here
   @type a: int
   @type a: str
   @param b: parameter description here
   @type b: module.Class
   @return:  return value  description here
   @rtype: float
   """

I see a considerable win =)

1. Leave out : (other languages don't use it either)

2. Leave out @type, @rtype and describe type inline as the first word of
parameter paragraph.

3. Use | to separate different possible types

These changes can be made compatible with existing Epytext markup with some
smart logic (i.e. the new syntax is in effect if : is left out).
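
As a rough illustration of that "smart logic", here is a minimal Python
sketch. It is not epydoc code: parse_param_line is a hypothetical helper, and
it assumes that the existing style always has exactly one word before the
colon, while the proposed style is "<type> <name> <description>" with no colon.

```python
def parse_param_line(line):
    """Return (name, type_or_None, description) for one "@param" line."""
    text = line.strip()
    if not text.startswith("@param"):
        raise ValueError("not a @param line: %r" % line)
    rest = text[len("@param"):].strip()
    head, colon, tail = rest.partition(":")
    # Existing style: exactly one word before the colon.
    if colon and len(head.split()) == 1:
        return head.strip(), None, tail.strip()
    # Proposed style (assumption): "<type> <name> <description>", no colon.
    parts = rest.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected a type and a name: %r" % line)
    description = parts[2] if len(parts) > 2 else ""
    return parts[1], parts[0], description


print(parse_param_line("@param a: parameter description here"))
print(parse_param_line("@param int|str a parameter description here"))
```

The sketch deliberately ignores fields that name several parameters at once
(such as "@param x, y: ..."); those would need extra handling.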

Now, tell me why this wouldn't be possible or I'll start submitting patches :)

-- 
Mikko Ohtamaa
Red Innovation Ltd.
+358 40 743 9707
www.redinnovation.com
Every problem is solvable if you can throw enough energy drinks at it

Re: [Epydoc-devel] Epytext markup enhancement suggestions

From: Edward L. <ed...@se...> - 2008-02-15 04:00:36

> Epydoc is the best thing since the sliced bread happened for Python [...]

Thanks! :)

> I have been wondering if it were possible to make epytext markup even
> simpler. [...]

> 1. Leave out : (other languages don't use it either)

I'm actually fairly partial to that ":" myself -- to my eyes, your
examples are hard to read because I don't know where the field body
begins.  E.g. in your proposed:

@param int|str a parameter description here

I have a strong tendency to read "a" as an indefinite article, not a
variable name.  Also, note that the "type description" doesn't have to
be a single word, or even a simple expression.  I want people to be
able to express such notions as "dictionary from int to string" or
"sequence of int" or "nonnegative int," and human language is a great
way to do it. :)  And finally, including the ":" explicitly allows us
to implement your suggestion (2) as an optional feature -- we don't
want all the docstrings people have already written suddenly and
mysteriously breaking when they upgrade epydoc!

> 2. Leave out @type, @rtype and describe type inline as the first word of
> parameter paragraph.

I wouldn't mind adding some syntax that would allow types to be
optionally specified as part of a parameter/returns field.  With your
proposal this would look like (keeping the ":" because I like it):

@param int x: the x coordinate

I also wouldn't mind any number of alternatives, including:

@param x (int): the x coordinate
@param x [int]: the x coordinate

> 3. Use | to separate different possible types

I'm not sure what you intend for the behavior to be here.  You can
already write "int|str", and it will be rendered as such in your
output.  Do you want the "|" to get translated to "or" in the output?
If not, why not just write "int or str"?

So with this modified proposal (keep ":", allow types as prefix
strings, and just use "or"), your example function would be:

def f(a, b):
    """
    @param int or str a: parameter description here
    @param module.Class b: parameter description here
    @return float: return value description here
    """

An open question is how the "type prefix" string would interact with
epydoc's current support for having a single "@param" that describes
multiple parameters.  E.g.:

def f(x, y):
    """
    @param x, y: The coordinates of some point.
    """

We wouldn't want "x," being interpreted as a type for y!  This would
be one advantage of a syntax like "@param x (int): ..." that more
explicitly marks the type expression.
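
To make the difference concrete, here is a minimal Python sketch of how the
postfix form could be parsed while still allowing several names in one field.
It is an illustration under assumptions, not epydoc's parser: PARAM_RE and
parse_postfix_param are hypothetical names, and the type expression is assumed
to contain no nested parentheses.

```python
import re

# Hypothetical grammar: "@param NAMES [(TYPE)]: DESCRIPTION"
PARAM_RE = re.compile(
    r"^@param\s+(?P<names>[^:(]+?)\s*"   # one or more comma-separated names
    r"(?:\((?P<type>[^)]*)\))?\s*"       # optional "(type)" after the names
    r":\s*(?P<descr>.*)$"                # the colon still ends the argument
)


def parse_postfix_param(line):
    """Return (names, type_or_None, description) for one "@param" line."""
    match = PARAM_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized @param line: %r" % line)
    names = [name.strip() for name in match.group("names").split(",")]
    return names, match.group("type"), match.group("descr")


print(parse_postfix_param("@param x (int): the x coordinate"))
print(parse_postfix_param("@param x, y: The coordinates of some point."))
```

Because the type sits inside explicit parentheses, "x," can never be mistaken
for a type expression in this form.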

-Edward

Re: [Epydoc-devel] Epytext markup enhancement suggestions

From: Huu Da T. <huu...@gm...> - 2008-02-15 14:16:18

Edward Loper wrote:

>> 2. Leave out @type, @rtype and describe type inline as the first word of
>> parameter paragraph.
> 
> I wouldn't mind adding some syntax that would allow types to be
> optionally specified as part of a parameter/returns field.  With your
> proposal this would look like (keeping the ":" because I like it):
> 
> @param int x: the x coordinate
> 
> I also wouldn't mind any number of alternatives, including:
> 
> @param x (int): the x coordinate
> @param x [int]: the x coordinate

I like the () option as it looks like a cast. It is a great idea for
simple types, but when extending it to complex types (or types with
formatting), I believe it would be hard to read. How would this be
translated into the suggested format?

@param var1: first param.
@type var1: C{int} | C{None}
@param var2: an object
@type var2: L{search<re.search>}

-- 
Huu Da Tran <huu...@gm...>
Fidèle esclave des félins

Qui m'insulte, parfois, me dit la vérité.
Qui me flatte, plus souvent qu'autrement, me ment!

[Epydoc-devel] Fwd: Epytext markup enhancement suggestions

From: Mikko O. <mi...@re...> - 2008-02-16 11:09:40

I had accidentally posted this to Edward when I was going to post to the
list.

---------- Forwarded message ----------
From: Mikko Ohtamaa <mi...@re...>
Date: 15.2.2008 21:36
Subject: Re: [Epydoc-devel] Epytext markup enhancement suggestions
To: Edward Loper <ed...@se...>

Hi all,

Thank you for the great feedback. It's nice to see a discussion which could
lead to the evolution of open source tools :)

> > 1. Leave out : (other languages don't use it either)
>
> I'm actually fairly partial to that ":" myself -- to my eyes, your
> examples are hard to read because I don't know where the field body
> begins.  E.g. in your proposed:

I have a strong point to make here. This could be a taste issue. There is no
official "comment document standard", but no other language uses :

http://manual.phpdoc.org/HTMLSmartyConverter/HandS/phpDocumentor/tutorial_tags.param.pkg.html
http://java.sun.com/j2se/javadoc/writingdoccomments/#tag
C# uses XML tags
Doxygen does not have : in Java-like mode:
http://www.stack.nl/~dimitri/doxygen/docblocks.html
In non-Java mode Doxygen uses \param.

I wish that we would make the learning curve for people coming from other
languages as low as possible. Personally, I was wondering why epytext
didn't work for my @param tags until I noticed the important :.

> @param int|str a parameter description here
>
> I have a strong tendency to read "a" as an indefinite article, not a
> variable name.

Sorry, I am not a native English speaker and Finnish lacks articles :)

> Also, note that the "type description" doesn't have to
> be a single word, or even a simple expression.  I want people to be
> able to express such notions as "dictionary from int to string" or
> "sequence of int" or "nonnegative int," and human language is a great
> way to do it. :)

If we have a formal way to define the syntax, we can automatically link the
type definition to the correct type. Also, there is the longer description
"parameter description here" for special cases. The point is to avoid the
extra @type definition, which means an extra line of typing for the
developer. Also, typing C{}... around a class name is a habit of epytext
only; other documentation comment systems avoid this somehow.

A few more examples

def ban(usernames):
  """
  @param [string] usernames  A sequence of usernames who will be banned
  """

[] could mark dict, list and tuple. Or

def ban(users):
    """
    @param [mooware.models.User] users A tuple of User objects
    """

> And finally, including the ":" explicitly allows us
> to implement your suggestion (2) as an optional feature -- we don't
> want all the docstrings people have already written suddenly and
> mysteriously breaking when they upgrade epydoc!

I 100% agree. I don't want to break backwards compatibility. If we cannot
make a parser which automatically detects these cases, I recommend that

- I can create a dialect which the developer can choose (__docformat__ =
"epytext-java")

- A command-line switch

> > 3. Use | to separate different possible types
>
>
> I'm not sure what you intend for the behavior to be here.  You can
> already write "int|str", and it will be rendered as such in your
> output.  Do you want the "|" to get translated to "or" in the output?
> If not, why not just write "int or str"?

The point is that when parameter/return types are formally defined, the
parser can link them to the specific class. | is a way to define multiple
return types on a single line. It really does not matter to me whether it
would be | or "or". I just picked | because PHP was already using it. But
after receiving your feedback, it became obvious this is very non-pythonic
and "or" should be used :)
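
For illustration only, here is a small Python sketch of how a formally written
type field could be split into alternatives, so that each one could later be
linked to its class. split_type_alternatives is a hypothetical helper, not an
epydoc function, and it assumes alternatives are separated by "|" or by the
word "or".

```python
import re


def split_type_alternatives(type_descr):
    """Split "Foo|None" or "int or str" into a list of type names."""
    # Separators: "|" with optional spaces, or the standalone word "or".
    parts = re.split(r"\s*\|\s*|\s+or\s+", type_descr.strip())
    return [part for part in parts if part]


print(split_type_alternatives("Foo|None"))
print(split_type_alternatives("int or str"))
print(split_type_alternatives("module.Class"))
```

Free-form descriptions such as "dictionary from int to string" come back as a
single item, so nothing is lost when the type is not written formally.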

This could be a usual case

def find_foo(id):
  """
  @return Foo|None Return Foo instance or None if id is not found
  """

instead of

def find_foo(id):
   """
   @return: Foo instance or None if id is not found
   @rtype: Foo
   @rtype: None
   """

> So with this modified proposal (keep ":", allow types as prefix
> strings, and just use "or"), your example function would be:
>
> def f(a, b):
>     """
>     @param int or str a: parameter description here
>     @param module.Class b: parameter description here
>     @return float: return value description here
>     """
>
> An open question is how the "type prefix" string would interact with
> epydoc's current support for having a single "@param" that describes
> multiple parameters.  E.g.:
>
> def f(x, y):
>     """
>     @param x, y: The coordinates of some point.
>     """
>
> We wouldn't want "x," being interpreted as a type for y!  This would
> be one advantage of a syntax like "@param x (int): ..." that more
> explicitly marks the type expression.

With the new syntax this could be

def f(x, y):
   """
   @param x or y The coordinates of some point
   """

Well again... I think | was useful after all, since you can concatenate all
types into a single word (no spaces); with "or", parsing compatibility could
not be achieved. It would be impossible to parse @param xory The
coordinates... :)

Anyways I think we are on the right track to get rid of @type and @rtype and I
could help to implement some of these solutions (if not all).

Cheers,
Mikko


Re: [Epydoc-devel] Epytext markup enhancement suggestions

From: Mikko O. <mi...@re...> - 2008-02-16 11:10:07

Hello all,

> Epydoc expects to find a *single* description of the type of a
> parameter.  I.e., the correct way to write that is:
>
>     @param x: blah
>     @type x: None or str


Ah,  my bad then =)


> > I have strong point to make here. This could be a taste issue. There is no
> > official "comment document standard", but no other language uses :
>
>
> Some markup languages that use ":" as a field separator:
>   - rfc822-style headers (very widely used, eg by email)
>   - restructuredtext


That's true. The point I was trying to highlight is that other programming
languages and comment documentation systems do @param without :. I want to
make it easy for people migrating from other languages - Python is catching
more and more fire every day (the programming language of 2007 and so on).
When I started to document my code with Epytext, which is the easiest
documentation tool available for Python currently, I found this : issue very
frustrating, since my previous habits tend to make me type @param without :.

> @group Group Name: x, y, z
> @var x, y, z: Description of several variables at once
> @param x, y, z: Description of several parameters at once
>
> These would not be possible if there was no colon.


True. I am not suggesting ditching the @xxx y: syntax completely. I just hope
the colon could be left out in the simplest cases, which I believe make up
80% of documentation lines. The ultimate goal is to streamline the grunt work
and have the big cannons available for hard cases =)

> Epytext already detects the case where you have a line that starts
> with "@" but there's no corresponding ":" -- and it flags it as a
> warning.


Trust me, I have seen a lot of these warnings =)


> So I guess I might be ok with making the ":" optional.  That way,
> things like "@group Group Name: x, y, z" would continue to work, but
> things like "@param x description" would also work.  The following
> case might trip people up sometimes:
>
>     @param x this is a description of x.  Read about: y
>
> where there's a ":" further along in the first line (after "Read
> about").  But I guess that would be fairly rare, and epydoc should be
> able to complain in that case.


You are right. This would also fall into the "special cases" which probably
don't appear that often.
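
A minimal Python sketch of the optional-colon idea, including a warning for
the "Read about: y" case quoted above. This is only an assumption about how
such a check could work, not epydoc's actual behavior: split_param and
NAME_LIST_RE are hypothetical names, and a colon is accepted as the field
separator only when everything before it is a comma-separated list of
identifiers.

```python
import re

# A colon counts as the separator only after a plain list of names.
NAME_LIST_RE = re.compile(r"^[A-Za-z_]\w*(\s*,\s*[A-Za-z_]\w*)*$")


def split_param(line):
    """Return (names, description, warning_or_None) for one "@param" line."""
    text = line.strip()
    if not text.startswith("@param"):
        raise ValueError("not a @param line: %r" % line)
    rest = text[len("@param"):].strip()
    head, colon, tail = rest.partition(":")
    if colon and NAME_LIST_RE.match(head.strip()):
        names = [name.strip() for name in head.split(",")]
        return names, tail.strip(), None
    # No usable colon: the first word is taken as the parameter name.
    parts = rest.split(None, 1)
    if not parts:
        raise ValueError("missing parameter name: %r" % line)
    description = parts[1] if len(parts) > 1 else ""
    warning = None
    if colon:
        warning = "':' appears inside the description of %r" % parts[0]
    return [parts[0]], description, warning


print(split_param("@param x, y, z: A description of three variables"))
print(split_param("@param x this is a description of x.  Read about: y"))
```

The second call still yields x as the parameter name, but carries a warning so
the tool can complain about the stray colon.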


> So, given all that, my counterproposal
> is for epytext to understand the following syntaxes:
>
> @param x: This is a description of x.
> @param x This is a description of x.
> @param x (int) This is a description of x.
> @param x, y, z: A description of three variables
> @group Name of Group: x, y, z
> @group Transformers x, y, z
> @return int: This is a description of the return value.
> @return This is a description of the return value.
> @return: This is a description of the return value.


Here is another suggestion which we might not have discussed before.

@param x (int): This is a description of x.

-->

@param int x: This is a description of  x
@param MyClass x: This is a description of x
@param int or string x: This is a description of x
@return int x: This is a description of type


Parameter type rule: If there are several words before the colon of a @param
line, treat the last word as the parameter name and the preceding words as the
type description. This resembles C-like syntax, which is widely
recognized, and I believe Javascript (Ext JS) does this. In PHP
documentation (PHP being another language with dynamic types), the syntax
would be:

@param int $x This is a description of x.

Since Python (luckily) doesn't have the $ notation, $ wouldn't fit well.
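
The parameter type rule stated above can be sketched in a few lines of
Python. This is an illustration only: split_type_and_name and
parse_prefixed_param are hypothetical names, and the colon is assumed to be
mandatory in this form.

```python
def split_type_and_name(field_arg):
    """Last word is the parameter name; any preceding words are the type."""
    words = field_arg.split()
    if not words:
        raise ValueError("empty field argument")
    type_descr = " ".join(words[:-1]) or None
    return type_descr, words[-1]


def parse_prefixed_param(line):
    """Return (name, type_or_None, description) for "@param TYPE NAME: ..."."""
    head, colon, descr = line.strip().partition(":")
    if not head.startswith("@param") or not colon:
        raise ValueError("expected '@param ... :' in %r" % line)
    type_descr, name = split_type_and_name(head[len("@param"):])
    return name, type_descr, descr.strip()


print(parse_prefixed_param("@param int x: This is a description of x"))
print(parse_prefixed_param("@param int or string x: This is a description of x"))
print(parse_prefixed_param("@param x, y: The coordinates of some point."))
```

The last call shows the ambiguity raised earlier in the thread: with this rule
a multi-name field like "x, y" is read as name "y" with type "x,".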


> So the ":" is required for:
>   - @return with type
>   - @param w/ multiple vars


Right. But let's consider the above "type definition" notation for @return and
@param.

> And optional in all other cases.


Backwards compatibility++ :)


> (I don't like having types be prefixed -- it makes
> it too hard to scan for the name of the param,


Consider the above type definition example with a colon. I find it quite
readable, since the parameter name is always immediately left of the colon.
Some perfectionist could add some whitespace to align all parameter names
into a column.

> which is much more
> important than the type.  More important information should come
> before less important information.)  Return types are specified as an
> argument to the field.


I agree. However, due to all the statically typed languages out there, people
have gotten used to having int before their x.


> Of course, there are some issues with this proposal, and the way
> markup languages are currently set up.  One is that there is currently
> a very strong abstraction barrier between epytext and epydoc --
> epytext defines a markup syntax for field lists, but has nothing to
> say about what fields can be used, or what their meanings are.  In
> order to allow some fields (like @param) to automatically eat the
> first following word as an argument, while others (like @summary)
> don't, will require that abstraction barrier to be weakened -- epytext
> will need to know what fields are expected, and what to do with them.


I haven't yet checked how epytext and epydoc play together - I only have
experience hacking Doxygen. I believe the problem could be solved by adding a
"postprocess" step when the documentation is compiled: all type information
is out there to exploit. It could go through the @param definitions and do the
magic - users love magic as an alternative to hand-typing!


> So what do you think of this proposal?


I think it's heading the right way in pursuing my goal of having better
compatibility with other documentation languages.

> p.s., on the topic of creating a formal language for expressing types:
> unless Python defines a standard formal language for types, I'm
> strongly opposed to coming up with our own.  "Explicit is better than
> implicit" is the second rule in the zen of python. :)  Your suggested
> format:
>
>     @param [int] x description...
>
> Makes me think x is an integer, not a list of ints.  Also, note that
> Python is a language with a strong culture of duck typing, and precise
> notions of type are often not applicable.


A very good point. How about

@param list x: The description of x.

Since Python lists (tuples, dicts) are generic types, the documentation does
not "want to tell" what's inside the list.

..or...

@param list(int) x: The description of x

and

@param dict(str, MyClass) x: The description of x
@param tuple(int) x: The description of x

These somewhat resemble Python syntax. Collection-like return types are
an issue which could be discussed too!


-- 
Mikko Ohtamaa
www.redinnovation.com
Every problem is solvable if you can throw enough energy drinks at it

Re: [Epydoc-devel] Epytext markup enhancement suggestions

From: Edward L. <ed...@se...> - 2008-02-16 17:35:39

> Here is another suggestion which we might have not discussed before.
>
> @param int x: This is a description of  x
> @param MyClass x: This is a description of x
> @param int or string x: This is a description of x
> @return int x: This is a description of type

The "@return" example doesn't make sense to me -- there's no variable
name associated with a return value.

For the "@param" examples -- I understand that this ordering has a
long history, and is very familiar from statically typed languages.
Nevertheless, I still find myself reluctant to add it.  Let's add the
postfix parenthesized form (i.e., "@param x (int): descr"), try it out
for a while, and then you can tell me if you think this prefix form is
still necessary.  To quote the Zen of Python again, "There should be
one -- and preferably only one -- obvious way to do it."

> I agree. However, due to all statically typed languages out there, people
> have gotten used to having int before their x.

I can see how the difference between having a colon and not could trip
people up -- the circumstances of using @param in other languages and
@param in epydoc are very similar.  But this (prefixed types) is more a
case of something that might be familiar to users, not something that
might trip them up.

> > on the topic of creating a formal language for expressing types: [...]
>
> A very good point. How about
>
> @param list x: The description of x.
> @param list(int) x: The description of x
> @param dict(str, MyClass) x: The description of x
> @param tuple(int) x: The description of x

You're free to use this homegrown type specification language yourself
-- epydoc will render them just as you type them, and it will be up to
your documentation readers to interpret them.  But I don't want to
make it (or any other non-standard type specification language) part
of epydoc itself.  So, as I said before, unless Python adopts a
standard type specification language (which seems unlikely to me),
epydoc won't support one.

-Edward