Thread: [Docstring-develop] AST mining (was Re: Direction of PyChecker)
From: David G. <dgo...@bi...> - 2001-08-14 04:43:08
From the PyChecker AST discussion, it seems we may have a common goal. For
the Docstring Processing System, I am looking into gleaning information from
the abstract syntax tree. From the working notes, under "Docstring
Extractor":

    We need code that scans a parsed Python module, and returns an
    ordered tree containing the names, docstrings (including
    additional docstrings), and additional info (in parentheses below)
    of all of the following objects:

    - packages
    - modules
    - module attributes (+ values)
    - classes (+ inheritance)
    - class attributes (+ values)
    - instance attributes (+ values)
    - methods (+ formal parameters)
    - functions (+ formal parameters)

    In order to evaluate interpreted text cross-references, namespaces
    for each of the above will also be required.

I'd be very interested in pooling efforts to make this easier. I know almost
nothing about ASTs now, but that could change in a hurry :-).

--
David Goodger    dgo...@bi...
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net
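The "ordered tree" described in the working notes can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard-library `ast` module of current Python 3 (3.9 or later, for `ast.unparse`) rather than the `compiler` package discussed in this thread, and the names `extract` and `_describe` and the dictionary layout are hypothetical, not part of the Docstring Processing System. It covers only modules, classes, and functions; attributes and namespaces are left out.

```python
import ast


def extract(source, module_name="<module>"):
    """Return a nested dict describing a module's documented objects."""
    tree = ast.parse(source)
    return _describe(tree, module_name, "module")


def _describe(node, name, kind):
    # Each entry keeps its children in source order, giving an ordered tree.
    entry = {
        "kind": kind,
        "name": name,
        "docstring": ast.get_docstring(node),
        "children": [],
    }
    if kind == "class":
        # Inheritance: the base-class expressions as source text.
        entry["bases"] = [ast.unparse(base) for base in node.bases]
    elif kind == "function":
        # Formal parameters: positional parameter names only.
        entry["params"] = [arg.arg for arg in node.args.args]
    for child in node.body:
        if isinstance(child, ast.ClassDef):
            entry["children"].append(_describe(child, child.name, "class"))
        elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entry["children"].append(_describe(child, child.name, "function"))
    return entry
```

Calling `extract(open(path).read(), "mymodule")` on a source file would give one entry for the module with its classes nested beneath it and methods nested beneath each class.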
From: Neal N. <ne...@me...> - 2001-08-14 15:15:43
David Goodger wrote:
> From the PyChecker AST discussion, it seems we may have a common goal.

It seems that way.

> For the Docstring Processing System, I am looking into gleaning
> information from the abstract syntax tree. From the working notes,
> under "Docstring Extractor":
>
>     We need code that scans a parsed Python module, and returns an
>     ordered tree containing the names, docstrings (including
>     additional docstrings), and additional info (in parentheses below)
>     of all of the following objects:
>     - methods (+ formal parameters)
>     - functions (+ formal parameters)

Would you also want default parameter values?

PyChecker needs all of that, plus additional info. So there is a lot of
overlap.

> I'd be very interested in pooling efforts to make this easier. I know
> almost nothing about ASTs now, but that could change in a hurry :-).

Pooling efforts would be good. I also don't know anything about the
ASTs/compiler, but am willing to work on it.

Neal
From: David G. <dgo...@bi...> - 2001-08-15 01:47:33
Neal Norwitz <ne...@me...> wrote on 2001-08-14 11:13:
> Would you also want default parameter values?

Yes.

> PyChecker needs all of that, plus additional info. So there is a lot
> of overlap.

I think the DPS will use a small subset of what PyChecker needs. The only
extra information the DPS may need is attribute and additional docstrings,
string literals in certain contexts that are not currently recognized as
docstrings (see PEP 258).

> Pooling efforts would be good. I also don't know anything about
> the ASTs/compiler, but am willing to work on it.

I look forward to it, and to reading Jeremy's docs. (Go, doc writer!)

--
David Goodger    dgo...@bi...
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net
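The two extras mentioned here, default parameter values and attribute docstrings, might be recovered along these lines. This is a minimal sketch, assuming the Python 3 standard-library `ast` module (3.9 or later) instead of the `compiler` package under discussion. The helper names `attribute_docstrings` and `parameter_defaults` are hypothetical, and the "string literal immediately after a simple assignment" rule is a simplified reading of PEP 258 that looks only at module and class bodies.

```python
import ast


def attribute_docstrings(source):
    """Map assigned names to the string literal that immediately follows them."""
    found = {}
    for scope in ast.walk(ast.parse(source)):
        # Assumption: only module-level and class-level assignments are considered.
        if not isinstance(scope, (ast.Module, ast.ClassDef)):
            continue
        for stmt, following in zip(scope.body, scope.body[1:]):
            if not isinstance(stmt, ast.Assign):
                continue
            is_string_literal = (
                isinstance(following, ast.Expr)
                and isinstance(following.value, ast.Constant)
                and isinstance(following.value.value, str)
            )
            if is_string_literal:
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        found[target.id] = following.value.value
    return found


def parameter_defaults(func):
    """Pair each positional parameter of an ast.FunctionDef with its default.

    Defaults are returned as source text, or None where there is no default.
    """
    params = [arg.arg for arg in func.args.args]
    defaults = [ast.unparse(default) for default in func.args.defaults]
    # Defaults always belong to the trailing parameters, so pad on the left.
    padding = [None] * (len(params) - len(defaults))
    return list(zip(params, padding + defaults))
```

Returning defaults as source text rather than evaluated values keeps the extractor from having to execute any of the code it is documenting.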
From: Jeremy H. <je...@zo...> - 2001-08-14 15:29:14
>>>>> "TP" == Tim Peters <ti...@ho...> writes:

  TP> Let me suggest you don't really want an AST -- you want an
  TP> object model for Python source that answers the questions above
  TP> directly.

Agreed.

  TP> An AST may be an effective (under the covers) implementation
  TP> technique to get such info, but if you don't want to wait for
  TP> people to argue about "the right" AST and "the right" tree-based
  TP> query language to make it better than completely useless <wink>,
  TP> you can answer all the stuff above by building on tokenize.py now.

I wouldn't wait for people to argue about the right AST either. Use the one
Greg and Bill came up with for p2c. It's in Tools/compiler in the Python
distribution. It's much simpler than the parse tree produced by the parser
module. And, as far as I know, no one is advocating a different AST.

It doesn't have a query language, but neither does tokenize.py <wink>.

Jeremy
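For comparison, the tokenize-based route Tim mentions might look something like the sketch below. It assumes the Python 3 `tokenize` module (the 2001 module had a different, callback-style interface), and the function name `definitions` is hypothetical. It only finds `def` and `class` names and their nesting depth, which is far less than the full object model discussed above.

```python
import io
import tokenize


def definitions(source):
    """Return a list of (indent_depth, keyword, name) for each def/class statement."""
    depth = 0
    pending_keyword = None  # set after seeing 'def' or 'class', until the name arrives
    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            depth += 1
        elif tok.type == tokenize.DEDENT:
            depth -= 1
        elif tok.type == tokenize.NAME:
            if pending_keyword is not None:
                # The NAME token right after 'def' or 'class' is the defined name.
                results.append((depth, pending_keyword, tok.string))
                pending_keyword = None
            elif tok.string in ("def", "class"):
                pending_keyword = tok.string
    return results
```

The depth values let a caller rebuild the nesting (a `def` at depth 1 directly after a `class` at depth 0 is a method), but everything beyond names, such as base classes or defaults, would need hand-written token matching, which is the work an AST saves.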
From: Jeremy H. <je...@zo...> - 2001-08-14 15:50:53
>>>>> "NN" == Neal Norwitz <ne...@me...> writes:

  >> I'd be very interested in pooling efforts to make this easier. I
  >> know almost nothing about ASTs now, but that could change in a
  >> hurry :-).

  NN> Pooling efforts would be good. I also don't know anything about
  NN> the ASTs/compiler, but am willing to work on it.

I'm on the hook for AST/compiler documentation, which I plan to work on this
week. I will probably have more time for it later in the week than I will
today or tomorrow.

In the absence of documentation, here's a trivial example program that
extracts some information about methods and attributes from a class and its
methods. I wasn't exhaustive here. I'll get attributes assigned to by
"self.x" in a method body and I'll get method definitions in the class body.
I don't deal with obvious things like attributes defined at the class level.

Jeremy

from compiler import parseFile, walk, ast

class Class:
    def __init__(self, name, bases):
        self.name = name
        self.bases = bases
        self.methods = {}
        self.attributes = {}

    def addMethod(self, meth):
        self.methods[meth.name] = meth

    def addInstanceAttr(self, name):
        self.attributes[name] = name

    def getMethodNames(self):
        return self.methods.keys()

    def getAttrNames(self):
        return self.attributes.keys()

class Method:
    def __init__(self, name, args, defaults):
        self.name = name
        self.args = args
        self.defaults = defaults

    def getSelf(self):
        # By convention the first argument of a method is the instance.
        return self.args[0]

class ClassExtractor:
    def __init__(self):
        # Per-instance list, so separate extractors do not share results.
        self.classes = []

    def visitClass(self, node, klass=None, meth=None):
        c = Class(node.name, node.bases)
        self.visit(node.code, c)
        self.classes.append(c)

    def visitFunction(self, node, klass=None, meth=None):
        if klass is not None and meth is None:
            # A function directly inside a class body is a method.
            m = Method(node.name, node.argnames, node.defaults)
            klass.addMethod(m)
            self.visit(node.code, klass, m)
        else:
            self.visit(node.code)

    def visitAssAttr(self, node, klass=None, meth=None):
        if isinstance(node.expr, ast.Name) and meth is not None:
            # Record "self.x = ..." assignments as instance attributes.
            if node.expr.name == meth.getSelf():
                klass.addInstanceAttr(node.attrname)
        else:
            self.visit(node.expr)

def main(py_files):
    extractor = ClassExtractor()
    for py in py_files:
        # Named "tree" so the imported ast module is not shadowed.
        tree = parseFile(py)
        walk(tree, extractor)

    for klass in extractor.classes:
        print klass.name
        print klass.getMethodNames()
        print klass.getAttrNames()
        print

if __name__ == "__main__":
    import sys
    main(sys.argv[1:])
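The `compiler` package used in Jeremy's example exists only in Python 2. A rough equivalent for current Python can be sketched with the standard-library `ast` module and `ast.NodeVisitor` (Python 3.9 or later for `ast.unparse`). This is an approximation under those assumptions, not a port endorsed by anyone in the thread: `ClassInfo` and the private attribute names are hypothetical, and the traversal state is kept on the visitor instead of being passed as extra arguments to `visit`.

```python
import ast


class ClassInfo:
    def __init__(self, name, bases):
        self.name = name
        self.bases = bases
        self.methods = {}     # method name -> list of parameter names
        self.attributes = []  # instance attribute names, in first-seen order


class ClassExtractor(ast.NodeVisitor):
    def __init__(self):
        self.classes = []
        self._klass = None       # class whose body is being visited, if any
        self._in_method = False  # True while inside a method of that class
        self._self_name = None   # name of the method's first parameter

    def visit_ClassDef(self, node):
        info = ClassInfo(node.name, [ast.unparse(base) for base in node.bases])
        self.classes.append(info)
        saved = (self._klass, self._in_method, self._self_name)
        self._klass, self._in_method, self._self_name = info, False, None
        self.generic_visit(node)
        self._klass, self._in_method, self._self_name = saved

    def visit_FunctionDef(self, node):
        saved = (self._klass, self._in_method, self._self_name)
        if self._klass is not None and not self._in_method:
            # A function directly inside a class body is treated as a method.
            params = [arg.arg for arg in node.args.args]
            self._klass.methods[node.name] = params
            self._in_method = True
            self._self_name = params[0] if params else None
        else:
            # Nested or module-level function: drop the class context.
            self._klass, self._in_method, self._self_name = None, False, None
        self.generic_visit(node)
        self._klass, self._in_method, self._self_name = saved

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Attribute(self, node):
        # "self.x = ..." appears as an Attribute node in Store context.
        if (
            self._self_name is not None
            and isinstance(node.ctx, ast.Store)
            and isinstance(node.value, ast.Name)
            and node.value.id == self._self_name
            and node.attr not in self._klass.attributes
        ):
            self._klass.attributes.append(node.attr)
        self.generic_visit(node)
```

Usage mirrors the original: create a `ClassExtractor`, call `extractor.visit(ast.parse(source))`, then read `extractor.classes`.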
From: Fred L. D. Jr. <fd...@ac...> - 2001-08-14 16:31:38
Jeremy Hylton writes:
> I'm on the hook for AST/compiler documentation, which I plan to work
> on this week. I will probably have more time for it later in the week
> than I will today or tomorrow.

So between standing over your shoulder to make sure this gets done, and
watching over Barry's shoulder to make sure the smtpd module gets
documented, I'm going to get a fair bit of exercise this week... well, I
guess some people would consider that a good thing. ;-)

  -Fred

--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation
From: Jeremy H. <je...@zo...> - 2001-08-14 22:42:56
A first and quite incomplete draft of docs for the compiler package has been
checked in Tools/compiler/doc. The HTML version is currently available from

    http://www.python.org/~jeremy/compiler/

Jeremy
From: Tony J I. (Tibs) <to...@ls...> - 2001-08-15 09:15:17
Jeremy Hylton wrote:
> A first and quite incomplete draft of docs for the compiler package
> has been checked in Tools/compiler/doc. The HTML version is currently
> available from
> http://www.python.org/~jeremy/compiler/

Great Stuff, but...

Is there any chance of a "flat" version of this? - I would like something I
could print out, and these hierarchical organisations can be a right pain.

Tibs

--
Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
From: Jeremy H. <je...@zo...> - 2001-08-15 12:05:08
>>>>> "TJI" == Tibs <Tony> writes:

  TJI> Jeremy Hylton wrote:
  >> A first and quite incomplete draft of docs for the compiler
  >> package has been checked in Tools/compiler/doc. The HTML version
  >> is currently available from
  >> http://www.python.org/~jeremy/compiler/

  TJI> Great Stuff, but...

  TJI> Is there any chance of a "flat" version of this? - I would like
  TJI> something I could print out, and these hierarchical
  TJI> organisations can be a right pain.

Fred's excellent documentation tools can be used to produce pdf, ps, html,
etc. from the tex source. I'd like to avoid producing many different
versions for intermediate drafts. Might you be able to grab the Python
source tree and try it out: run Doc/tools/mkhowto on
Tools/compiler/doc/compiler.tex.

I'll produce all the various formats when I have a solid draft.

Jeremy