Speaking only for the needs of coverage.py, all I need is a function that takes a Python module and returns the set of all line numbers that could possibly be reported to a settrace function.  The CPython compiler stores these line numbers in co_lnotab, which is per-code-object, so a module contains many of them.  The way to find all of them in a .pyc file is to navigate the co_consts structure.  But we don't keep any of that structure: the end result is just a set of line numbers for the module.  The co_consts structure is an implementation detail we have to deal with, but it's not important to us.
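
For concreteness, here is a rough sketch of that function on CPython. It is not coverage.py's actual code: the names possible_trace_lines and lines_for_source are made up for illustration, and it uses dis.findlinestarts to read the line table rather than decoding co_lnotab by hand.

```python
import dis
import types


def possible_trace_lines(code):
    """Return the set of line numbers recorded in `code` and in every
    code object nested inside it (hypothetical helper name)."""
    # dis.findlinestarts yields (bytecode_offset, lineno) pairs; newer
    # CPython versions may report None for some entries, so skip those.
    lines = {lineno for _, lineno in dis.findlinestarts(code)
             if lineno is not None}
    # Nested functions, classes, lambdas, etc. live in co_consts.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            lines |= possible_trace_lines(const)
    return lines


def lines_for_source(source, filename="<string>"):
    """Compile module source and collect its line numbers (hypothetical)."""
    return possible_trace_lines(compile(source, filename, "exec"))
```

The result is a flat set; the nesting of code objects is walked but not preserved, which is all the description above asks for.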

Again, speaking for coverage.py, I don't need the numbers to be in a .pyc-compatible co_lnotab structure, though that might make the code simpler for now.  I have plans to use more information from the .pyc file in future features, so eventually I'll have to know the difference between a .pyc and a $py.class anyway.  (If you're interested, the idea for tracing byte codes rather than lines is here: http://nedbatchelder.com/blog/200804/wicked_hack_python_bytecode_tracing.html.)


Jim Baker wrote:
We support co_consts for PyBytecode code objs. We may be able to extract them from Java bytecode - there's a regular layout to them - but it's more involved than just building up co_lnotab or some other line number structure by using visitLineNumber.

Creating the metadata as part of the compilation process is certainly an option for 2.5.1. Perhaps access to it could be done lazily so the only impact would be a small increase in the $py.class file size.

On Mon, May 11, 2009 at 10:33 PM, Moss Prescott <moss@theprescotts.com> wrote:

On May 11, 2009, at 9:37 PM, Philip Jenvey wrote:

The bytecode offsets in co_lnotab seem useless for $py.class files.
Correct me if I'm wrong but they probably aren't too useful for .pyc
either outside of the dis module.

If we don't care about them, having the compiler tally the line
numbers into this lnotab structure would be about as easy as analyzing
the class w/ ASM after the fact. I'd just increment the byte offset by
1 for every entry. Then we could support co_lnotab on the code object.

Current and future compilers should be able to handle this task, no?
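
A rough sketch of that tallying, assuming the classic co_lnotab layout of unsigned (addr_incr, line_incr) byte pairs and a dummy address increment of 1 per line. The names build_lnotab and decode_lnotab are hypothetical and not part of Jython or CPython.

```python
def build_lnotab(first_lineno, linenos):
    """Encode line numbers as (addr_incr, line_incr) byte pairs, giving
    every new line a dummy bytecode offset increment of 1."""
    table = bytearray()
    prev = first_lineno  # first_lineno is implicit at address 0
    for line in sorted(set(linenos)):
        if line <= prev:
            continue
        delta = line - prev
        addr_incr = 1
        # Each increment must fit in one unsigned byte, so a large line
        # gap is split; the address increment goes on the first pair.
        while delta > 255:
            table += bytes((addr_incr, 255))
            addr_incr = 0
            delta -= 255
        table += bytes((addr_incr, delta))
        prev = line
    return bytes(table)


def decode_lnotab(first_lineno, lnotab):
    """Recover the line numbers from a table built as above."""
    lines = []
    lineno = first_lineno
    last = None
    for addr_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if addr_incr and lineno != last:
            lines.append(lineno)
            last = lineno
        lineno += line_incr
    if lineno != last:
        lines.append(lineno)
    return lines
```

For example, decode_lnotab(1, build_lnotab(1, [1, 2, 5, 300])) gives back [1, 2, 5, 300]. The offsets are meaningless, which matches the point above that nothing outside dis would care.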

Some implementation details:


Philip Jenvey

This was the approach I took when I was looking at trace.py a few months ago, and it worked for simple scripts. The problem, at least for trace.py, was that it uses the co_consts attribute to find nested functions and some other odd places that code can hide. Here's the relevant snippet from trace.py:

import types

# find_lines_from_code is defined elsewhere in trace.py
def find_lines(code, strs):
    """Return lineno dict for all code objects reachable from code."""
    # get all of the lineno information from the code of this scope level
    linenos = find_lines_from_code(code, strs)

    # and check the constants for references to other code objects
    for c in code.co_consts:
        if isinstance(c, types.CodeType):
            # find another code object, so recurse into it
            linenos.update(find_lines(c, strs))
    return linenos

Implementing co_consts was looking like a much larger and uglier hack than co_lnotab, so I punted.

 - moss

Jim Baker


Ned Batchelder, http://nedbatchelder.com