On 12/03/2012 01:12 PM, Chris Barker - NOAA Federal wrote:
> This argues against making the Cython source code a part of the matplotlib codebase.
> huh? are you suggesting that we use Cython to generate the glue, then
> hand-maintain that glue? I think that is a really, really bad idea --
> generated code is ugly and hard to maintain, it is not designed to be
> human-readable, and we wouldn't get the advantages of bug-fixes and
> further development in Cython.
> So -- if you use Cython, you want to keep using it, and that means the
> Cython source IS the source. I agree that it's a good idea to ship the
> generated code as well, so that no one that is not touching the Cython
> has to generate. Other than the slight mess from generated files
> showing up in diffs, etc, this really works just fine.
I agree with this approach.
> Any reason MPL couldn't continue with EXACTLY the same approach now
> used with C_XX -- it generates code as well, yes?
No -- PyCXX is just C++. Its killer feature is that it provides a
fairly thin layer around the Python/C API that does implicit reference
counting through the use of C++ constructors and destructors. I
actually think it's a really elegant approach to the problem. The
downside we're running into is that it's barely maintained, so using
vanilla upstream as provided by packagers is not viable. An alternative
to all of this discussion is to fork PyCXX and release as needed. The
maintenance required is primarily when new versions of Python are
released, so it wouldn't necessarily be a huge undertaking. However, I
know some are reluctant to use a relatively unused tool.
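
To illustrate the constructor/destructor idea, here is a minimal sketch. It is not PyCXX's actual code: FakeObject, incref, decref and Ref are made-up stand-ins (for PyObject, Py_INCREF, Py_DECREF and a PyCXX-style handle), so that it compiles without Python.h.

```cpp
#include <utility>

// Hypothetical stand-in for a PyObject: just a reference count.
struct FakeObject {
    int refcount = 1;  // a new object starts with one owner
};

// Stand-ins for Py_INCREF / Py_DECREF (no deallocation, to keep the sketch small).
inline void incref(FakeObject* o) { if (o) ++o->refcount; }
inline void decref(FakeObject* o) { if (o) --o->refcount; }

// RAII handle: holds one reference for as long as it is alive.
class Ref {
public:
    explicit Ref(FakeObject* o) : obj_(o) { incref(obj_); }     // take a reference
    Ref(const Ref& other) : obj_(other.obj_) { incref(obj_); }  // a copy takes another
    Ref& operator=(Ref other) { std::swap(obj_, other.obj_); return *this; }
    ~Ref() { decref(obj_); }                                    // release on scope exit
    FakeObject* get() const { return obj_; }
private:
    FakeObject* obj_;
};

// Returns the refcount seen while two handles are alive.
int count_inside_scope(FakeObject& o) {
    Ref a(&o);
    Ref b = a;          // copy constructor: one more reference
    return o.refcount;  // read before a and b are destroyed
}
```

Once count_inside_scope() returns, both handles have been destroyed and the count is back where it started, without any explicit decref in the calling code.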
> Michael Droettboom wrote:
>> For the PNG extension specifically, it was creating callbacks that can
>> be called from C and the setjmp magic that libpng requires. I think
>> it's possible to do it, but I was surprised at how non-obvious those
>> pieces of Cython were. I was really hoping by creating this experiment
>> that a Cython expert would step up and show the way ;)
> Did you not get the support you expected from the cython list? Anyway,
> there's no reason you can't keep stuff in C that's easier in C (or did
> C_XX make this easy?).
The support has been adequate, but the solutions aren't always an
improvement over raw Python/C API (not just in terms of lines of code
but in terms of the number of layers of abstraction and "magic" between
the coder and what actually happens).
> I think making basic callbacks is actually
> pretty straightforward, but I don't know about the setjmp magic (I
> have no idea what that means!).
It turned out to be not terrible once I figured out the correct incantation.
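
For anyone unfamiliar with the setjmp pattern, here is a rough sketch of the control flow. It deliberately does not use libpng: Decoder, decode_chunk and decode are hypothetical names, standing in for a library that reports errors by jumping back to a saved point instead of returning an error code.

```cpp
#include <csetjmp>

// Hypothetical decoder state, standing in for the library's own struct.
struct Decoder {
    std::jmp_buf on_error;  // where the library jumps when it hits an error
};

// Stand-in for a library routine that signals failure via longjmp.
void decode_chunk(Decoder* d, bool corrupt) {
    if (corrupt) {
        std::longjmp(d->on_error, 1);  // does not return
    }
}

// Returns 0 on success, -1 if the "library" bailed out via longjmp.
int decode(bool corrupt) {
    Decoder d;
    if (setjmp(d.on_error)) {
        // Control arrives here a second time if decode_chunk() called longjmp.
        // Real code would release buffers here before reporting the error.
        return -1;
    }
    decode_chunk(&d, corrupt);
    return 0;
}
```

The awkward part for any wrapper layer is that the jump skips everything between the longjmp and the setjmp, so whatever cleanup the glue code normally does on the way out has to be done by hand in that error branch.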
>> The Agg backend has more C++-specific challenges, particularly
>> instantiating very complex template expressions --
> I'm guessing you'd do the complex template stuff in C++ -- and let
> Cython see a more traditional static API.
Agreed -- I'm really only considering replacing the glue code provided
by PyCXX, not the whole thing. matplotlib's C/C++ code has been around
for a while and has been fairly vetted at this point, so I don't think a
wholesale rewrite makes sense.
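
As a sketch of what "a more traditional static API" could look like (all names here are invented for illustration, and this is not code from matplotlib or Agg): the template stays entirely on the C++ side, and the glue layer only ever sees a handful of plain functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical templated internals, standing in for heavily templated C++ code.
template <typename Pixel, int Channels>
class Buffer {
public:
    Buffer(std::size_t width, std::size_t height)
        : data_(width * height * Channels, Pixel()) {}
    std::size_t size() const { return data_.size(); }
    Pixel* data() { return data_.data(); }
private:
    std::vector<Pixel> data_;
};

// The one instantiation the wrapper needs.
typedef Buffer<unsigned char, 4> RgbaBuffer;

// Flat, template-free functions that the glue code can declare and call.
RgbaBuffer* rgba_new(std::size_t w, std::size_t h) { return new RgbaBuffer(w, h); }
std::size_t rgba_size(const RgbaBuffer* b) { return b->size(); }
unsigned char* rgba_data(RgbaBuffer* b) { return b->data(); }
void rgba_free(RgbaBuffer* b) { delete b; }
```

The glue (whether Cython or hand-written Python/C API) then only needs to know about an opaque RgbaBuffer pointer and four functions, not the template expression behind it.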
>> but some of that complexity could be reduced by using Numpy arrays in place of the
>> image buffer types that each of them contain
> OR Cython arrays and/or memoryviews -- this is indeed a real strength of Cython.
Sure, but when we return to Python, they should be Numpy arrays which
have more methods etc. -- or am I missing something?
>> The Cython version isn't that much shorter than the C++ version.
> I think some things make sense to keep in C++, though I do see a fair
> bit of calls (in the C++) to the python API -- I'm surprised there
> isn't much code advantage, but anyway, the goal is more robust/easier
> to maintain, which may correlate with code-size, but not completely.
>> These declarations aren't exact matches to what one would find in the header file(s)
>> because Cython doesn't support exact-width data types etc.
> It does support the C99 fixed-width integer types:
> from libc.stdint cimport int16_t, int32_t,
> Or are you talking about something else?
The problem is that Cython can't actually read the C header, so there
are types in libpng, for example, that we don't actually know the size
of. They are different on different platforms. In C, you just include
the header. In Cython, I'd have to determine the size of the types in a
pre-compilation step, or manually determine their sizes and hard code
them for the platforms we care about.
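
A pre-compilation step along those lines might look roughly like the following sketch. The assumptions are loud ones: lib_size_t is a made-up typedef standing in for a type that would really come from the library header, and the idea of writing the output into a generated include file for the Cython build is hypothetical, not something that exists in matplotlib.

```cpp
#include <climits>
#include <string>
#include <type_traits>

// Hypothetical platform-dependent typedef; in practice this would come
// from #include-ing the library's header.
typedef unsigned long lib_size_t;

// Name of the fixed-width integer type that matches T on this platform.
template <typename T>
std::string fixed_width_name() {
    static_assert(std::is_integral<T>::value, "integer types only");
    std::string prefix = std::is_signed<T>::value ? "int" : "uint";
    return prefix + std::to_string(sizeof(T) * CHAR_BIT) + "_t";
}

// One line for a generated declarations file, e.g. "ctypedef uint64_t lib_size_t"
// on a platform where unsigned long is 64 bits wide.
template <typename T>
std::string ctypedef_line(const std::string& name) {
    return "ctypedef " + fixed_width_name<T>() + " " + name;
}
```

Calling ctypedef_line<lib_size_t>("lib_size_t") at build time would produce a line matching whatever the compiler on that platform thinks the type is, at the cost of one more moving part in the build.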
>> I'm not sure why some of the Python/C API calls I needed were not defined in Cython's include wrappers.
> I suspect that's an oversight -- for the most part, stuff has been
> added as it's needed.
> One other note -- from a quick glance at your Cython code, it looks
> like you did almost everything in Cython-that-will-compile-to-pure-C
> -- i.e. a lot of calls to the CPython API. But the whole point of
> Cython is that it makes those calls for you. So you can do type
> checking, and switching on types, and calling np.asarray(), etc, etc,
> etc, in Python, without calling the CPython api yourself. I know
> nothing of the PNG API, and am pretty weak on the CPython API (and C
> for that matter), but it's likely that the Cython code you've
> written could be much simplified.
It would at least make this a more fair comparison to have the Cython
code as Cythonic as possible. However, I couldn't find any ways around
using these particular APIs -- other than the Numpy stuff which probably
does have a more elegant solution in the form of Cython arrays and
>> Once things compiled, due to my own mistake, calling the function segfaulted. Debugging
>> that segfault in gdb required, again, wading through the generated code. Using gdb on
>> hand-written code is *much* nicer.
> for sure -- there is a plug-in/add-on/something for using gdb on
> Cython code -- I haven't used it but I imagine it would help.
Ah. I wasn't aware of that. Thanks for pointing that out. I have the
CPython plug-in for gdb and it's great.
> Ian Thomas wrote:
>> I have never used Cython, but to me the code looks like an inelegant combination of
>> Python,C/C++ and some Cython-specific stuff.
> well, yes, it is that!
>> I can see the advantage of this approach for small sections of code, but I have strong
>> reservations about using it for complicated modules that have extensive use of
>> templated code and/or Standard Template Library collections (mpl has examples of
>> both of these).
> So far, I've found that Cython is good for:
> - The simple stuff -- basic loops through numpy arrays, etc.
> - wrapping/calling more complex C or C++
> -- essentially handling the reference counting and python type
> packing/unpacking of python types.
> So we find we do write some shim code in C++ to make the access to the
> core libraries Cython-friendly. We haven't dealt with complex
> templating, etc, but I'd guess if we did I'd keep that in C++. And
> since the resulting actual glue code is pretty simple, it makes the
> debugging easier.
>> Maybe rather than asking "if we switched to using Cython, would more participate", I
>> should be asking "among those that can participate in removing the PyCXX
>> dependency, what is the preferred approach?"
> I don't know that we need a one-size-fits-all approach -- perhaps
> some bits make the most sense to move to plain old C/C++, and some to
> Cython, either because of the nature of the code itself, or because of
> the experience/preference of the person that takes ownership of a
> particular problem.
True. We do have two categories of stuff using PyCXX in matplotlib:
things that (primarily) wrap third-party C/C++ libraries, and things
that are actually doing algorithmic heavy lifting. It's quite possible
we don't want the same solution for all.