[MathDOM] Re: Packaging MathDOM, patching lxml
From: Stefan B. <beh...@gk...> - 2005-11-04 18:22:33
Jeff Kowalczyk wrote:
> I don't think mathdom should have split packages. The mathdom setup.py
> doesn't need options for that, just runtime import autodetection of
> whether lxml is available, or pyxml as the fallback.

The PyXML version is not a fallback. It's a different implementation
based on the XML DOM Level 2 API, while the lxml version is based on the
ElementTree API with XIST-like extensions. While I could implement a
certain level of compatibility methods (and I partially did already), I
will definitely not reimplement the entire APIs, so the general usage of
both APIs will stay different. So in a way it's two packages in one - you
have to choose.

> My question was whether mathdom doesn't run (with degraded
> functionality) on a stock lxml-0.8, in which case you'd need to test
> for your patched functions in addition to a successful 'import lxml'
> statement.

No, because it is based on an implementation style that I had to patch
into lxml (namespace implementation by element subclassing), so it will
definitely not work with any release version of lxml until that patch is
merged. It's pretty simple to test though:

    >>> from lxml.etree import ElementBase, SaxTreeBuilder

will fail on unpatched systems.

> 'pymathdom' is a reasonable name if you think 'mathdom' name will
> conflict. Gentoo would call the package 'dev-python/mathdom', so this
> isn't a problem. In general, I prefer the distro package name to be the
> same as the python package name.

Like 'mathml'? :) That would be the first to conflict. Actually, one of
the reasons for bundling PyMathML with MathDOM was such a conflict...

I still think 'mathml' is a good name for the installed package while
'mathdom' is a good name for the main module. Currently, to the best of
my knowledge, MathDOM is the only Python implementation of Content MathML
there is, so conflicts are unlikely. *cross-fingers*

> I think the direction to go (for Gentoo packaging anyway) is simply to
> have a modified ebuild that optionally builds lxml with your lxml
> patch. The mechanism is called a USE variable, the Gentoo user would
> add +mathdom to his lxml configuration, a rebuild would trigger, and
> your patch would apply.

Ok, fine.

> This is all fine for the bleeding edge Gentoo experimenter, but I think
> that going forward any lxml patches mathdom prefers will have to be
> accepted upstream lxml, or mathdom will need to implement rejected
> extensions itself in an intermediary C, pyrex, or python code.

I'm actually pretty sure the main patches that make MathDOM work will be
accepted. MathDOM is a perfect example why they are great. :)

Namespace implementation is exemplified in XIST and it's the best way of
doing custom data binding and XML driven APIs/GUIs/etc.

http://www.livinglogic.de/Python/xist/Howto.html

> The other option is to include a private copy of the full lxml source
> and apply your patch in place, then have setup.py build and install
> that in a private subdir. If the runtime test for a patched lxml in the
> default python path fails, have the import statement import the private
> copy.

With the modified setup.py, you can easily ship lxml with MathDOM. The
test for a patched version also has an option. But lxml is not yet
included in the build process. Once it *is* built, all you have to do is
copy the resulting lxml directory (containing the files etree.so,
_elementpath.py and __init__.py) into the mathml directory. Since the
only module that depends on lxml is mathml.lmathdom, Python will simply
look there first and find the correct version. The tests and examples
will *not* find the patched version, but they are not of interest to an
installation anyway.

> lxml is becoming a core product, even Zope has considered making it a
> dependency. I don't think it will be practical to apply post-release
> patching to lxml in anything but experimental setups or package-private
> copies.

Ok, sounds reasonable. So maybe it would help to integrate the build
processes of lxml and MathDOM. In that case I'd prefer shipping only the
generated C-code for lxml, i.e. lxml/etree.c, to strip the additional
Pyrex dependency (the latest official Pyrex version is still broken when
used with GCC4, so that is really the right thing to do). I'll have to
look into that.

Stefan
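
The import test for a patched lxml described in this message could be wrapped in a small helper roughly like the one below. This is only a sketch under assumptions: the helper names `module_provides` and `lxml_is_patched` are hypothetical and not part of MathDOM or lxml; the two names it checks, `ElementBase` and `SaxTreeBuilder`, are the ones quoted in the message.

```python
import importlib

# Names that, according to the message, are only importable from a patched lxml.
PATCHED_NAMES = ("ElementBase", "SaxTreeBuilder")


def module_provides(module_name, names):
    """Return True if module_name can be imported and defines every name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or its parent package) is not installed at all.
        return False
    return all(hasattr(module, name) for name in names)


def lxml_is_patched():
    """Hypothetical check: does lxml.etree expose the patched names?"""
    return module_provides("lxml.etree", PATCHED_NAMES)


if __name__ == "__main__":
    print("patched lxml available:", lxml_is_patched())
```

A setup script or package `__init__` could call `lxml_is_patched()` once and report a clear error instead of failing later inside `mathml.lmathdom`; whether MathDOM does it this way is not stated in the message.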