How do I navigate outside the current object?

Help
2011-09-23
2013-03-06
  • Peter Keller

    Peter Keller - 2011-09-23

    Dear all,

    I am extending class definitions created by pyxbgen with new instance methods, and I need to be able to get to data in the document that is not contained within the element that corresponds to the instance. I thought that this would be fairly simple to do, but after reading through lots of the PyXB documentation I still can't see how to do it. Can anyone suggest an approach that I could use?

    More specifically, imagine that I have added a method like this to a pyxbgen-generated subclass of  pyxb.binding.basis.complexTypeDefinition:

       def findSomething(self, ... other args ...):
          ... do something ...
          someIdref = ....
    

    What I would like to do, while still inside this method, is find the element in the document containing "self" that has an ID attribute with the value of "someIdref". I have thought about the following:

    • The ID attribute in the element that I would be looking for has been declared in the XML schema with a type of xs:ID. Maybe there is a quick way of exploiting this in PyXB to find it?

    • I know more-or-less where in the document to look for the element, but I would need to be able to navigate from "self" to the document's root element first. I haven't been able to find a way of doing this.

    • I could also use an XPath search, if there is a good way of invoking this against the document that contains "self". Even a partial XPath implementation such as that in ElementTree 1.3 would easily be good enough.

    I could maintain some static cache that holds a reference to the containing document for every element instance that I encounter, but I can't help thinking that there must be a better way than this.
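
    For reference, the kind of partial XPath lookup mentioned above can be sketched against a plain ElementTree document. This is only an illustration of the ElementTree path syntax, not something that works on PyXB binding instances; the document content, the attribute names "id" and "target", and the helper find_by_id are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are invented here.
XML = """
<root>
  <items>
    <item id="a1">first</item>
    <item id="a2">second</item>
  </items>
  <ref target="a2"/>
</root>
"""


def find_by_id(root, idref, attr="id"):
    """Return the first descendant of root whose attr equals idref, or None."""
    # ElementTree's path subset supports the [@attrib='value'] predicate.
    # The value is pasted into the path, so it must not contain a quote;
    # values of type xs:ID cannot.
    return root.find(".//*[@%s='%s']" % (attr, idref))


root = ET.fromstring(XML)
idref = root.find("ref").get("target")   # the IDREF value to resolve
target = find_by_id(root, idref)
print(target.tag, target.text)
```

    The search always starts from the root element, which is exactly the piece that is missing when all you hold is "self".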

    Regards,
    Peter.

     
  • Peter A. Bigot

    Peter A. Bigot - 2011-09-23

    Sounds like you basically want xml:id support.  Which I never implemented and apparently never created a trac ticket for, though perhaps ticket 59 on identity constraint support is somewhat relevant.  (Sorry, it's been a couple years since I heavily used XML and my understanding of how ids are supposed to work wasn't very strong in the first place.)

    I don't believe there's currently a way in PyXB to find either the containing instance object (when there is one) or the containing document.  It might make sense to add that, though they would introduce another naming conflict (what do you call the attribute/method).

    There is no XPath facility in PyXB, and the internal representation doesn't use a standard XML model like DOM for the Python objects so it would be complex to create one.

    One approach would be to create a mix-in class that maintains the mapping as a class variable, and add it as a parent to customized binding classes for each complex type that has the corresponding attribute.  Make sure it gets invoked in the constructor after the underlying class, so the attribute value is correct.

    Maintaining the mapping in the face of manipulations of the document as Python instance objects could get pretty nasty, so this might only be doable under the assumption that the instance tree was created by parsing an XML document.
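
    A minimal sketch of the mix-in idea, under stated assumptions: FakeBinding is a stand-in for a generated binding class (it is not PyXB code), its Factory only imitates the idea of a factory that calls a post-creation hook, and the attribute name "id" and the lookup method are hypothetical. With real bindings the mix-in would be listed before the generated class in the customized binding's base classes.

```python
class FakeBinding(object):
    """Stand-in for a pyxbgen-generated binding class (hypothetical)."""

    def __init__(self, **attrs):
        for name, value in attrs.items():
            setattr(self, name, value)

    @classmethod
    def Factory(cls, **attrs):
        # Imitates a factory that runs a hook once the instance is built.
        instance = cls(**attrs)
        instance._postFactory_vx(None)
        return instance

    def _postFactory_vx(self, state):
        return None


class IdRegistryMixin(object):
    """Records every created instance under the value of its ID attribute."""

    _by_id = {}            # class variable shared by all classes using the mix-in
    _id_attribute = "id"   # assumed attribute name

    def _postFactory_vx(self, state):
        # Let the underlying class finish first, so the attribute is populated.
        result = super(IdRegistryMixin, self)._postFactory_vx(state)
        key = getattr(self, self._id_attribute, None)
        if key is not None:
            IdRegistryMixin._by_id[key] = self
        return result

    @classmethod
    def lookup(cls, idref):
        """Return the instance registered under idref, or None."""
        return IdRegistryMixin._by_id.get(idref)


class Sample(IdRegistryMixin, FakeBinding):
    pass


class Container(IdRegistryMixin, FakeBinding):
    pass


sample = Sample.Factory(id="s1")
container = Container.Factory(id="c1")
print(Container.lookup("s1") is sample)
```

    The single shared dictionary holds strong references and does not distinguish between documents, so a real version would probably key the map per document and might prefer weakref.WeakValueDictionary.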

    In short, no, I don't think there is an existing solution.  It's an interesting problem, though, and if you'd care to take the time to write up a trac ticket, I can take a look at what it'd take to implement the necessary infrastructure.  I've recently been trying to decide how much PyXB is being used and hence how many of the known deficiencies I should try to correct.  Unfortunately, it seems either nobody uses it heavily, or those who do aren't encountering issues, since the feedback's been pretty light.

    Peter

     
  • Peter Keller

    Peter Keller - 2011-09-26

    Thanks for replying with your thoughts. I think that for my purposes I will maintain a mapping between the ID values and the containing document, and follow your suggestion to customise the binding classes so that these are set up at the time the document is read. In my context this will be good enough, because:

    • The ID values will not mutate once created.
    • I provide a custom method for adding new objects to a document anyway, so I can keep track of new IDs there.
    • My application doesn't need or support updating of ID/IDREFs on deletion of elements from the XML document.

    I can see your point about integrating an XPath engine. Identity constraints use a subset of XPath anyway, so doing something about ticket 59 might be almost as problematic. xml:id support is a smaller problem, and would be nice to have of course, but I can see that a general solution is complicated by the validation aspects.

    Out of what we have discussed, I think that the most generally useful thing to add to PyXB would be methods to get at the enclosing element and the containing document. The impact of naming conflicts can be reduced by delegation, so that the calling code looks something like this:

    containingObject = myObject.infoset.parent
    document = myObject.infoset.doc
    

    For a future version of PyXB, you might like to consider delegating everything that isn't generated from the contents of the XSD in this way, which will help with conflicts. This style could be switched on by an option to pyxbgen, to avoid breaking compatibility with existing applications.
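
    A minimal sketch of the delegation style described above, assuming nothing about PyXB internals: Node stands in for a generated binding class, and Infoset, append and _set_doc are hypothetical names. The point is that only one name ("infoset") can ever clash with a name derived from the schema.

```python
class Infoset(object):
    """Holds navigation data that is not generated from the schema."""

    def __init__(self, owner, parent=None, doc=None):
        self.owner = owner
        self.parent = parent   # enclosing object, or None at the top
        self.doc = doc         # top-level object of the containing tree


class Node(object):
    """Stand-in for a generated binding class (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.children = []
        # Until it is attached somewhere, a node is its own document.
        self.infoset = Infoset(self, parent=None, doc=self)

    def append(self, child):
        """Attach child and update its navigation data."""
        self.children.append(child)
        child.infoset.parent = self
        child._set_doc(self.infoset.doc)
        return child

    def _set_doc(self, doc):
        # Propagate the document reference down an attached subtree.
        self.infoset.doc = doc
        for grandchild in self.children:
            grandchild._set_doc(doc)


root = Node("root")
a = root.append(Node("a"))
b = a.append(Node("b"))

containingObject = b.infoset.parent
document = b.infoset.doc
print(containingObject.name, document.name)
```

    Keeping these references correct when objects are removed or moved is the hard part, and the sketch does not attempt it.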

    As for what I am doing: you might have realised from my hint in ticket 104 that I generate a Java API from XSDs using XMLBeans. What I am doing is trying to offer Python programmers an API that allows them to use the same documents. For both PyXB and XMLBeans I have to generate and integrate additional code to extend the generated APIs.

    I chose PyXB because what I am doing is basically a model-driven approach (in fact, my XSDs are generated from UML data models by a custom transformation of my own). The PyXB (and XMLBeans) approach of generating an API from the XSD fits well with this approach. The alternatives that I have looked at parse and interpret an XML document first, and only then validate against an XSD, so the coupling between data and metadata is rather loose. I believe that the tighter coupling that you get from pre-generating an API suits what I am doing better, especially as in my case the XSD is not the primary artefact that describes the data: the UML data model is.

    I hope that this is useful and/or of interest. Please don't be discouraged by the lack of feedback: the person who is likely to use the API that I am generating soonest told me that he thought that PyXB was a good choice.

    Regards,
    Peter.

     
  • Peter Keller

    Peter Keller - 2011-10-03

    Hi Peter,

    I am trying to implement something based on your earlier suggestion:

    One approach would be to create a mix-in class that maintains the mapping as a class variable, and add it as a parent to customized binding classes for each complex type that has the corresponding attribute. Make sure it gets invoked in the constructor after the underlying class, so the attribute value is correct.

    but I'm getting stuck and I wonder if you could give me a pointer. I haven't been able to figure out how to hook into the complexTypeDefinition instance creation process during the parse of an input document late enough so that the attribute I am after has actually been populated. __init__ and _postFactory_vx are too early of course. I could try to get at the attribute by hacking into a SAX handler directly, but that strikes me as a bit messy and I don't think that is what you meant.

    I'm also confused about what you mean by "the constructor" in this context, which doesn't help :-)

    If you could give me a quick hint about what the customized class needs, I'm sure that I can work the rest out myself.

    Regards,
    Peter.

     
  • Peter A. Bigot

    Peter A. Bigot - 2011-10-03

    What I had in mind as a workaround didn't involve any changes to pyxb, but rather use of customized binding classes.  In fact, you shouldn't need to worry about the constructor as long as everything gets created (ultimately) through the standard factory methods.

    The best example of this is in pyxb/bundles/wssplat/wsdl11.py.  tDefinitions is relatively undocumented but seems to do something like what you need, by building maps for different categories based on contained elements.  The _postFactory_vx method is the hook into the binding object generation process.

    examples/ndfd/latlon.py uses these classes to look up messages based on name from a map.

    You'd only need to lift the map management out to a mix-in class if you have multiple elements that can introduce values that you want added to the map.  None of the wsdl extensions includes a mix-in class, but that's simply a matter of putting multiple base classes in the parentheses in the class declaration.  There are various examples of it in the pyxb module sources; see, for example, class ElementUse in pyxb.binding.content.

    Hope that helps.

     
  • Peter A. Bigot

    Peter A. Bigot - 2011-10-03

    Re-reading your post, I don't think _postFactory_vx is too early, because it should have been handed everything: it serves as the constructor and initial validator.  At any rate, look at the wsdl example, and see whether that helps.

     
  • Peter Keller

    Peter Keller - 2011-10-04

    I was getting confused because I hadn't realised that I had let the CreateFromDocument method default to saxer…. Using

    pyxb._SetXMLStyle(pyxb.XMLStyle_minidom)
    

    has set me straight, and _postFactory_vx now sees the fully-populated instance as expected.

    I'm well on the way now, and sorry for the noise.

     
