I am having trouble when parsing documents whose schemas have xsd:any elements in a sequence. Here is an example:
<?xml version="1.0" encoding="utf-8"?>
<s:schema elementFormDefault="qualified" targetNamespace="testNs" xmlns:tns="testNs" xmlns:s="http://www.w3.org/2001/XMLSchema">
<s:any minOccurs="0" maxOccurs="unbounded" />
<s:element minOccurs="0" maxOccurs="1" name="Name" type="s:string" />
<s:element name="obj" type="tns:MyObjectType"/>
Notice that s:any occurs before the Name element.
Now, when I try to run the following code
xml = """<obj xmlns="testNs">
my_obj = ws.CreateFromDocument(xml)
the result is
I would expect that the parser recognizes the Name element and parses it correctly.
I don't know if this is a bug or not, but is it somehow possible to get the desired behaviour?
Correct your schema. From the XML Schema specification, section 3.10.2, XML Representation of Wildcard Schema Components:
Wildcards are subject to the same ambiguity constraints (Unique Particle
Attribution (§3.8.6)) as other content model particles: If an instance
element could match either an explicit particle and a wildcard, or one of
two wildcards, within the content model of a type, that model is in error.
The Name element matches the wildcard, so the wildcard can consume it regardless of whether it could also match a subsequent element.
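To make the point concrete, here is a toy sketch of greedy matching against a sequence. It is only an illustration of why the order matters, not a description of how PyXB works internally; the function `match_sequence` and the particle tuples are invented for this example.

```python
# Toy model of matching children against a sequence content model.
# Illustration only: this is NOT PyXB's algorithm, and every name here
# (match_sequence, the particle tuples) is hypothetical.

def match_sequence(particles, children):
    """Greedily assign each child to the first particle that accepts it.

    particles: list of (label, accepts, max_occurs) in schema order;
               max_occurs of None means unbounded.
    children:  list of element names in document order.
    Returns a list of (child, label) pairs.
    """
    assignments = []
    index = 0  # position of the current particle in the sequence
    used = 0   # number of children the current particle has consumed
    for child in children:
        while index < len(particles):
            label, accepts, max_occurs = particles[index]
            if accepts(child) and (max_occurs is None or used < max_occurs):
                assignments.append((child, label))
                used += 1
                break
            # Current particle cannot take this child; move to the next one.
            index += 1
            used = 0
        else:
            raise ValueError("no particle accepts %r" % child)
    return assignments


# Wildcard first, as in the schema from the question.
any_first = [
    ("wildcard", lambda name: True, None),
    ("Name", lambda name: name == "Name", 1),
]
# Wildcard last, the more common arrangement.
name_first = [
    ("Name", lambda name: name == "Name", 1),
    ("wildcard", lambda name: True, None),
]

result_any_first = match_sequence(any_first, ["Name"])
result_name_first = match_sequence(name_first, ["Name", "Other"])
print(result_any_first)
print(result_name_first)
```

With the wildcard first, the Name child is assigned to the wildcard and never reaches the Name particle. With the wildcard last, Name is matched by its own particle and only the remaining child falls through to the wildcard.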
You need to define the desired behavior more carefully. Wildcards commonly appear at the end of a sequence. Alternatively, elements can be excluded from the match by putting a namespace constraint on the wildcard.
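If it helps to find such spots in a schema, the following is a rough sketch using only the Python standard library. It flags any s:any in a sequence that has no namespace constraint and is followed by an explicit element. It is a heuristic, not a full Unique Particle Attribution check, and the schema text and the function name `risky_wildcards` are made up for the example. The third type uses namespace="##other", which restricts the wildcard to namespaces other than the target namespace.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema written out only for this example.
SCHEMA = """<s:schema xmlns:s="http://www.w3.org/2001/XMLSchema"
          xmlns:tns="testNs" targetNamespace="testNs"
          elementFormDefault="qualified">
  <s:complexType name="AnyFirst">
    <s:sequence>
      <s:any minOccurs="0" maxOccurs="unbounded"/>
      <s:element name="Name" type="s:string" minOccurs="0"/>
    </s:sequence>
  </s:complexType>
  <s:complexType name="AnyLast">
    <s:sequence>
      <s:element name="Name" type="s:string" minOccurs="0"/>
      <s:any minOccurs="0" maxOccurs="unbounded"/>
    </s:sequence>
  </s:complexType>
  <s:complexType name="AnyOther">
    <s:sequence>
      <s:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
      <s:element name="Name" type="s:string" minOccurs="0"/>
    </s:sequence>
  </s:complexType>
</s:schema>"""


def risky_wildcards(schema_text):
    """Return names of complex types with an unconstrained wildcard
    that is followed by an explicit element in the same sequence."""
    root = ET.fromstring(schema_text)
    flagged = []
    for ctype in root.iter(XS + "complexType"):
        for seq in ctype.iter(XS + "sequence"):
            particles = list(seq)
            for pos, particle in enumerate(particles):
                if particle.tag != XS + "any":
                    continue
                # A missing namespace attribute defaults to ##any.
                unconstrained = particle.get("namespace", "##any") == "##any"
                followed = any(
                    p.tag == XS + "element" for p in particles[pos + 1:]
                )
                if unconstrained and followed:
                    flagged.append(ctype.get("name"))
    return flagged


flagged = risky_wildcards(SCHEMA)
print(flagged)
```

Only the type with the leading, unconstrained wildcard is reported. Moving the wildcard to the end or constraining its namespace both avoid the flag in this sketch.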
Unfortunately, I am not in a position to change the original .xsd. If one were to work around this behaviour, what would be the correct approach?
Would something like the following be correct?
for element in list(my_object.wildcardElements()):
    if isinstance(element, pyxb.binding.datatypes.string):
        if element._element().name().localName() == 'Name':
            my_object.Name = element.title()
(I am iterating over wildcardElements() and removing the ones I "know" are "wrongly" consumed.)
Should I be removing the said elements from some other lists (orderedContent() for example)?
Thanks for answering.
I can't say whether your workaround would be "correct", since you're starting from a schema with erroneous content models. A fundamental goal of PyXB was to support validation, and working around errors in XML schema is out of scope.
If your proposal works for you, then you should use it. I wouldn't expect you to need to change anything else including the ordered content, but I haven't really thought about what else might be affected. You'll just have to try it and see.
One other thing you might also try is disabling validation when parsing; search the documentation and earlier questions here for how to do that. It's possible PyXB would stuff the value in the element instead of treating it as a wildcard in that case, but it's fairly likely there would be other side effects. Again, this is something where you're on your own. Sorry.
Removing elements from wildcardElements() and orderedContent() seems to work.
Thank you for answering.