The ordered symbol set is in the wrong order for the attached test. The schema below should produce an instance._symbolSet() of
["a","b","c"]. Instead it produces ["c","a","b"]. It can load the XML fragment below, but when it attempts to print it, it raises:
File "content.py", line 631, in sequencedChildren
raise pyxb.UnprocessedElementContentError(self.instance, cfg, symbols, symbol_set)
From what I can tell, the code (content.py:~570) first locates "c" in the DOM and then concludes that "a" and "b" are out of place.
--- Schema excerpt ---
<xs:element name="Test" type="Outer"/>
<xs:complexType name="Outer">
<xs:complexContent>
<xs:extension base="Inner">
<xs:sequence>
<xs:element name="c" type="xs:string" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="Inner"><xs:sequence>
<xs:element name="a" type="xs:string" minOccurs="0"/>
<xs:element name="b" type="xs:string" minOccurs="0"/></xs:sequence>
</xs:complexType>
--- XML fragment ---
<?xml version="1.0" encoding="UTF-8"?>
<Test xmlns="http://foo.org/test">
<a>A</a>
<b>B</b>
<c>C</c>
</Test>
Example of issue
Did a bit more peeking, and the original diagnosis was incorrect. It appears that the issue can be fixed by removing the "break" statement at line 615 in content.py:
An alternative might be to add a "continue" immediately after the del statement - not entirely clear when len(matches) ends up > 0, but either solution solves the issue on this side.
Capturing the exception and printing its details() produces:
While it may appear to be "fixed" by eliminating the break, that break is positioned correctly to prevent state explosion in non-deterministic automata like this.
In the past, element generation order was affected by the iteration order of Python set objects, which is determined by the address of the members (and is therefore effectively random). PyXB 1.2 normalized the order of elements to match the order in the schema (#181), so that content is generated consistently from "all" model groups and when the same instance is converted to DOM multiple times.
In this case, the schema order is the wrong choice because the extension comes before what was extended. Infrastructure will have to be added so that the order is influenced by extension hierarchy as well as schema position. (Note that the bug does not occur when Inner is moved before Outer in the schema, so the automaton itself is arguably correct, just not the manner in which its non-determinism is resolved.)
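To illustrate why position-based ordering goes wrong here, a minimal sketch follows. It is not PyXB code: the dictionaries and the order_by_position helper are hypothetical, and the line numbers are illustrative stand-ins for where each element is declared in the schema.

```python
# Hypothetical declaration positions (element name -> schema line number).
# Outer, the extension that adds "c", is declared before Inner, the base
# that declares "a" and "b", as in the schema excerpt in this ticket.
outer_first = {"c": 6, "a": 12, "b": 13}

# The same schema with Inner moved ahead of Outer (illustrative numbers).
inner_first = {"a": 2, "b": 3, "c": 10}


def order_by_position(positions):
    """Order element names purely by where they are declared."""
    return sorted(positions, key=positions.get)


print(order_by_position(outer_first))  # ['c', 'a', 'b']
print(order_by_position(inner_first))  # ['a', 'b', 'c']
```

With the extension declared first, ordering by position alone reproduces the ["c","a","b"] symbol set from the report; moving the base type first happens to give the expected order.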
Yes - further investigation into other regression test failures showed that the issue is in the generation of the state machine, not the interpretation.
We've encountered several situations, however, where we extend a set of optional elements imported from a schema entitled "Core.xsd" and follow it with a mandatory element in a schema called "CodeSystemVersion.xsd". The schema name causes the mandatory element to sort to the front of the list in the state machine, which in turn causes a transition to the next state without processing the optional elements. Would be happy to submit a sample of the issue, although it will be a tad more difficult to put into the unit test format you are using.
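A rough sketch of the effect described above, assuming the sort key is something like (schema name, line, column). The ElementUse tuple, the address_key function, the element names, and the positions are all hypothetical and only model the behaviour; they are not PyXB's actual classes.

```python
from typing import NamedTuple


class ElementUse(NamedTuple):
    """Hypothetical stand-in for an element use and its schema 'address'."""
    name: str
    schema: str
    line: int
    column: int
    min_occurs: int


def address_key(use):
    # Assumed sort key: schema file name first, then position in that file.
    return (use.schema, use.line, use.column)


uses = [
    ElementUse("optionalA", "Core.xsd", 10, 4, 0),
    ElementUse("optionalB", "Core.xsd", 11, 4, 0),
    ElementUse("mandatory", "CodeSystemVersion.xsd", 20, 4, 1),
]

# "CodeSystemVersion.xsd" compares less than "Core.xsd" ('d' < 'r'),
# so the mandatory element from the extending schema sorts first.
by_address = [u.name for u in sorted(uses, key=address_key)]
print(by_address)  # ['mandatory', 'optionalA', 'optionalB']
```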
For the time being, we can pass most of the unit tests by disabling the sort that was introduced in #181. Perhaps the first element of the sort key should be the min cardinality?
I don't think the sort should be based on anything other than the order in the content model; cardinality is not relevant for priority since it's checked against available symbols. When no extension types are used, using declaration order is right for sequence model groups (where it's required to satisfy the content model) and for choice and all model groups (where it eliminates non-determinism in generation per #181).
pyxb.binding.content.ElementUse should be in one-to-one correspondence with element declarations in the schema and hence with complex types, so it can be extended with an ordinal to be used when emitting transitions instead of inferring it from the line/column/schema "address" of the use.
I believe the fix will involve detecting when an extension type is used and assigning the ordinals for its element uses in declaration order but starting from the highest ordinal in the base type, so they are truly an extension of rather than interleaved with the base type elements. This should fix the original problem and, as I understand it, the situation you describe in the previous comment. I'll extend the test case to cover that situation (and thanks for providing it).
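A minimal sketch of the ordinal idea, under the assumption that each complex type knows its base type and its locally declared elements in declaration order. ComplexType and assign_ordinals are hypothetical names for illustration, not the PyXB implementation.

```python
class ComplexType:
    """Hypothetical complex type: local element names plus optional base."""

    def __init__(self, name, local_elements, base=None):
        self.name = name
        self.local_elements = list(local_elements)
        self.base = base


def assign_ordinals(ctd):
    """Map element name -> ordinal, numbering base type elements first.

    Elements added by an extension continue from the highest ordinal used
    by the base type, so they follow the base elements rather than being
    interleaved with them.
    """
    ordinals = dict(assign_ordinals(ctd.base)) if ctd.base is not None else {}
    next_ordinal = max(ordinals.values(), default=-1) + 1
    for name in ctd.local_elements:
        ordinals[name] = next_ordinal
        next_ordinal += 1
    return ordinals


# Outer extends Inner, mirroring the schema excerpt in this ticket.
inner = ComplexType("Inner", ["a", "b"])
outer = ComplexType("Outer", ["c"], base=inner)

ordinals = assign_ordinals(outer)
symbol_order = sorted(ordinals, key=ordinals.get)
print(symbol_order)  # ['a', 'b', 'c']
```

Because the ordinals depend only on the extension hierarchy and declaration order within each type, the result is the same whether Outer appears before or after Inner in the schema, and regardless of which schema file each type lives in.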
Schedule-wise I'm looking at dealing with this next week; hope that's OK. Context switches between PyXB and BSP430 are pretty heavy, so I try to avoid going back and forth too much.
Closed with these commits. Note that this fix may change the format of pickled schema components in archive files; regenerate all bundles and other archives to ensure the new format is used.