I am interested in PyXB's behavior matching 'something|empty string'.
I put together a simple example like:
<?xml version="1.0" encoding="utf-8"?> <xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="Address"> <xs:complexType> <xs:sequence> <xs:element name="City" type="xs:string" /> <xs:element name="Country" minOccurs="0"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:pattern value="[A-Z]{2}|" /> </xs:restriction> </xs:simpleType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
with the expectation of element Country to accept either empty string or 2 letter code (my expectations come from XML regex defintion from https://www.w3.org/TR/xmlschema-2/#regexs
Country
However, when I do:
!pyxbgen -u example_pattern.xsd -m example import example as e address = e.Address() address.City = "New-York" address.Country = "USA"
I'm getting 3-letter Country code accepted and the following xml file produced:
<Address> <City>New-York</City> <Country>USA</Country> </Address>
So, my question is: is PyXB's regex engine XML compliant?
A little bit of background:
"[A-Z]{2}|"
xmllint
lxml
Well, the answer to my problem is brackets:
<?xml version="1.0" encoding="utf-8"?> <xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="Address"> <xs:complexType> <xs:sequence> <xs:element name="City" type="xs:string" /> <xs:element name="Country" minOccurs="0"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:pattern value="([A-Z]{2}|)" /> </xs:restriction> </xs:simpleType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
After putting the pattern in brackets it works as expected.
PyXB attempts to translate XML regex syntax into Python syntax for execution. It's intended to be correct; if it isn't please open a ticket on github.
I will. As a temporal remedy, can you please point me to the file where this translation happens?
https://github.com/pabigot/pyxb/blob/next/pyxb/utils/xmlre.py
Log in to post a comment.
I am interested in PyXB's behavior matching 'something|empty string'.
I put together a simple example like:
with the expectation of element
Country
to accept either empty string or 2 letter code(my expectations come from XML regex defintion from https://www.w3.org/TR/xmlschema-2/#regexs
However, when I do:
I'm getting 3-letter Country code accepted and the following xml file produced:
So, my question is: is PyXB's regex engine XML compliant?
A little bit of background:
"[A-Z]{2}|"
from xmlSpy generated XSD's.xmllint
orlxml
(both throw an error that this is not a valid regex pattern)Last edit: Sergey Bushmanov 2016-12-12
Well, the answer to my problem is brackets:
After putting the pattern in brackets it works as expected.
Last edit: Sergey Bushmanov 2016-12-12
PyXB attempts to translate XML regex syntax into Python syntax for execution. It's intended to be correct; if it isn't please open a ticket on github.
I will. As a temporal remedy, can you please point me to the file where this translation happens?
https://github.com/pabigot/pyxb/blob/next/pyxb/utils/xmlre.py