
Is PyXB's regex engine XML regex compliant?

  • Sergey Bushmanov

    I am interested in PyXB's behavior matching 'something|empty string'.

    I put together a simple example like:

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema elementFormDefault="qualified" xmlns:xs="">
      <xs:element name="Address">
            <xs:element name="City" type="xs:string" />
            <xs:element name="Country" minOccurs="0">
                <xs:restriction base="xs:string">
                  <xs:pattern value="[A-Z]{2}|" />

    with the expectation that the Country element accepts either an empty string or a 2-letter code
    (my expectation comes from the XML regex definition).

    However, when I do:

    !pyxbgen -u example_pattern.xsd -m example
    import example as e
    address = e.Address()
    address.City = "New-York"
    address.Country = "USA"

    the 3-letter Country code is accepted and the following XML file is produced:


    So, my question is: is PyXB's regex engine XML compliant?

    A little bit of background:

    1. I got patterns like "[A-Z]{2}|" from XMLSpy-generated XSDs.
    2. The above-mentioned pattern does not validate through either xmllint or lxml (both throw an error saying that this is not a valid regex pattern).
    3. I've asked a related question on stackoverflow and got some replies there.

    Last edit: Sergey Bushmanov 2016-12-12
  • Sergey Bushmanov

    Well, the answer to my problem is brackets:

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema elementFormDefault="qualified" xmlns:xs="">
      <xs:element name="Address">
            <xs:element name="City" type="xs:string" />
            <xs:element name="Country" minOccurs="0">
                <xs:restriction base="xs:string">
                  <xs:pattern value="([A-Z]{2}|)" />

    After putting the pattern in brackets it works as expected.
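
    A likely explanation, sketched with Python's standard re module only. The helper naive_match below is hypothetical and is not PyXB code: it assumes the XML pattern is turned into a Python one by appending "$" and matching from the start of the string. Under that assumption the top-level "|" splits the whole expression, so the first alternative is never anchored at the end, and the brackets restore the anchoring:

```python
import re


def naive_match(xsd_pattern, value):
    """Hypothetical translation (an assumption, not PyXB's actual code):
    append "$" and match from the start of the string."""
    return re.match(xsd_pattern + "$", value) is not None


# Without brackets the Python pattern is "[A-Z]{2}|$", which reads as
# "two capitals at the start" OR "end of string at the start".
# "USA" satisfies the first alternative because "US" matches.
print(naive_match("[A-Z]{2}|", "USA"))    # 3-letter code slips through

# With brackets the pattern is "([A-Z]{2}|)$", so "$" applies to both
# alternatives and "USA" is rejected.
print(naive_match("([A-Z]{2}|)", "USA"))
print(naive_match("([A-Z]{2}|)", "US"))
print(naive_match("([A-Z]{2}|)", ""))
```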


    Last edit: Sergey Bushmanov 2016-12-12
  • Peter A. Bigot

    Peter A. Bigot - 2016-12-12

    PyXB attempts to translate XML regex syntax into Python syntax for execution. It's intended to be correct; if it isn't please open a ticket on github.
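
    For illustration only, here is a minimal sketch of what an anchoring-safe translation could look like. It is not taken from PyXB's source; the name xsd_pattern_matches is made up for this example. XML Schema patterns implicitly match the whole value, so the sketch wraps the pattern in a non-capturing group and uses re.fullmatch. It assumes the pattern uses only syntax that Python's re shares with XML regexes, and does not handle XML-specific escapes such as \i and \c.

```python
import re


def xsd_pattern_matches(xsd_pattern, value):
    """Sketch of whole-value matching for an XML Schema pattern.

    Assumption: the pattern contains only constructs that Python's re
    module interprets the same way as the XML regex dialect.
    """
    # The non-capturing group keeps a top-level "|" from escaping the
    # implicit anchoring; fullmatch requires the entire value to match.
    return re.fullmatch("(?:" + xsd_pattern + ")", value) is not None


for candidate in ["", "US", "USA", "us"]:
    print(repr(candidate), xsd_pattern_matches("[A-Z]{2}|", candidate))
```

    With this approach the unbracketed pattern "[A-Z]{2}|" accepts only the empty string and 2-letter uppercase codes.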

  • Sergey Bushmanov

    I will. As a temporary remedy, can you please point me to the file where this translation happens?

