Pattern restriction does not work correctly with Unicode.
Brought to you by:
pabigot
PyXB raises BadTypeValueError on some Unicode strings that should be valid according to the regular expression.
To reproduce, create the schema TestPatternRestriction.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:simpleType name="TestPatternRestriction">
    <xs:restriction base="xs:string">
      <xs:pattern value=".*"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="test" type="TestPatternRestriction" />
</xs:schema>
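As a sanity check that is independent of PyXB, the following minimal sketch shows what the ".*" pattern would be expected to accept when evaluated with Python's own re module. It assumes Python 3. The helper name xsd_pattern_accepts is hypothetical and not part of PyXB, and the sample strings are illustrative only, since the character used in the original report is not known. XML Schema patterns are implicitly anchored at both ends, so the sketch anchors the expression explicitly.

```python
import re


def xsd_pattern_accepts(pattern, value):
    """Approximate an xs:pattern check with Python's re module.

    Hypothetical helper, not PyXB code. It is only meaningful for patterns
    whose syntax means the same thing in XML Schema and in Python, such
    as ".*" applied to single-line strings.
    """
    # XSD patterns must match the whole value, so anchor both ends.
    anchored = re.compile(r"\A(?:" + pattern + r")\Z")
    return anchored.match(value) is not None


# Illustrative inputs (assumptions, not taken from the original report).
samples = ["plain ascii", "caf\u00e9", "\U0001F600"]
for text in samples:
    print(repr(text), xsd_pattern_accepts(".*", text))
```

On Python 3 each of these single-line samples should be accepted, which is the behaviour the report expects from the generated binding as well. Line-break handling differs slightly between the two regex dialects, so the sketch should not be used to reason about values containing carriage returns or newlines.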
Create binding:
pyxbgen -u 'TestPatternRestriction.xsd' -m 'TestPatternRestriction'
Create script:
Trac truncated the submission at the location of the Unicode character in the example script. Here is the script again, this time with the Unicode character represented as a \u escape sequence:
Error message (even though the ".*" regular expression in the restriction should match the Unicode character):
Patch to correct wide Unicode support detection
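For context, one common way to detect wide Unicode support in Python is to inspect sys.maxunicode, which is 0xFFFF on narrow (UTF-16 based) Python 2 builds and 0x10FFFF on wide builds and on Python 3.3 or later. The sketch below illustrates that general technique only; it is not the content of the attached patch, and the names NARROW_MAX, supports_wide_unicode and clamp_range are hypothetical.

```python
import sys

# Largest code point representable on a narrow (UTF-16 based) build.
NARROW_MAX = 0xFFFF


def supports_wide_unicode():
    """Return True if this interpreter can represent code points above U+FFFF."""
    return sys.maxunicode > NARROW_MAX


def clamp_range(first, last):
    """Clip a code point range to what this interpreter can represent.

    Returns None when the whole range lies above sys.maxunicode, so that a
    caller building regex character classes can skip it on narrow builds.
    """
    limit = sys.maxunicode
    if first > limit:
        return None
    return (first, min(last, limit))


print(supports_wide_unicode())
print(clamp_range(0x10000, 0x10FFFF))
```

A library that builds regular expression character classes from code point ranges needs this answer to be right: if the check misfires, ranges may be clipped or dropped incorrectly, and valid characters could end up rejected by a pattern that should accept them.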
Fixed in the following commit, available on branch "next".
commit 652a2cf38ac5b8d66303a25520e6c0a33578ba27
Author: Peter A. Bigot <pabigot@‌>
Date: Mon Sep 19 18:27:43 2011 -0500