This is perhaps a newbie question, but I would really appreciate it if somebody could point me in the right direction.
I'm unable to generate binding classes with PyXB when element names are non-ASCII.
The minimal reproducible example:
(look for the
<xs:element name="Дом" type="xs:string" />
line, where I use Cyrillic). The file is encoded in UTF-8. However, when I run:
pyxbgen -u example.xsd -m example
I get the error:
Traceback (most recent call last):
  File "/home/sergey/anaconda3/lib/python3.5/xml/sax/expatreader.py", line 210, in feed
    self._parser.Parse(data, isFinal)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 9, column 26
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/sergey/anaconda3/bin/pyxbgen", line 52, in <module>
    generator.resolveExternalSchema()
.......
What is the right way to handle element names in Unicode?
I'm using Python 3.5.
(I have posted a similar question at http://stackoverflow.com/questions/40428247/pyxb-generating-classes-with-unicode but unfortunately I have not received any response there.)
Thanks in advance!
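One way to narrow down an ExpatError like the one above is to feed the raw bytes of the schema to expat directly and see whether the file itself is well-formed under its declared encoding. The following is only a sketch: SCHEMA_TEXT is a hypothetical stand-in for example.xsd (only the Дом element comes from the post), and check_well_formed is a helper name made up for this illustration. It does not reproduce what pyxbgen does internally.

```python
import xml.parsers.expat

# Hypothetical stand-in for example.xsd; only the "Дом" element is from the post.
SCHEMA_TEXT = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">\n'
    '  <xs:element name="Дом" type="xs:string" />\n'
    '</xs:schema>\n'
)


def check_well_formed(data):
    """Return None if the bytes parse, else (line, column, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # Passing bytes lets expat honour the encoding named in the XML declaration.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        return exc.lineno, exc.offset, str(exc)
    return None


# Bytes that match the declared encoding.
print(check_well_formed(SCHEMA_TEXT.encode("utf-8")))
# Bytes saved as cp1251 while the declaration still says utf-8.
print(check_well_formed(SCHEMA_TEXT.encode("cp1251")))
```

If the first call reports no error, the Cyrillic name is acceptable to the XML parser and any remaining failure lies further along, in how identifiers are generated. The second call shows how a mismatch between the declared encoding and the bytes on disk surfaces as a "not well-formed" error.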
Update
When I change the encoding to cp1251, I get:
and I am able to generate a schema like:
by assigning to emptyString.
Is there a way to keep the original name Дом?
Last edit: Sergey Bushmanov 2016-11-05
As noted in my response on Stack Overflow (which I don't monitor regularly), utf8 is spelled utf-8, and at this time there is no fix to support non-ASCII identifiers. It's tracked as issue 67 and will probably be fixed for PyXB 1.2.6, but that's likely to be sometime next year.
Unidecode seems to work if I further simplify pyxbgen_jp:
Then, I am able to:
with output:
Last edit: Sergey Bushmanov 2016-11-05
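The general idea behind the Unidecode workaround (transliterate the schema name to ASCII, then make the result a legal Python identifier) can be sketched as below. This is not the contents of pyxbgen_jp and does not use any PyXB API. The tiny _CYRILLIC table and the names transliterate and to_ascii_identifier are hypothetical, chosen so the sketch runs with the standard library alone. With Unidecode installed, its unidecode function could be passed as translit instead.

```python
import keyword
import re

# Hypothetical, deliberately tiny transliteration table (illustration only).
# A real setup would more likely pass unidecode.unidecode as `translit`.
_CYRILLIC = {"Д": "D", "о": "o", "м": "m"}


def transliterate(text):
    """Map each character through the table, leaving unknown ones as-is."""
    return "".join(_CYRILLIC.get(ch, ch) for ch in text)


def to_ascii_identifier(name, translit=transliterate):
    """Turn an XML name into an ASCII-only Python identifier."""
    ascii_name = translit(name)
    # Anything still outside [0-9A-Za-z_] becomes an underscore.
    ascii_name = re.sub(r"[^0-9A-Za-z_]", "_", ascii_name)
    # Identifiers may not be empty or start with a digit.
    if not ascii_name or ascii_name[0].isdigit():
        ascii_name = "_" + ascii_name
    # Avoid clashing with Python keywords such as "class".
    if keyword.iskeyword(ascii_name):
        ascii_name += "_"
    return ascii_name


print(to_ascii_identifier("Дом"))  # prints "Dom"
```

Only the Python-side identifier is transliterated in this approach. The element name in the XML documents themselves would still be Дом.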
Thanks; I've updated the issue to note that approach should be considered.
Thank YOU man, your package really helps!