I have implemented an XML file writer which replaces occurrences of
control characters with character references, i.e. a backspace (U+0008)
becomes &#8; and a formfeed (U+000C) becomes &#12;.
When I parse this XML file I get an "Illegal XML character" fatal error.
I have figured out that these two chars are illegal characters according
to the XML spec. I don't understand the rationale behind this and would
be grateful if anyone could tell me.
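
For reference, the restriction can be sketched in a few lines of Python (not necessarily the language my writer is in; the helper names `is_xml10_char` and `parses` are made up for illustration). The ranges follow my reading of the Char production in XML 1.0, and the standard-library parser is used only to show that a character reference to a disallowed code point gets rejected just like the raw character:

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(document: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False
```

With these helpers, `is_xml10_char(0x8)` and `is_xml10_char(0xC)` are both false, and `parses("<a>&#8;</a>")` fails in the same way my own parser does, while `parses("<a>&#9;</a>")` succeeds because tab is an allowed character.
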
Anyway, the requirement for my custom DOM model and the implemented DOM
writer is that control characters should be made explicit in the XML
file by representing them using Unicode character references. The idea
was that we can then rely on the parser expanding them at parse-time.
Obviously, I could write my own character converter and use escapes
like \b or \f in my file. I would, however, like to conform to the
Unicode character reference notation. Any ideas how I could do this?
Could I implement some custom error handling that takes care of the
character conversion for these illegal characters? I know that if there
is a way to do this I will end up with XML that is not well-formed (so
XML editors will complain), but I am prepared to accept this.
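
To make the "own character converter" fallback concrete, here is a minimal sketch in Python, assuming a private backslash convention (`\uXXXX` for each disallowed character, `\\` for a literal backslash). The function names `escape_controls` and `unescape_controls` are hypothetical, and this is exactly the non-standard notation I would rather avoid; its one advantage is that the file stays well-formed:

```python
import re

# Any character outside the XML 1.0 Char ranges, or a literal backslash.
_NEEDS_ESCAPE = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]|\\"
)

# An escaped backslash, or \u followed by exactly four hex digits.
_ESCAPED = re.compile(r"\\(\\|u([0-9A-Fa-f]{4}))")


def escape_controls(text: str) -> str:
    """Replace characters that XML 1.0 disallows with \\uXXXX escapes."""
    def repl(match):
        ch = match.group(0)
        if ch == "\\":
            return "\\\\"              # double the backslash so it round-trips
        return "\\u%04X" % ord(ch)     # all disallowed code points fit in 4 digits
    return _NEEDS_ESCAPE.sub(repl, text)


def unescape_controls(text: str) -> str:
    """Reverse escape_controls after the document has been parsed."""
    def repl(match):
        if match.group(2) is None:     # matched an escaped backslash
            return "\\"
        return chr(int(match.group(2), 16))
    return _ESCAPED.sub(repl, text)
```

The writer would call `escape_controls` on every text node and attribute value before serializing, and the reader would call `unescape_controls` on the values the parser hands back, so the conversion happens outside the parser rather than at parse time.
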
Or is there a way to solve this by using entities? I guess not, since I
would just get the same parse error when the entity declaration is parsed?
Help would be much appreciated!