#42 more configurable handling of illegal XML/HTML characters

open
General (10)
5
2003-03-15
2003-03-15
Mike Brown
No

We have hard-coded the illegal character replacement
mechanism for the XML and HTML printers. Instead, we
should offer the ability for the user to configure the
following parameters:

- Check for illegal characters?
Currently: yes, always
This should be configurable so that it can be bypassed
for speed.

- Which characters are illegal for XML output?
Currently: those disallowed by XML 1.0
This should be configurable because I still live in fear of
XML 1.1.

- Which characters are illegal for HTML output?
Currently: those disallowed by XML 1.0, because the
replacement code in StreamWriter does not know whether
it is being used for XML or HTML output.
This should be configurable because strict conformance
would require treating \u0080-\u009F as illegal in HTML,
but users may not want this.

- Replace illegal characters with what?
Currently: U+003F ('?')
This should be configurable because other viable
candidates are the empty string or \uFFFD.
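
To make the four parameters concrete, here is a rough sketch of how they
might be grouped into one policy object. This is not the existing
StreamWriter code: the class name IllegalCharPolicy, its arguments, and
the two character-set constants are all hypothetical, and Python is only
assumed as the implementation language. The XML 1.0 set is derived from
the Char production in the XML 1.0 Recommendation.

```python
import re

# Hypothetical constants, written as regex character-class bodies.
# XML 1.0 forbids everything outside its Char production.
XML10_ILLEGAL = r"\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff"
# Extra range that strict HTML conformance would also treat as illegal.
HTML_STRICT_EXTRA = r"\x80-\x9f"


class IllegalCharPolicy:
    """Hypothetical holder for the configurable parameters in this ticket."""

    def __init__(self, check=True, illegal=XML10_ILLEGAL, replacement="?"):
        self.check = check              # check for illegal characters at all?
        self.replacement = replacement  # '?', '' or '\ufffd', for example
        self._pattern = re.compile("[" + illegal + "]") if check else None

    def clean(self, text):
        # Bypass the scan entirely when checking is turned off (for speed).
        if not self.check:
            return text
        # A function is used so the replacement is never parsed as a
        # regex template (no surprises with backslashes).
        return self._pattern.sub(lambda match: self.replacement, text)


xml_policy = IllegalCharPolicy()
html_policy = IllegalCharPolicy(
    illegal=XML10_ILLEGAL + HTML_STRICT_EXTRA, replacement="\ufffd"
)
fast_policy = IllegalCharPolicy(check=False)
```

With something like this, the XML and HTML printers could each be handed
their own policy, so the replacement code would no longer need to know
which kind of output it is producing.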
