From: Peter J. <pj...@wa...> - 2004-09-06 22:07:02
Hi Arno, All,

"Arno Brinkman" <fir...@ab...> wrote:
> Why not?
> It's well-formed, but in hex it's (and that's what i wanted to write in
> fact):
>
> <?xml version="1.0" encoding="US-ASCII"?>
> <database>
> <column name="RDB$TRIGGER_BLR">�L</column>
> </database>

No, whether hex or not, it is not well-formed.

This production disallows control characters:

  Character Range
  [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
               | [#x10000-#x10FFFF]
      /* any Unicode character, excluding the surrogate blocks,
         FFFE, and FFFF. */

And this paragraph says that text substituted for entities is handled
"as though it were part of the document at the location the reference
was recognized":

  4.4.2 Included
  [Definition: An entity is included when its replacement text is
  retrieved and processed, in place of the reference itself, as though
  it were part of the document at the location the reference was
  recognized.] The replacement text MAY contain both character data and
  (except for parameter entities) markup, which MUST be recognized in
  the usual way. (The string "AT&amp;T;" expands to "AT&T;" and the
  remaining ampersand is not recognized as an entity-reference
  delimiter.) A character reference is included when the indicated
  character is processed in place of the reference itself.

So a character reference to a control character is just as illegal as
the raw control character itself.

The question regularly comes up on the mailing lists, e.g.:
http://lists.xml.org/archives/xml-dev/199804/msg00502.html
http://lists.xml.org/archives/xml-dev/199804/msg00504.html

> You can check it with an on-line XML validator.

You can check an XML validator with your input. If it doesn't flag it
as not well-formed, it's a rather bad validator. Try xmllint from
libxml2.

All things considered, XML is for markup. If you want to encode data
structures, there is ASN.1 (only joking, nowadays it must be XML,
whether it fits or not).

Regards,
Peter Jacobi
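
P.S. For anyone who wants to try this without xmllint, here is a minimal sketch in Python. It assumes the standard library's expat-based parser as a stand-in for xmllint. The helper names and the two sample documents are made up for illustration; in particular, the reference &#x1; is only an assumed example of a control character, not the actual BLR content. is_xml_char() transcribes production [2] Char quoted above, and is_well_formed() simply asks the parser.

```python
import xml.etree.ElementTree as ET


def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches production [2] Char of XML 1.0."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_well_formed(doc: str) -> bool:
    """Return True if the standard library parser accepts doc."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the same element, once with a character reference
# to a control character (U+0001) and once with a reference to TAB (U+0009).
BAD = '<database><column name="RDB$TRIGGER_BLR">&#x1;L</column></database>'
GOOD = '<database><column name="RDB$TRIGGER_BLR">&#x9;L</column></database>'

if __name__ == "__main__":
    print("U+0001 allowed by Char:", is_xml_char(0x1))
    print("U+0009 allowed by Char:", is_xml_char(0x9))
    print("BAD well-formed:", is_well_formed(BAD))
    print("GOOD well-formed:", is_well_formed(GOOD))
```

If binary data really has to travel inside XML, the usual way out is to encode it as text first (Base64 or hex digits), since no escaping mechanism makes the forbidden code points legal in XML 1.0.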