Both my positive and negative tests suggest that Saxon either defaults
the output encoding to UTF-8 or leaves it to the parser default, which is
also UTF-8 in the case of Xerces.
In the negative test, I supplied XML data encoded in cp-1252 (the Windows
default), containing non-US-ASCII characters such as öüä, to a stylesheet
that did not specify any xsl:output encoding. I saw the exception
"java.io.UTFDataFormatException: Invalid byte 2 of 4-byte UTF-8 sequence."
from the parser (Xerces in this case).
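
For anyone who wants to reproduce the negative case without Saxon, here is a minimal sketch. It assumes the JAXP parser bundled with the JDK (a Xerces derivative) rather than my exact setup, and the class and method names are made up for illustration. It feeds windows-1252 bytes to the parser once without and once with an encoding declaration. The exact exception class and message may differ between parser versions, so the sketch only checks whether parsing fails.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of Saxon or Xerces.
public class UndeclaredEncodingDemo {

    /** Encodes the XML text as windows-1252, parses the raw bytes, and returns the root element's text. */
    static String parseCp1252Bytes(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        byte[] bytes = xml.getBytes(Charset.forName("windows-1252"));
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    /** Returns true if the parser rejects the windows-1252 bytes of the given XML text. */
    static boolean failsToParse(String xml) throws ParserConfigurationException {
        try {
            parseCp1252Bytes(xml);
            return false;
        } catch (SAXException | IOException e) {
            // Depending on the parser version, the bad byte sequence surfaces as either type.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // \u00f6\u00fc\u00e4 are the characters o-umlaut, u-umlaut, a-umlaut.
        String body = "<doc>\u00f6\u00fc\u00e4</doc>";
        String declared = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>" + body;

        // Without a declaration the parser assumes UTF-8 and trips over the bytes.
        System.out.println("undeclared input rejected: " + failsToParse(body));
        // With the declaration the same bytes are decoded correctly.
        System.out.println("declared input rejected:   " + failsToParse(declared));
    }
}
```

If this behaves as I expect, the failure comes from the input carrying no encoding declaration, so the parser falls back to UTF-8. It does not depend on what the stylesheet says about output.
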
In the successful case, where the input contains valid UTF-8 characters
and the stylesheet does not specify any output encoding, the resulting
XML contains encoding="UTF-8" in the XML declaration.
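
The positive case can be checked with a few lines of JAXP. This is only a sketch under assumptions: TransformerFactory.newInstance() returns whichever XSLT processor is configured on the classpath (the JDK's built-in one unless Saxon is installed), and the class name, method name and identity stylesheet are invented for the example. The stylesheet has no xsl:output element, so whatever encoding appears in the output's XML declaration is the processor's default.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of Saxon.
public class DefaultOutputEncodingDemo {

    // Identity stylesheet that deliberately contains no xsl:output element.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the identity transform and returns the serialized result as raw bytes. */
    static byte[] transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Writing to a byte stream lets the processor choose the output encoding itself.
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] result = transform("<doc>\u00f6\u00fc\u00e4</doc>");
        // Decoding as UTF-8 is an assumption here; the XML declaration in the output confirms or refutes it.
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```

The XSLT 1.0 recommendation says that when no encoding is specified the processor should use UTF-8 or UTF-16, which fits what I observed.
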
I think UTF-8 is the
default and it doesn’t depend on the platform. I’m not 100% confident of that
though, I would need to check the code or run some tests – do you have any
evidence to the contrary?
[mailto:email@example.com] On Behalf Of Sonali J. Kanaujia
Sent: 26 February 2004 19:21
Subject: [saxon] Default output encoding
Is UTF-8 the default encoding used by Saxon when none is specified on
<xsl:output>?
Does the platform default encoding have any role to play here?