Thanks for pointing this out.
It's not that easy to fix, because Saxon is using the same mechanism for
character maps and disable-output-escaping, and the rule for d-o-e is
different: "If output escaping is disabled for a character that is not
representable in the encoding that the processor is using for output, the
request to disable output escaping is ignored in respect of that
character."
In fact, Saxon isn't getting it right for d-o-e either.
This seems to mean that as characters are sent down the serialization
pipeline, I need to distinguish three kinds of character: ordinary,
generated-by-d-o-e, and generated-by-character-mapping; whereas I
distinguish only two. Ugly.
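
To make the three-way distinction concrete, here is a rough sketch in Java. It is only an illustration under my own assumptions, not Saxon's actual serializer code: the `Origin` enum and the `emit` and `escape` methods are made-up names, and it handles single `char` values only, ignoring surrogate pairs.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names exist in Saxon.
final class CharOriginSketch {

    // Where a character in the serialization pipeline came from.
    enum Origin { ORDINARY, DOE, CHARACTER_MAP }

    // Decide what to write for one character, given its origin and the
    // output encoding.
    static String emit(char c, Origin origin, CharsetEncoder enc) {
        boolean representable = enc.canEncode(c);
        switch (origin) {
            case CHARACTER_MAP:
                // Character mapping: an unencodable character is an error.
                if (!representable) {
                    throw new IllegalStateException(
                        String.format("SERE0008: U+%04X not representable", (int) c));
                }
                return String.valueOf(c);
            case DOE:
                // d-o-e: the request is ignored for an unencodable character,
                // so that character is treated as an ordinary one.
                return representable ? String.valueOf(c) : escape(c, false);
            default:
                return escape(c, representable);
        }
    }

    // Ordinary XML escaping, with a character reference as the fallback.
    static String escape(char c, boolean representable) {
        switch (c) {
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            default:
                return representable ? String.valueOf(c) : "&#" + (int) c + ";";
        }
    }

    public static void main(String[] args) {
        CharsetEncoder ascii = StandardCharsets.US_ASCII.newEncoder();
        System.out.println(emit('\u00e9', Origin.ORDINARY, ascii)); // &#233;
        System.out.println(emit('<', Origin.DOE, ascii));           // <
        System.out.println(emit('\u00e9', Origin.DOE, ascii));      // &#233;
        try {
            emit('\u00e9', Origin.CHARACTER_MAP, ascii);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the same unencodable character needs three different outcomes depending on its origin, which is why two categories are not enough.
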
Unfortunately the conformance rules for serialization don't allow me to
provide an option to select more lenient error handling. I campaigned
for this and lost (the W3C I18N police are very draconian about this sort of
thing, and they have a lot of influence). It's not conformant to add an
optional serialization attribute that overrides the rules in the spec (in
fact, saxon:character-representation already breaks these rules, but I
kept it for backwards compatibility). Technically, the only way I can do
this is to add a different output method, e.g. method="saxon:xml".
Alternatively (I've been thinking about this) I think the conformance rules
probably allow me to produce the output file with fallback for
invalid characters, provided I also output a message saying "Officially,
this transformation has failed".
> -----Original Message-----
> From: saxon-help-bounces@...
> [mailto:saxon-help-bounces@...] On Behalf Of Abel Braaksma
> Sent: 29 January 2007 13:27
> To: Mailing list for SAXON XSLT queries
> Subject: [saxon] Bug: Non-comformance: SERE0008 expected, but
> not raised, with character maps
> Hi Michael & others,
> the spec states the following: "A serialization error
> [err:SERE0008] occurs if character mapping causes the output
> of a string containing a character that cannot be represented
> in the encoding that the serializer is using for output. The
> serializer MUST signal the error."
> However, Saxon does not signal this error for versions 8.8J
> and 188.8.131.52J (and setting -w0, -w1 or -w2 does not change
> this). The following stylesheet illustrates this behavior:
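> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:output use-character-maps="mymap" encoding="US-ASCII"/>
> <xsl:character-map name="mymap">
> <xsl:output-character character="e" string="é"/>
> </xsl:character-map>
> <xsl:template match="/">
> <elem val="Resume"/>
> </xsl:template>
> </xsl:stylesheet>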
> This outputs, with no errors, the following:
> <?xml version="1.0" encoding="US-ASCII"?><elem val="R?sum?"/>
> However, an error is expected.
> With the right encoding (many will do, here I use UTF-8), it
> will output:
> <?xml version="1.0" encoding="UTF-8"?><elem val="Résumé"/>
> If you decide to fix this behavior, is it possible to make it
> into a Saxon-specific xsl:output serialization parameter?
> The reason is that my users actually expect the question
> marks whenever an encoding is wrong, and showing them errors
> instead will make things harder for them, I believe. And
> perhaps there are other use cases, too.
> -- Abel Braaksma