Thanks for supplying the repro (off list).
 
There are several points (it's a sorry tale!)
 
Firstly, according to the specifications, lang="zh_CN" is incorrect. It should be lang="zh-CN" (hyphen not underscore). XSLT defines this by reference to the XML specification, which in turn refers to RFC 3066.
 
Secondly, Saxon 6.5.x does not automatically pick up the collation from the Java Locale. Instead it uses the rather complex mechanism described at
 
http://saxon.sourceforge.net/saxon6.5.5/extensibility.html#Implementing%20a%20collating%20sequence
 
(I don't know why it was done this way - perhaps it was the only way that would work on earlier Java releases)
 
Finally, if you try running this on Saxon 9.0, you hit a bug: a language identifier containing a hyphen is parsed incorrectly. You can get around this by specifying lang="zh" (which gives the output you were expecting).
 
So you will have to decide whether to move forward to Saxon 9.0, and if you do, then until I fix the bug you will need to specify lang="zh".
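
For the stylesheet you posted, that workaround amounts to changing only the lang attribute on xsl:sort. A sketch based on your sort.xsl, with everything else left as it was (I am assuming the rest of the stylesheet is unchanged):

```xml
<?xml version="1.0" encoding="GB2312"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:template match="/">
<root>
<xsl:for-each select="root/book">
<!-- Saxon 9.0 workaround: "zh" instead of "zh-CN", because the hyphenated form is parsed incorrectly -->
<xsl:sort select="title" lang="zh"/>
<p>
<xsl:value-of select="title"/>
</p>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>
```

Once the bug is fixed, lang="zh-CN" should be usable again.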
 
Michael Kay
http://www.saxonica.com/


From: saxon-help-bounces@lists.sourceforge.net [mailto:saxon-help-bounces@lists.sourceforge.net] On Behalf Of Lizl
Sent: 18 March 2008 02:33
To: saxon-help@lists.sourceforge.net
Subject: [saxon] seek help for sort chinese characters

Hello:

  I have a question about Saxon 6.5.3 support for Chinese characters in the <xsl:sort> element.
 
We are using Saxon 6.5.3 and XSLT 1.0 to create a PDF document in Simplified Chinese. This document has an index and glossary that must be sorted.

But when we sort Chinese characters using the <xsl:sort> element, the result is not correct. The XML and XSL files are as follows:


<!-- country.xml -->
<?xml version="1.0" encoding="gb2312"?>
<root>
<book>
<title>中国(Zhong guo)(4)</title>
</book>
<book>
<title>美国(Mei guo)(2)</title>
</book>
<book>
<title>日本(Ri ben)(3)</title>
</book>
<book>
<title>俄罗斯(e luo si)(1)</title>
</book>
</root>

<!-- sort.xsl -->
<?xml version="1.0" encoding="GB2312"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:template match="/">
<root>
<xsl:for-each select="root/book">
<xsl:sort select="title" lang="zh-CN"/>
<p>
<xsl:value-of select="title"/>
</p>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>

The result should be (using the MSXML engine):
<?xml version="1.0" encoding="UTF-16"?>
<root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<p>俄罗斯(e luo si)(1)</p>
<p>美国(Mei guo)(2)</p>
<p>日本(Ri ben)(3)</p>
<p>中国(Zhong guo)(4)</p>
</root>

But when I use Saxon 6.5.3, the result is as follows (this is the wrong result):
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<p>中国(Zhong guo)(4)</p>
<p>俄罗斯(e luo si)(1)</p>
<p>日本(Ri ben)(3)</p>
<p>美国(Mei guo)(2)</p>
</root>

