StringIndexOutOfBounds in normalize-unicode()
The Saxon XSLT and XQuery processor, developed by Saxonica
Brought to you by:
mhkay
There is a problem in normalize-unicode(), which is also likely to occur if normalization-form="NFD" is selected in the serializer.
When normalizing to decomposed normal form (NFD), if the input contains a combining character (such as U+0304) immediately after a non-BMP character (such as U+1D4AE), a StringIndexOutOfBoundsException is thrown.
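The triggering input can be built as in the following minimal sketch. It uses the JDK's own java.text.Normalizer for comparison only, not Saxon's net.sf.saxon.codenorm.Normalizer. The class and method names (NonBmpNfdSketch, buildInput, describe) are made up for illustration. The point is that the non-BMP character occupies two UTF-16 code units, so an index that advances one char at a time can go out of step with code point positions.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of Saxon.
final class NonBmpNfdSketch {

    // A non-BMP character (U+1D4AE) immediately followed by a combining character (U+0304).
    static String buildInput() {
        return new String(Character.toChars(0x1D4AE)) + new String(Character.toChars(0x0304));
    }

    // Lists the code points of a string, stepping by code point rather than by char.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // 2 for a non-BMP code point, 1 otherwise
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = buildInput();
        // Three UTF-16 code units, but only two code points.
        System.out.println(input.length());
        System.out.println(input.codePointCount(0, input.length()));
        // Reference result from the JDK normalizer.
        String nfd = Normalizer.normalize(input, Normalizer.Form.NFD);
        System.out.println(describe(nfd));
    }
}
```

In XPath terms the same input would be passed as normalize-unicode($s, 'NFD').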
Source patched in Subversion, module net.sf.saxon.codenorm.Normalizer
Logged In: YES
user_id=251681
Originator: YES
Because this fix is a priority for an important customer, a patch has also been created for Saxon 8.8 in Subversion (module net.sf.saxon.codenorm.Normalizer).
There is a further problem that occurs under the same input conditions. This time no exception occurs, but the resulting string is not well-formed UTF-16: it contains incorrect code units in the surrogate pair range. A further patch will be placed in Subversion on both the 8.9 and 8.8 branches.
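One way to detect this kind of malformed output is to check that every surrogate in the result is properly paired. The following is a minimal sketch of such a check. The class and method names (Utf16Check, isWellFormedUtf16) are made up for illustration and are not part of Saxon, and the malformed sample string is an invented example rather than the actual output of the bug.

```java
// Hypothetical helper class, not part of Saxon.
final class Utf16Check {

    // Returns true if every high surrogate is immediately followed by a low
    // surrogate and no low surrogate appears on its own.
    static boolean isWellFormedUtf16(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false; // high surrogate without its partner
                }
                i++; // skip the low surrogate just validated
            } else if (Character.isLowSurrogate(c)) {
                return false; // low surrogate with no preceding high surrogate
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String good = new String(Character.toChars(0x1D4AE)) + "\u0304";
        // Invented malformed sample: a high surrogate followed directly by U+0304.
        String bad = "\uD835" + "\u0304";
        System.out.println(isWellFormedUtf16(good));
        System.out.println(isWellFormedUtf16(bad));
    }
}
```

A check like this could be run over the result of normalize-unicode() in a regression test for the input described above.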
Fixed in Saxon 9.0.0.1.