The internal method getStringValue(), when applied to a compressed whitespace text node, sometimes returns the wrong value. Specifically, if the internal value is a long whose least significant 32-bit word is negative, the first (most significant) word is taken as xffffffff instead of its true value, and xffffffff decompresses to a string of 252 space (x20) characters. This condition arises when the whitespace being compressed contains five or more groups of whitespace characters, where a group is a run of 1 to 63 occurrences of the same whitespace character, and the fifth group consists of either space (x20) or carriage return (x0D) characters. In the example under test, a string of 254 carriage returns was wrongly delivered as a string of 252 spaces followed by two carriage returns.
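The failure mode can be reproduced outside Saxon with a small Java sketch. This is not the WhitespaceTextImpl source. It assumes a layout consistent with the description above: one byte per group, packed from the most significant byte down, with the top two bits selecting the character and the low six bits holding the count. The code assignments for tab and newline, the class name, and the joinBuggy/joinFixed helpers are hypothetical and exist only to show how sign extension of a negative low word can overwrite the high word.

```java
final class CompressedWhitespaceDemo {
    // Assumed code table: the top two bits of each byte index into this string.
    // Space = 3 and CR = 2 are consistent with the report; tab and LF are guesses.
    private static final String WHITE_CHARS = "\t\n\r ";

    /** Packs up to eight (character, count) groups into a long, first group in the top byte. */
    static long compress(String s) {
        long value = 0;
        int groups = 0;
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            int code = WHITE_CHARS.indexOf(c);
            if (code < 0) throw new IllegalArgumentException("not whitespace");
            if (groups == 8) throw new IllegalArgumentException("too many groups");
            int count = 0;
            while (i < s.length() && s.charAt(i) == c && count < 63) { i++; count++; }
            value |= (long) ((code << 6) | count) << (56 - 8 * groups);
            groups++;
        }
        return value;
    }

    /** Expands the packed groups back into a string; a zero byte ends the sequence. */
    static String decompress(long value) {
        StringBuilder sb = new StringBuilder();
        for (int shift = 56; shift >= 0; shift -= 8) {
            int b = (int) ((value >>> shift) & 0xff);
            if (b == 0) break;
            char c = WHITE_CHARS.charAt(b >>> 6);
            for (int j = 0; j < (b & 0x3f); j++) sb.append(c);
        }
        return sb.toString();
    }

    /** Hypothetical faulty recombination: a negative lo is sign-extended over hi. */
    static long joinBuggy(int hi, int lo) {
        return ((long) hi << 32) | lo;
    }

    /** Masking the low word prevents the sign extension. */
    static long joinFixed(int hi, int lo) {
        return ((long) hi << 32) | (lo & 0xffffffffL);
    }

    public static void main(String[] args) {
        String input = "\r".repeat(254);           // five groups: 63, 63, 63, 63, 2
        long packed = compress(input);
        int hi = (int) (packed >>> 32);
        int lo = (int) packed;                     // top bit set, so negative as an int
        String wrong = decompress(joinBuggy(hi, lo));
        String right = decompress(joinFixed(hi, lo));
        System.out.println("buggy result equals input: " + wrong.equals(input));
        System.out.println("fixed result equals input: " + right.equals(input));
    }
}
```

Under these assumptions the fifth group lands in the top byte of the low word, so a space or carriage return there (codes with the high bit set) makes that word negative, which matches the condition described in the report.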
The problem affects the internal static getStringValue() method, but not its counterpart getStringValueCS(). The broken method is used when atomizing the text node itself or when getting the string value of its parent, but not when serializing the node. The bug therefore affects only applications that depend on the exact content of the whitespace text node.
A source fix (module net.sf.saxon.tinytree.WhitespaceTextImpl) is being placed in Subversion.