On 11/02/2011 00:57, CRB wrote:
The value of the group-by key here is a boolean, so you end up with
two groups. You'll get lines 2000, 4000, 6000 etc in one group, and
all the other lines in the other group.
Possibly an exercise in the ridiculous - but I am stumped by what I thought would be rather simple: using XSLT to partition a large log file (20 MB) into multiple smaller files (4 MB).
Here is what I have:
<xsl:variable name="input" select="unparsed-text('ServerLog.csv')"/>
<xsl:for-each-group select="tokenize($input, '\n')" group-by="position() mod 2000 = 0">
<xsl:for-each-group select="tokenize($input, '\n')"
group-adjacent="(position() - 1) idiv 2000">
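For illustration, here is a minimal sketch of a complete XSLT 2.0 stylesheet built around that grouping. It uses integer division (idiv) for the key, so each run of 2000 consecutive lines shares one key (0, 1, 2, ...); a remainder (mod) would change the key on every line. The named template "main", the output file name pattern, and the '\r?\n' line separator are my assumptions, not part of the original question.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the transformation is started at this named template
       (no source document is needed) -->
  <xsl:template name="main">
    <xsl:variable name="input" select="unparsed-text('ServerLog.csv')"/>

    <!-- '\r?\n' is an assumption so that Windows line endings are handled too -->
    <xsl:for-each-group select="tokenize($input, '\r?\n')"
                        group-adjacent="(position() - 1) idiv 2000">

      <!-- current-grouping-key() is 0 for lines 1-2000, 1 for 2001-4000, ...
           The file name pattern below is hypothetical -->
      <xsl:result-document href="ServerLog-part{current-grouping-key() + 1}.csv"
                           method="text">
        <!-- Write the lines of this group back out, one per line -->
        <xsl:value-of select="current-group()" separator="&#10;"/>
      </xsl:result-document>

    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this could be run by naming the initial template on the command line (for example with the -it:main option). To split by size rather than by line count, you would need to pick a line count that gives roughly 4 MB for your data.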
Note that group-adjacent is always likely to be more efficient than
Multiple outputs aside for the moment, I find myself challenged just to get the grouping of a sequence. The above runs but does not partition into groups of 2000.
Alternatively, I had been thinking the group-by would be something like:
group-by=". | following-sibling::node()[position()
The items in a sequence are not (in general) siblings of each other.
To be siblings, two items need to have a common parent in an XML
tree. tokenize() produces strings, which don't have a parent because
they are not nodes.
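Because the tokens are plain strings, any "next N items" logic has to work on positions in the sequence rather than on axes such as following-sibling. As a rough sketch of that idea (not a replacement for the grouping approach), subsequence() can pick out each block of 2000 by position. The variable names and the named template "main" are assumptions made for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template name="main">
    <!-- A sequence of xs:string items; none of them has a parent or siblings -->
    <xsl:variable name="lines"
                  select="tokenize(unparsed-text('ServerLog.csv'), '\r?\n')"/>

    <!-- One iteration per block of 2000 lines: block numbers 0, 1, 2, ... -->
    <xsl:for-each select="0 to (count($lines) - 1) idiv 2000">
      <!-- Select the block by position: items (n*2000 + 1) to (n*2000 + 2000) -->
      <xsl:variable name="chunk"
                    select="subsequence($lines, . * 2000 + 1, 2000)"/>
      <!-- Report the size of each block -->
      <xsl:value-of select="concat('chunk ', . + 1, ': ',
                                   count($chunk), ' lines&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This only reports the size of each block; in practice the xsl:for-each-group form is usually the more natural way to express the same partitioning.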
(this question isn't actually Saxon-specific, so it would be better
posted on a general XSLT forum such as the xsl-list at