The problem seems to be with the class LargeStringBuffer when handling very large strings (10m characters). I haven't got to the bottom of it yet - it's years since I looked at this code - but I'm on the trail.

Michael Kay

On 17/03/2012 00:11, Michael Kay wrote:


This fails for me with default memory settings, but succeeds if I give it -Xmx2048m, reporting

Execution time: 55.811s (55811ms)
Memory used: 1333035256
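For anyone reproducing this: the heap setting is passed straight to the JVM running Saxon's command-line entry point, `net.sf.saxon.Transform`. A sketch of the invocation (the jar and file names below are my placeholders, not taken from this thread):

```shell
# Placeholder jar/file names; net.sf.saxon.Transform is Saxon's CLI entry point.
# -Xmx2048m raises the JVM heap ceiling to 2GB, as in the run reported above.
# -s: supplies a trigger source document; the large input is presumably pulled
#     in separately via saxon:stream(doc('test7.xml')) inside the stylesheet.
# -t asks Saxon to report timing (and, in some releases, memory) statistics.
java -Xmx2048m -cp saxon9ee.jar net.sf.saxon.Transform -s:dummy.xml -xsl:test.xsl -t
```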

It's definitely using streaming on the input file, but appears to be using memory heavily for something else. I suspect that for some reason the space used for the temporary trees isn't being released for garbage collection.

The problem seems to be related to the existence of very large text nodes, of which the longest seems to be almost 10m characters.
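A diagnosis like this can be confirmed independently of Saxon: the sketch below (class and file names are illustrative, not from this thread) streams a document with StAX and reports the longest text node, coalescing adjacent character events so a single logical text node is measured as one unit.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

public class LongestTextNode {

    // Returns the length of the longest text node seen by the reader.
    public static long longestText(XMLStreamReader reader) throws Exception {
        long longest = 0;
        long current = 0;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA) {
                // A parser may split one text node across several CHARACTERS
                // events, so accumulate until a non-text event is seen.
                current += reader.getTextLength();
            } else {
                longest = Math.max(longest, current);
                current = 0;
            }
        }
        return Math.max(longest, current);
    }

    public static void main(String[] args) throws Exception {
        // Toy input; for the real case, pass a FileInputStream over test7.xml
        // to createXMLStreamReader instead.
        String xml = "<r><a>hello</a><b>hi</b></r>";
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        System.out.println(longestText(r));
    }
}
```

Because StAX never materialises the tree, this runs in near-constant memory even on multi-gigabyte inputs, making it a cheap way to check for pathological text-node sizes before blaming the transformation itself.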

The investigation continues...

Michael Kay

On 16/03/2012 21:11, Todd Gochenour wrote:
The sample file can be accessed on Google Docs at
Sample XSLT is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.1"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">

  <xsl:template match="/">
    <xsl:for-each select="saxon:stream(doc('test7.xml')/*/*/table_data)">
      <xsl:variable name="table-name" select="@name"/>
      <xsl:variable name="collection-uri"
                    select="concat('db/', parent::*/@name, '/', $table-name)"/>
      <xsl:for-each select="row">
        <xsl:variable name="record">
          <xsl:element name="{$table-name}">
            <xsl:for-each select="field[@name]">
              <!-- Names starting with a digit get a leading underscore;
                   number(x) = number(x) is false only when x is NaN. -->
              <xsl:element name="{if (number(substring(@name,1,1)) = number(substring(@name,1,1)))
                                  then concat('_', @name) else @name}">
                <xsl:value-of select="text()"/>
              </xsl:element>
            </xsl:for-each>
          </xsl:element>
        </xsl:variable>
        <xsl:variable name="id" select="
          if ($table-name = 'cal_cyc')
            then concat('id-', $record/*/id, '-mnth-', $record/*/mnth, '-wk-', $record/*/wk)
          else if ($record/*/id) then concat('id-', $record/*/id)
          else concat('ndx-', position())"/>
        <xsl:result-document href="{$collection-uri}/{$id}.xml">
          <xsl:copy-of select="$record"/>
        </xsl:result-document>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>


saxon-help mailing list archived at