Hello

I tried the Saxon-SA library because someone pointed out that it can handle large XML files without building the full XML tree, which consumes a lot of memory for large files. We currently need to transform large XML documents of 1 GB and more. As I understand it, Saxon provides a way to pull the data from an external XML file using <xsl:copy-of /> with saxon:read-once="yes" and process it as a stream, i.e. without using too much memory.

I tried this with my large 1 GB XML data file, which has the following structure:

<root>
  <data />
  <data />
      ...
</root>

The XSL stylesheet (see below) is applied to another XML file whose root element is named "main". Running
Transform.exe -s:data.xml -xsl:data.xsl -o:data.csv
eventually threw an out-of-memory exception.

What am I missing here?

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:saxon=" http://saxon.sf.net/">
  <xsl:output method="text"/>

  <xsl:function name="saxon:customers">
    <xsl:copy-of select="doc(' bigdata.xml')/*/DATA"
                 saxon:read-once="yes" />
  </xsl:function>

  <xsl:template match="main">
    <xsl:apply-templates select="saxon:customers()"/>
  </xsl:template>

  <xsl:template match="data">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>