If you use the Saxon TinyTree, the space occupied is generally around 5 times the source data size. You can get it a bit smaller by stripping whitespace text nodes, and/or by using TinyTreeCondensed (which commons up attribute values that occur repeatedly). But a 400 MB input, at roughly 2 GB of tree, is likely to be a stretch. Alternatives to consider are document projection (which builds only the parts of the tree that a query needs to access) or streaming transformations, which process the data on the fly. Both of these are supported in Saxon-EE only.

I assume you're aware that (if this is on Java) you need to set -Xmx to ensure that the JVM is allowed to grab enough memory from the operating system.
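To make the arithmetic concrete, here is a minimal sketch that applies the "around 5 times" rule of thumb to a file and compares the result with the heap ceiling the JVM was started with. The class and method names (HeapEstimate, estimateTreeBytes, fitsInHeap) are made up for illustration and are not part of Saxon. The factor of 5 is only the rough figure quoted above, and the tree is not the only thing on the heap, so treat the result as a lower bound.

```java
import java.io.File;

// Hypothetical helper, not part of Saxon: estimates TinyTree size from file size.
class HeapEstimate {
    // Rule of thumb from the reply above: tree is roughly 5x the source size.
    static final long TREE_FACTOR = 5;

    // Estimated bytes needed to hold the tree for a source of the given size.
    static long estimateTreeBytes(long sourceBytes) {
        return sourceBytes * TREE_FACTOR;
    }

    // True if the estimated tree is smaller than the given heap ceiling.
    static boolean fitsInHeap(long sourceBytes, long maxHeapBytes) {
        return estimateTreeBytes(sourceBytes) < maxHeapBytes;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java -Xmx<size> HeapEstimate <input.xml>");
            return;
        }
        long mb = 1024L * 1024L;
        long sourceBytes = new File(args[0]).length();
        // maxMemory() reports the ceiling set by -Xmx (or the JVM default).
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.printf("source size:     %d MB%n", sourceBytes / mb);
        System.out.printf("estimated tree:  %d MB%n", estimateTreeBytes(sourceBytes) / mb);
        System.out.printf("max heap (-Xmx): %d MB%n", maxHeap / mb);
        System.out.println(fitsInHeap(sourceBytes, maxHeap)
                ? "estimate fits under the heap ceiling"
                : "estimate exceeds the heap ceiling; raise -Xmx or reduce the tree");
    }
}
```

Run it with the same -Xmx setting you give Saxon, for example `java -Xmx3g HeapEstimate big.xml`. By the same rule, a 2 GB input would need something on the order of 10 GB of heap for the tree alone, which is where document projection or streaming become the more realistic options.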

Michael Kay

On 25/02/2012 15:15, Mark Rubelmann wrote:

I'm *hoping* to process some big XML files with XSLT, but Saxon is throwing an out-of-memory exception when the input is only around 350 or 400 MB. I understand that XSLT needs to have the whole input tree in memory, but I don't understand why it's failing with such a [relatively] small input. Is the loaded tree like 10x bigger than the source XML? Is there anything I can do to alleviate the problem? I don't have a really solid requirement yet, but I'd feel a lot better if it could handle two gigs.



saxon-help mailing list archived at http://saxon.markmail.org/