Thanks for the replies, everyone! I'm using .NET, so the JVM heap size isn't an issue. Sorry I didn't mention that originally. The 10x increase in data size certainly matches what I was observing. I always knew DOM parsers were pigs but man, that's crazy! I was thinking about splitting the XML up, but I wouldn't be able to make that work in a worst-case scenario, so I don't think I should count on it. I guess I'm going to have to forget about XSLT and just go with a SAX parser or something. What a bummer!
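For reference, a minimal sketch of what the SAX-style approach looks like. It is written in Java with the JDK's built-in SAX parser purely for illustration (on .NET the comparable forward-only reader would be XmlReader). The class name ElementCounter, the method countElements, and the sample XML are all made up for this example. The point is only that the handler sees one event at a time, so memory use depends on what the handler keeps (here, one counter per distinct element name), not on the size of the input.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts elements by name without building a tree.
class ElementCounter {

    // Streams the input through a SAX parser and tallies start-element events.
    static Map<String, Integer> countElements(InputStream in) throws Exception {
        Map<String, Integer> counts = new HashMap<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Only the running totals are retained; the element itself is not.
                counts.merge(qName, 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory sample; a real run would pass a FileInputStream instead.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in));
    }
}
```

The trade-off is the one lamented above: anything XSLT would do by navigating the tree has to be tracked by hand in the handler.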

Thanks again,
Mark


On Sat, Feb 25, 2012 at 1:20 PM, Michael Kay <mike@saxonica.com> wrote:
If you use the Saxon TinyTree, the space occupied is generally around 5 times the source data size. You can get it a bit smaller by stripping whitespace text nodes, and/or using TinyTreeCondensed (which commons up attribute values that occur repeatedly). But 400 MB is likely to be a stretch. Alternatives to consider are document projection (which only builds the parts of the tree that a query needs to access) or streaming transformations, which process the data on the fly - both of these are supported in Saxon-EE only.
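To put rough numbers on that, here is a back-of-the-envelope sketch. The factors are the only inputs taken from this thread (about 5x for TinyTree, about 10x as the figure observed in the original question); the class and method names are hypothetical, and real usage will vary with the document's shape.

```java
// Hypothetical helper: rough in-memory tree size from source size and a multiplier.
class TreeSizeEstimate {

    // Returns the estimated tree size in MB for a source of sourceMb megabytes.
    static long estimateTreeMb(long sourceMb, double expansionFactor) {
        return Math.round(sourceMb * expansionFactor);
    }

    public static void main(String[] args) {
        long[] sourcesMb = {400, 2048};   // the 400 MB input and the hoped-for 2 GB
        double[] factors = {5.0, 10.0};   // TinyTree estimate vs. the observed 10x
        for (long src : sourcesMb) {
            for (double f : factors) {
                System.out.println(src + " MB source x " + f
                        + " -> about " + estimateTreeMb(src, f) + " MB tree");
            }
        }
    }
}
```

Even at the 5x figure, a 400 MB input needs roughly 2000 MB for the tree alone, and a 2 GB input roughly 10 GB, which is why projection or streaming comes up as the alternative.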

I assume you're aware that (assuming this is Java) you need to set -Xmx to ensure that Java grabs enough memory from the operating system.
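If it helps, a quick way to confirm what limit the JVM actually ended up with is to ask the runtime. This is only a sketch, and it applies only when running on Java; HeapCheck and its methods are names invented for the example, while Runtime.maxMemory() is the standard JDK call.

```java
// Hypothetical example class: reports the maximum heap the JVM will try to use.
class HeapCheck {

    // Maximum heap in MB, as governed by -Xmx (or the JVM default if unset).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // True if a tree of the given estimated size could fit under the heap limit.
    // This ignores everything else the transformation allocates, so treat it as
    // a lower bound on what is needed, not a guarantee.
    static boolean fits(long requiredMb) {
        return requiredMb <= maxHeapMb();
    }

    public static void main(String[] args) {
        long needMb = 2000; // e.g. roughly 5 x 400 MB, per the estimate above
        System.out.println("Max heap: " + maxHeapMb() + " MB");
        System.out.println("Room for a " + needMb + " MB tree: " + fits(needMb));
    }
}
```

Running it with, say, `java -Xmx4g HeapCheck.java` and again without the flag shows whether the setting is taking effect.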

Michael Kay
Saxonica


On 25/02/2012 15:15, Mark Rubelmann wrote:
Hi,

I'm *hoping* to process some big XML files with XSLT but Saxon is throwing an out-of-memory exception when the input is only around 350 or 400 MB. I understand that XSLT needs to have the whole input tree in memory but I don't understand why it's failing with such a [relatively] small input. Is the loaded tree like 10x bigger than the source XML? Is there anything I can do to alleviate the problem? I don't have a really solid requirement yet but I'd feel a lot better if it could handle two gigs.

Thanks,
Mark



_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/saxon-help 

