> It seems that my theory of using XSLT to split down these large documents is going to have to be replaced by hand-writing something that handles the data as text instead of XML.
 
Very often that job can be tackled quite easily using a SAX filter.
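
As a rough illustration of the idea, here is a minimal sketch of a SAX filter that splits a document into one output per record. It uses Python's xml.sax only for brevity (in Java the corresponding base class would be org.xml.sax.helpers.XMLFilterImpl). It assumes that every child of the root element is a "record" to be written out separately; the names RecordSplitter, open_output and split_records are hypothetical and chosen just for this example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class RecordSplitter(XMLFilterBase):
    """SAX filter that copies each child of the root element to its own output.

    Assumption: records are exactly the elements at depth 2 (children of the root).
    """

    def __init__(self, parent, open_output):
        super().__init__(parent)
        self._open_output = open_output  # hypothetical callback: index -> text stream
        self._depth = 0
        self._count = 0
        self._writer = None  # XMLGenerator for the record currently being copied

    def startElement(self, name, attrs):
        self._depth += 1
        if self._depth == 2:
            # A new record begins: start a fresh output document for it.
            stream = self._open_output(self._count)
            self._writer = XMLGenerator(stream, encoding="utf-8")
            self._writer.startDocument()
            self._count += 1
        if self._writer is not None:
            self._writer.startElement(name, attrs)

    def endElement(self, name):
        if self._writer is not None:
            self._writer.endElement(name)
            if self._depth == 2:
                # The record is complete: finish its document.
                self._writer.endDocument()
                self._writer = None
        self._depth -= 1

    def characters(self, content):
        # Text outside any record (e.g. whitespace between records) is dropped.
        if self._writer is not None:
            self._writer.characters(content)


def split_records(xml_bytes):
    """Run the filter over xml_bytes and return each record as a string."""
    outputs = []

    def open_output(index):
        buf = io.StringIO()
        outputs.append(buf)
        return buf

    splitter = RecordSplitter(xml.sax.make_parser(), open_output)
    splitter.parse(io.BytesIO(xml_bytes))
    return [buf.getvalue() for buf in outputs]


if __name__ == "__main__":
    doc = b'<root><rec id="1">a</rec><rec id="2"><x/>b</rec></root>'
    for chunk in split_records(doc):
        print(chunk)
```

The split_records helper collects the pieces in memory only to keep the example short. For genuinely large input, open_output would instead open a file per record (and the caller would close it once the record ends), so that only the current record is ever held at one time.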
 
Historically, the name Saxon is in fact derived from SAX. Saxon started life as a Java library that sat on top of SAX, scanning the document sequentially, and firing off element handlers that matched particular patterns in the input. Later there were options to process either serially or in memory, and the "in-memory" option was extended to support XSLT. Then it became too difficult to support two modes of operation in the same product. For a while I offered "preview" mode, which was the ability to fire off templates while the document was being loaded; but that too eventually became unsupportable because it really didn't fit the functional programming model.
 
Michael Kay
http://www.saxonica.com/