We have an application which uses expat to convert an XML data file into a binary version of the file. The file is currently about 600 MB but will grow.
We ran into a blocking problem: while parsing the file, the application uses a huge amount of memory; it needs 4 GB of RAM to successfully finish a 600 MB file.
Our engineers explained that this is due to block memory management in expat when it builds the XML tree. They explained that our XML has a lot of tags, which in turn requires a separate 4 KB memory page for even 3 bytes of actual data.
Is there any way to improve this? Could anyone suggest how we can optimize this process? Are there any settings we can use to make it work?
Here is the file structure (I am not uploading the file since it is 600 MB; I can provide it though).
<?xml version="1.0" encoding="utf-8" ?>
<Value_2 type="RELATIVE">0.0000</Value_2>
There can be many Groups (in practice about 100).
Each Group can have many Elements (in practice about 100,000).
Each Element can have many subelements (in practice about 4).
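For context on the sizes above: expat itself is a streaming (callback-based) parser and does not have to hold the whole document in memory; the tree is something built on top of it. A minimal sketch of streaming element counting, using Python's stdlib binding to the same C expat library (the `count_elements` helper and the sample string are illustrative, not from our application):

```python
# Stream the document through expat in fixed-size chunks. Handlers fire
# per element as the parser consumes each chunk, so memory stays bounded
# regardless of file size -- no in-memory tree is built.
import xml.parsers.expat

def count_elements(xml_bytes, chunk_size=64 * 1024):
    counts = {}

    def on_start(name, attrs):
        # Called once per opening tag; record tag frequencies.
        counts[name] = counts.get(name, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    for i in range(0, len(xml_bytes), chunk_size):
        parser.Parse(xml_bytes[i:i + chunk_size], False)
    parser.Parse(b"", True)  # signal end of document
    return counts

sample = (b'<?xml version="1.0" encoding="utf-8" ?>'
          b'<Group><Element>'
          b'<Value_2 type="RELATIVE">0.0000</Value_2>'
          b'</Element></Group>')
print(count_elements(sample))  # {'Group': 1, 'Element': 1, 'Value_2': 1}
```

In a real conversion, the binary output would be written from inside the handlers instead of counting, so the 600 MB input never needs to exist as a tree in RAM.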