Re: [Simple-support] Performance and time consuming issues for parsing a big XML file using Simple
From: Niall G. <gal...@ya...> - 2011-10-15 13:13:54
Hi,
I had a chance to look at this using the large XML document you provided, and there is quite a simple fix in the ElementListUnion handling that doubles the performance. There should also be less churn in the garbage collector. I will make sure this fix is included in the next release.
Thanks,
Niall
________________________________
From: Jebarlin Robertson <jeb...@gm...>
To: Niall Gallagher <gal...@ya...>
Cc: sim...@li...
Sent: Wednesday, 14 September 2011 5:22 AM
Subject: Re: [Simple-support] Performance and time consuming issues for parsing a big XML file using Simple
Hi Niall,
Thanks for the valuable response...
I am not able to send the attachment because the Simple mailing list does not allow attachments larger than 40 KB.
The problem starts after this line is called:
CTDocument doc = serializer.read(CTDocument.class, new ByteArrayInputStream(data), false);
From that point the garbage collector runs much more frequently, and the parse takes a very long time to complete.
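The garbage collection pressure described above can be put into numbers with the standard JMX beans before and after a fix. The following is only a minimal sketch using the JDK alone: the class name GcProbe and its methods are hypothetical, and the allocation loop in main is a stand-in workload, not the Simple parse itself. In the real case, the Runnable would wrap the serializer.read(...) call.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Simple: reports GC activity around a piece of work.
public class GcProbe {

    /** Returns {total collections, total collection millis} summed over all collectors. */
    static long[] snapshot() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] { count, millis };
    }

    /** Runs the work once and returns {elapsed millis, collections, collection millis}. */
    static long[] measure(Runnable work) {
        long[] before = snapshot();
        long start = System.nanoTime();
        work.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        long[] after = snapshot();
        return new long[] { elapsedMillis, after[0] - before[0], after[1] - before[1] };
    }

    public static void main(String[] args) {
        // Stand-in workload that allocates many short-lived objects.
        // In the real case this would wrap the serializer.read(...) call.
        long[] result = measure(() -> {
            List<String> scratch = new ArrayList<>();
            for (int i = 0; i < 1_000_000; i++) {
                scratch.add("entry-" + i);
                if (scratch.size() > 10_000) {
                    scratch.clear();
                }
            }
        });
        System.out.printf("elapsed=%d ms, collections=%d, gcTime=%d ms%n",
                result[0], result[1], result[2]);
    }
}
```

Comparing the three figures for the same document across library versions gives a rough indication of whether a change reduced allocation churn. The counters are JVM-wide, so other threads can add noise to the result.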
Some sample code:
@Root(name="document")
public class CTDocument extends CTDocumentBase
{
    @Element
    private CTBody body;

    public CTBody getBody() {
        return body;
    }

    @Validate
    public void validate() {
        System.out.println(" CTDocument @Validate ***********************");
    }
}
public class CTDocumentBase
{
    @Element(name="background", required=false)
    private CTBackground background;

    public CTBackground getBackground() {
        return background;
    }
}
public class CTBody
{
    @ElementListUnion({
        @ElementList(entry="p", inline=true, type=CTP.class, required=false),
        @ElementList(entry="tbl", inline=true, type=CTTbl.class, required=false)
    })
    List<WordXMLTagsOperation> operations;

    @Element(required=false)
    private CTSectPr sectPr;

    public List<WordXMLTagsOperation> getOperations() {
        return operations;
    }

    public CTSectPr getSectPr() {
        return sectPr;
    }

    @Validate
    public void validate() {
        System.out.println(" CTBody @Validate ***********************");
    }
}
public class CTP extends EGPContent implements WordXMLTagsOperation
{
    @Attribute(required=false)
    private String rsidRPr;

    @Attribute(required=false)
    private String rsidR;

    @Attribute(required=false)
    private String rsidDel;

    @Attribute(required=false)
    private String rsidP;

    @Attribute(required=false)
    private String rsidRDefault;

    @Element(name="pPr", required=false)
    private CTPPr PPr;

    public CTPPr getPPr() {
        return PPr;
    }

    public String getRsidRPr() {
        return rsidRPr;
    }

    public String getRsidR() {
        return rsidR;
    }

    public String getRsidDel() {
        return rsidDel;
    }

    public String getRsidP() {
        return rsidP;
    }

    public String getRsidRDefault() {
        return rsidRDefault;
    }

    @Validate
    public void validate() {
        System.out.println(" CTP @Validate ***********************");
    }
}
public class EGPContent
{
    @ElementListUnion({
        @ElementList(entry="r", inline=true, type=CTR.class, required=false),
        @ElementList(entry="hyperlink", inline=true, type=CTHyperlink.class, required=false),
        @ElementList(entry="subDoc", inline=true, type=CTRel.class, required=false),
        @ElementList(entry="customXml", inline=true, type=CTCustomXmlRun.class, required=false),
        @ElementList(entry="smartTag", inline=true, type=CTSmartTagRun.class, required=false),
        @ElementList(entry="fldSimple", inline=true, type=CTSimpleField.class, required=false),
        @ElementList(entry="sdt", inline=true, type=CTSdtRun.class, required=false)
    })
    List<WordXMLTagsOperation> PContentList;

    public List<WordXMLTagsOperation> getOperations() {
        return PContentList;
    }
}
I will send the remaining part in a follow-up mail, as the Simple mailing list does not allow a message body larger than 40 KB.
Regards,
Jebarlin