From: Jimmy Z. <cra...@co...> - 2006-08-04 01:16:33
Is there any data on the performance of splitting files using VTD-XML?
It would certainly be interesting to know about....

----- Original Message -----
From: "Din Sush" <di...@ya...>
To: "Tatu Saloranta" <cow...@ya...>; <vtd...@li...>
Sent: Thursday, August 03, 2006 4:36 AM
Subject: Re: [Vtd-xml-users] VTD-XML Query

> I tried the Woodstox parser. It seems to be working, and
> for a 1 GB file it takes around 11 minutes to split
> the file into multiple 1 MB files.
>
> Thanks for your suggestion!! I was just wondering if I
> can make it any faster. I am using
> "copyEventFromEventMethod" to write to the file.
>
> Thanks again.
>
> --- Tatu Saloranta <cow...@ya...> wrote:
>
>> --- Din Sush <di...@ya...> wrote:
>>
>> > Well, I only need to split the document and don't
>> > need to go back to the parsed document, and I don't
>> > need DOM-like functionality.
>> >
>> > Will VTD-XML still be better in this scenario?
>>
>> I would suggest that if you do have time, you
>> investigate both VTD-XML and a Stax
>> implementation (such as
>> http://woodstox.codehaus.org).
>> My feeling is that it all comes down to which
>> API you feel more comfortable with, or perhaps whether
>> you have to use an XML-compliant, standards-based
>> solution or not.
>> Both can perform well enough, assuming you are not
>> limited by VTD-XML's main memory requirements.
>> Stax memory usage does not grow linearly with document
>> length, so there are no practical input size limitations.
>>
>> If you do end up trying both approaches, it would be
>> very nice to get the performance numbers, since this
>> would be an actual real-world use case, instead of
>> benchmarks. Plus, if the code is simple enough, perhaps
>> it could become a benchmark for these types of
>> operations?
>>
>> > Secondly, the entire document needs to be loaded
>> > into memory, and the whole reason for splitting is
>> > that I am getting an "Out of Memory" error. Won't I
>> > get the same error when I am using VTD-XML? Then it
>> > kind of defeats the purpose. Correct me if I am wrong
>> > in the interpretation, as I have never used VTD.
>>
>> You are correct here. While the limit is much higher
>> than with, say, DOM (2x or perhaps 3x), there is a limit.
>>
>> -+ Tatu +-
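
For readers who want to try the streaming approach discussed in this thread, here is a minimal sketch of splitting a large document with the standard Stax event API (javax.xml.stream), which Woodstox implements. It is not the code used by anyone in the thread. It assumes the records to distribute are the direct children of the root element, and it cuts a new file every N records rather than at a 1 MB size boundary. The class name StaxSplitter, the method split, and the chunk-N.xml file names are made up for illustration.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: splits <root><rec/>...<rec/></root> into several
// smaller documents, each wrapped in a copy of the original root element.
public class StaxSplitter {
    private static final XMLEventFactory EVENTS = XMLEventFactory.newInstance();

    // Returns the number of chunk files written to outDir.
    public static int split(InputStream in, Path outDir, int recordsPerChunk)
            throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        StartElement root = null;   // remembered so each chunk can reuse it
        XMLEventWriter writer = null;
        OutputStream out = null;
        int depth = 0;
        int recordsInChunk = 0;
        int chunks = 0;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
                if (depth == 1) {
                    root = event.asStartElement();
                } else if (depth == 2 && writer == null) {
                    // First record of a new chunk: open the file, write the prolog.
                    out = Files.newOutputStream(outDir.resolve("chunk-" + chunks + ".xml"));
                    chunks++;
                    writer = outputFactory.createXMLEventWriter(out, "UTF-8");
                    writer.add(EVENTS.createStartDocument("UTF-8", "1.0"));
                    writer.add(root);
                }
            }
            // Copy everything that belongs to a record (depth 2 and below).
            if (depth >= 2 && writer != null) {
                writer.add(event);
            }
            if (event.isEndElement()) {
                depth--;
                if (depth == 1 && ++recordsInChunk == recordsPerChunk) {
                    finish(writer, out, root);
                    writer = null;
                    recordsInChunk = 0;
                }
            }
        }
        if (writer != null) {
            finish(writer, out, root);  // last, partially filled chunk
        }
        reader.close();
        return chunks;
    }

    // Closes the copied root element and releases the chunk's file.
    private static void finish(XMLEventWriter writer, OutputStream out, StartElement root)
            throws Exception {
        writer.add(EVENTS.createEndElement(root.getName(), root.getNamespaces()));
        writer.add(EVENTS.createEndDocument());
        writer.flush();
        writer.close();   // does not close the underlying stream
        out.close();
    }
}
```

Only one record's worth of events is held at a time, so memory use does not grow with the input size, which is the property Tatu describes above. A call such as StaxSplitter.split(Files.newInputStream(bigFile), outDir, 1000) would be the entry point; splitting at a byte-size threshold instead would need a counting output stream, which is left out here.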