Hi, I want to read a huge XML file (more than 100,000 records) and validate input values using XPath queries. I tried AutoPilot, but it takes 800 ms per record. I need something faster, so I tried AutoPilotHuge, but evalXPathToString() only returns null. The code is below; please take a look.
package com.test;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.springframework.util.StringUtils;
import com.ximpleware.AutoPilot;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;
import com.ximpleware.XPathParseException;
import com.ximpleware.extended.AutoPilotHuge;
import com.ximpleware.extended.VTDGenHuge;
import com.ximpleware.extended.VTDNavHuge;

class Test extends Thread {
    private AutoPilotHuge ap;
    private int i;
    private int total;

    public Test() {
    }

    public Test(AutoPilotHuge ap1, int inc, int total) {
        this.i = inc;
        this.total = total;
        this.ap = ap1;
        // this.vn = vn;
    }

    public void run() {
        try {
            long startingTime = System.currentTimeMillis();
            for (int inc = i; inc < total; inc++) {
                System.out.println(inc);
                String s = "//data-set/record[COL1='" + inc + "' and COL2='1' and COL3='Administrator' and COL4='HR Administrator Payroll, Day Rate' and COL5='Ea' and COL6='Robert Walters AUD' and COL7='1' and COL8='Administrator' and COL9='HR Administrator Payroll, Day Rate' and COL10='Ea']";
                this.ap.selectXPath(s);
                // String s1 =null;
                long startingTime1 = System.currentTimeMillis();
                String s1 = this.ap.evalXPathToString();
                long endTime1 = System.currentTimeMillis();
                // System.out.println(endTime1-startingTime1);
                //int s1 = this.ap.evalXPath();
                if (!StringUtils.isEmpty(s1)) {
                    // System.out.println(s1);
                }
                System.out.println(s1);
                // this.ap.resetXPath();
            }
            long endTime = System.currentTimeMillis();
            // System.out.println(endTime-startingTime);
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

public class VTD_EXCELExample extends Test {
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        try {
            //VTDGen vg = new VTDGen();
            VTDGenHuge vgh = new VTDGenHuge();
            vgh.parseFile("E:/Test_batch/validation1.xml", false, VTDGenHuge.MEM_MAPPED);
            VTDNavHuge vn = vgh.getNav();
            int increment = 1;
            int total = 500;
            for (int i = 1; i < 3; i++) {
                AutoPilotHuge ap = new AutoPilotHuge(vn.cloneNav());
                Test t = new Test(ap, increment, total);
                increment = increment + 500;
                total = total + 500;
                t.start();
            }
        } catch (Exception e) {
            // TODO Auto-generated catch block
            System.out.println(e);
        }
        System.out.println("done");
    }
}
OK, I tried your code with standard VTD-XML to test the performance. It took about 80 ms per evalXPathToString() call in 2.13.1. With some older versions the evaluation is a lot faster but less rigorous, because the XPath spec mandates document order for the node set, which was only fully and conformantly implemented as of the 2.12+ releases.
80 ms is a lot less than the 800 ms you got.
Let me ask this: have you tested your app with a single thread and measured the performance that way? Why do you want to fork so many threads and benchmark in such a complicated environment? One step at a time, will you?
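For what it's worth, a single-threaded measurement along those lines could look roughly like the sketch below. It is only an illustration: it uses the JDK's built-in javax.xml.xpath as a stand-in rather than VTD-XML, and the class name SingleThreadTiming, the generated two-column test document, and the record count are all made up for the example. The point is just the shape of the harness: parse once, run the queries in one thread, and divide the elapsed time by the number of queries.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical harness; JDK XPath is a stand-in here, not VTD-XML.
class SingleThreadTiming {

    // Builds a small synthetic document shaped like the one in the question.
    static String buildXml(int records) {
        StringBuilder sb = new StringBuilder("<data-set>");
        for (int i = 1; i <= records; i++) {
            sb.append("<record><COL1>").append(i)
              .append("</COL1><COL2>1</COL2></record>");
        }
        return sb.append("</data-set>").toString();
    }

    // Parses the XML once; every query reuses the same in-memory document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Runs one query per value in [from, to] and counts how many match.
    static int countMatches(Document doc, int from, int to) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        int hits = 0;
        for (int inc = from; inc <= to; inc++) {
            String query = "//data-set/record[COL1='" + inc + "' and COL2='1']";
            if ((Boolean) xpath.evaluate(query, doc, XPathConstants.BOOLEAN)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        int records = 200;
        Document doc = parse(buildXml(records));
        long start = System.nanoTime();
        int hits = countMatches(doc, 1, records);
        double avgMs = (System.nanoTime() - start) / 1_000_000.0 / records;
        System.out.println(hits + " matches, " + avgMs + " ms per query on average");
    }
}
```

Once the single-threaded number is known, adding threads one at a time makes it much easier to see whether they actually help.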
The reason you get empty strings printed as output is that, in the extended VTD implementation, the string value of a node is interpreted as the text node directly beneath it; it does not include the text nodes of its descendant nodes. Does that explanation make sense?
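To make the difference concrete, here is a small sketch using the JDK's javax.xml.xpath (not VTD-XML, so it only illustrates the two interpretations, not the library's actual code path). The class name StringValueDemo and the two-column sample document are assumptions made up for the example. In standard XPath the string value of record is the concatenated text of all its descendants, while record/text() selects only text sitting directly under record, which is empty when record contains nothing but child elements.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; JDK XPath is used purely for illustration.
class StringValueDemo {

    // Made-up sample: a record whose only children are elements.
    static final String XML =
        "<data-set><record><COL1>1</COL1><COL2>Administrator</COL2></record></data-set>";

    // Evaluates an XPath expression and returns its result as a string.
    static String eval(String xml, String expr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Standard string value: text of all descendants, concatenated.
        System.out.println("[" + eval(XML, "/data-set/record") + "]");
        // Direct text children only: none here, so the result is empty.
        System.out.println("[" + eval(XML, "/data-set/record/text()") + "]");
        // Selecting a leaf element gives its own text either way.
        System.out.println("[" + eval(XML, "/data-set/record/COL2") + "]");
    }
}
```

If the goal is to get an actual value back, pointing the expression at a leaf element such as COL1 rather than at record itself is probably the simplest way around the empty result, assuming the extended implementation behaves as described above.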
I can come up with a performance-related patch to go from 80 ms to much lower, if node order is not a big deal for your app. But 800 ms is not what I got; I got around 60 ms on average.
Actually, our app loads a huge file and checks values against the XML. For that we split the records across 10 threads that all execute simultaneously; that's why we are using multiple threads. AutoPilotHuge takes less time than AutoPilot, but the values come back empty. What do I have to do to get the values using evalXPathToString()?
When you call evalXPathToString(), are you aware of what you are supposed to get? Also, 800 ms is not correct. I got far less than that, with ways to get much lower still. Please clarify or retract the 800 ms claim.
Do you understand my previous message explaining why you are getting empty strings with extended VTD-XML?
Sorry for asking again; I am new to this, which is why I'm asking. I want to check whether values are present under a certain condition, and if they are, I will report the condition as satisfied. For this I tried evalXPathToBoolean(), and it is also working fine. So can you suggest which one I should use: evalXPathToBoolean() or evalXPathToString()?
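For a pure "does a matching record exist?" check, a boolean evaluation is usually the more direct fit, since in XPath a node set converts to true exactly when it is non-empty. The sketch below shows that idea with the JDK's javax.xml.xpath rather than VTD-XML; the class name RecordExists and the sample document are assumptions invented for the example, and whether evalXPathToBoolean() behaves identically in the extended VTD-XML classes is not verified here.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; JDK XPath stands in for VTD-XML.
class RecordExists {

    // Made-up sample document with a single record.
    static final String XML =
        "<data-set><record><COL1>1</COL1><COL3>Administrator</COL3></record></data-set>";

    // Returns true when the expression selects at least one node.
    static boolean exists(String xml, String expr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws XPathExpressionException {
        // A record with these values is present, so the condition is satisfied.
        System.out.println(exists(XML, "//data-set/record[COL1='1' and COL3='Administrator']"));
        // No record has COL1='2', so the node set is empty.
        System.out.println(exists(XML, "//data-set/record[COL1='2' and COL3='Administrator']"));
    }
}
```

A string evaluation only makes sense when the value itself is needed afterwards; for a yes/no validation the boolean form avoids depending on how the string value of the node is computed.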
One more issue: please integrate with Spring Batch partitioning and test the records. It is not working properly; sometimes evalXPathToBoolean() returns true and sometimes it returns false. Please look into that.
Again, I need a solid test case. A piece of code without a dependency on Spring would work.
Hi, sorry, it was a configuration problem in Spring; that's why it did not work. That issue is now fixed and it is working fine.
Ask your boss to get you a new machine. I got sub-1 ms on average in query performance.
Last edit: jimmy zhang 2016-12-15
I will close this thread