
#11 evalXPathToString() is not working in AutoPilotHuge

Milestone: 2.0
Status: closed
Owner: nobody
Labels: None
Updated: 2016-12-17
Created: 2016-12-13
Creator: Ramesh
Private: No

Hi, I want to read a huge XML file (more than 100,000 records) and validate input values against it using XPath queries. I tried AutoPilot, but it takes 800 ms per record. I need something faster, so I tried AutoPilotHuge, but evalXPathToString() only returns null. The code is below; please check it.

package com.test;

import org.springframework.util.StringUtils;

import com.ximpleware.extended.AutoPilotHuge;
import com.ximpleware.extended.VTDGenHuge;
import com.ximpleware.extended.VTDNavHuge;

// Worker thread: runs one XPath query per record id in the range [i, total).
class Test extends Thread {

    private AutoPilotHuge ap;
    private int i;      // first record id to check (inclusive)
    private int total;  // end of the id range (exclusive, see the loop below)

    // Needed because VTD_EXCELExample extends Test without its own constructor.
    public Test() {
    }

    public Test(AutoPilotHuge ap1, int inc, int total) {
        this.i = inc;
        this.total = total;
        this.ap = ap1;
        // this.vn = vn;
    }

    public void run() {
        try {
            long startingTime = System.currentTimeMillis();
            for (int inc = i; inc < total; inc++) {
                System.out.println(inc);
                // Build the query for the record whose COL1 equals the current id.
                String s = "//data-set/record[COL1='" + inc
                        + "' and COL2='1' and COL3='Administrator' and COL4='HR Administrator Payroll, Day Rate' and COL5='Ea' and COL6='Robert Walters AUD' and COL7='1' and COL8='Administrator' and COL9='HR Administrator Payroll, Day Rate' and COL10='Ea']";

                this.ap.selectXPath(s);
                // String s1 = null;
                long startingTime1 = System.currentTimeMillis();
                // The expression selects <record> elements, so this asks for
                // the string value of a record node.
                String s1 = this.ap.evalXPathToString();

                long endTime1 = System.currentTimeMillis();
                // System.out.println(endTime1 - startingTime1);
                // int s1 = this.ap.evalXPath();
                if (!StringUtils.isEmpty(s1)) {
                    // System.out.println(s1);
                }
                System.out.println(s1);
                // this.ap.resetXPath();
            }
            long endTime = System.currentTimeMillis();
            // System.out.println(endTime - startingTime);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}

public class VTD_EXCELExample extends Test {

    public static void main(String[] args) {
        try {
            // VTDGen vg = new VTDGen();
            // Parse the file once with the extended (huge-file) generator.
            VTDGenHuge vgh = new VTDGenHuge();
            vgh.parseFile("E:/Test_batch/validation1.xml", false, VTDGenHuge.MEM_MAPPED);

            VTDNavHuge vn = vgh.getNav();
            int increment = 1;
            int total = 500;
            // Start two worker threads, each with its own cloned navigator
            // and its own block of 500 record ids.
            for (int i = 1; i < 3; i++) {
                AutoPilotHuge ap = new AutoPilotHuge(vn.cloneNav());
                Test t = new Test(ap, increment, total);
                increment = increment + 500;
                total = total + 500;
                t.start();
            }
        } catch (Exception e) {
            System.out.println(e);
        }
        // Printed once the threads have been started, not when they finish.
        System.out.println("done");
    }

}

1 Attachments

Discussion

  • jimmy zhang

    jimmy zhang - 2016-12-13

    OK, I tried your code with standard vtd-xml to test the performance. It took about 80 ms per evalXPathToString() call in 2.13.1. With some older versions the evaluation is a lot faster but less rigorous, because the XPath spec mandates document order for the node set, which was only implemented fully and conformantly as of the 2.12+ releases.
    80 ms is a lot less than the 800 ms you got.
    Let me ask this: have you tested your app with a single thread and measured the performance that way? Why do you want to fork so many threads and benchmark in such a complicated environment? One step at a time, will you?
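A single-threaded measurement along these lines could be sketched as follows. This is only an illustration in plain Java: the class `QueryTimer`, the method `averageMillis`, and the placeholder workload are hypothetical names, not part of vtd-xml. In a real test, the `Runnable` would wrap one `selectXPath` plus `evalXPathToString()` call on a single AutoPilot.

```java
final class QueryTimer {

    /**
     * Returns the average wall-clock time in milliseconds per call of
     * {@code query}, measured over {@code timedRuns} calls after
     * {@code warmupRuns} untimed warm-up calls.
     */
    static double averageMillis(Runnable query, int warmupRuns, int timedRuns) {
        if (timedRuns < 1) {
            throw new IllegalArgumentException("timedRuns must be >= 1");
        }
        // Warm-up calls let the JIT compile the hot path before timing starts.
        for (int k = 0; k < warmupRuns; k++) {
            query.run();
        }
        long start = System.nanoTime();
        for (int k = 0; k < timedRuns; k++) {
            query.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / timedRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): stands in for one XPath query.
        Runnable placeholderQuery = () -> Math.sqrt(System.nanoTime());
        double avg = averageMillis(placeholderQuery, 1_000, 10_000);
        System.out.printf("average: %.4f ms per query%n", avg);
    }
}
```

Measuring one thread first gives a baseline per-query figure that a multi-threaded run can then be compared against.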

     
  • jimmy zhang

    jimmy zhang - 2016-12-13

    The reason you get empty output is that, in the extended VTD implementation, the string value of a node is interpreted as its direct descendant text node only. It doesn't include the text nodes of the node's descendant nodes. Does that explanation make sense?
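The difference between the two readings of "string value" can be illustrated with the JDK's own `javax.xml.xpath` API. This is a sketch of the XPath semantics only, not of vtd-xml internals; the class name `StringValueDemo` and the two-column sample document are hypothetical. Under standard XPath 1.0 rules, the string value of an element is the concatenation of all its descendant text nodes, whereas a `record` element whose children are all elements has no text node of its own.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class StringValueDemo {

    // Hypothetical sample shaped like the data-set/record/COLn layout in the question.
    static final String XML =
        "<data-set><record><COL1>1</COL1><COL2>Administrator</COL2></record></data-set>";

    /** Evaluates an XPath expression against XML and returns the result as a string. */
    static String eval(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Full XPath 1.0 string value: all descendant text concatenated.
        System.out.println("[" + eval("//data-set/record[COL1='1']") + "]");        // [1Administrator]
        // Text nodes directly under <record>: there are none, so the result is empty.
        System.out.println("[" + eval("//data-set/record[COL1='1']/text()") + "]"); // []
        // Selecting a leaf element returns that element's own text.
        System.out.println("[" + eval("//data-set/record[COL1='1']/COL2") + "]");   // [Administrator]
    }
}
```

If the extended implementation behaves as described above, the second case is the one the question's query runs into, and selecting a leaf column as in the third case would be the way to get text back.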

     
  • jimmy zhang

    jimmy zhang - 2016-12-14

    I can come up with a performance-related patch that takes it from 80 ms to much lower, if node order is not a big deal for your app. But 800 ms is not what I got; I got around 60 ms on average.

     
  • Ramesh

    Ramesh - 2016-12-14

    Actually, our app loads a huge file and checks input against the XML values. For that we split the records across 10 threads that all execute simultaneously; that's why we are using multiple threads. AutoPilotHuge takes less time than AutoPilot, but the values come back empty. What do I have to do to get the values using evalXPathToString()?
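Splitting record ids across a fixed number of worker threads, as described above, might be sketched like this. It is plain Java with no vtd-xml calls; `RangePartitionDemo`, `partition`, and `processAll` are hypothetical names, and the loop body is a placeholder for the per-record XPath check. Each range is inclusive at both ends, so no id is dropped at a range boundary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RangePartitionDemo {

    /** Splits the inclusive id range [first, last] into at most {@code parts} contiguous inclusive ranges. */
    static List<int[]> partition(int first, int last, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        int count = last - first + 1;
        int base = count / parts;   // minimum size of every range
        int extra = count % parts;  // the first 'extra' ranges get one more id
        int start = first;
        for (int p = 0; p < parts && start <= last; p++) {
            int size = base + (p < extra ? 1 : 0);
            ranges.add(new int[] {start, start + size - 1});
            start += size;
        }
        return ranges;
    }

    /** Processes every id once across a thread pool and returns how many ids were handled. */
    static int processAll(int first, int last, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int[] range : partition(first, last, threads)) {
                tasks.add(() -> {
                    int handled = 0;
                    for (int id = range[0]; id <= range[1]; id++) {
                        handled++; // placeholder (assumption): per-record check goes here
                    }
                    return handled;
                });
            }
            int total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int[] r : partition(1, 1000, 2)) {
            System.out.println(r[0] + ".." + r[1]);
        }
        System.out.println("handled: " + processAll(1, 1000, 2));
    }
}
```

With `partition(1, 1000, 2)` the ranges are 1..500 and 501..1000. Waiting on the futures also means the caller knows when all workers have finished.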

     
  • jimmy zhang

    jimmy zhang - 2016-12-14

    When you call evalXPathToString(), are you aware of what you are supposed to get? Also, 800 ms is not correct. I got far less than that, with ways to get much lower still. Please clarify or repudiate the 800 ms claim.

     
  • jimmy zhang

    jimmy zhang - 2016-12-14

    Do you understand my previous message explaining why you are getting empty strings with extended vtd-xml?

     
  • Ramesh

    Ramesh - 2016-12-14

    Sorry for asking again; I'm new, which is why I'm asking. I want to check whether values exist under a certain condition, and if they do, report that the condition is satisfied. For this I tried evalXPathToBoolean(), and it also works fine. So can you suggest which one I should use, evalXPathToBoolean() or evalXPathToString()?
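For a pure existence check like this, evaluating the expression as a boolean is usually the more direct fit, since XPath converts a node set to true exactly when it is non-empty. The sketch below shows that conversion with the JDK's `javax.xml.xpath` API rather than vtd-xml; `RecordExistsDemo`, `recordExists`, and the sample document are hypothetical, and values containing quote characters would need escaping that is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class RecordExistsDemo {

    // Hypothetical sample shaped like the data-set/record/COLn layout in the question.
    static final String XML = "<data-set>"
        + "<record><COL1>1</COL1><COL3>Administrator</COL3></record>"
        + "<record><COL1>2</COL1><COL3>Clerk</COL3></record>"
        + "</data-set>";

    /** Parses the document once so that many queries can reuse it. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if at least one record matches both column values. */
    static boolean recordExists(Document doc, int col1, String col3) {
        String expr = "//data-set/record[COL1='" + col1 + "' and COL3='" + col3 + "']";
        try {
            // XPath objects are not thread-safe, so one is created per call.
            XPath xpath = XPathFactory.newInstance().newXPath();
            // A node set converts to true when it contains at least one node.
            return (Boolean) xpath.evaluate(expr, doc, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(recordExists(doc, 1, "Administrator")); // true
        System.out.println(recordExists(doc, 2, "Administrator")); // false
    }
}
```

A boolean result also avoids depending on how a given implementation defines the string value of an element node.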

     
  • Ramesh

    Ramesh - 2016-12-14

    One more issue: please integrate with Spring Batch partitioning and test the records. It is not working properly: sometimes evalXPathToBoolean() returns true and sometimes it returns false. Please look into that.

     
  • jimmy zhang

    jimmy zhang - 2016-12-14

    Again, I need a solid test case. A piece of code without a dependency on Spring would work.

     
  • Ramesh

    Ramesh - 2016-12-15

    Hi, sorry, there was a configuration problem in Spring; that's why it didn't work. This issue is working fine now.

     
  • jimmy zhang

    jimmy zhang - 2016-12-15

    Ask your boss to get you a new machine. I got sub-1 ms on average in query performance.

     

    Last edit: jimmy zhang 2016-12-15
  • jimmy zhang

    jimmy zhang - 2016-12-17

    I will close this thread

     
  • jimmy zhang

    jimmy zhang - 2016-12-17
    • status: open --> closed
     
