Thanks Michael,

I have attached sample source code for your reference.
It would be great if you could give me some direction or a solution to the problem.

I am trying to get all the input elements from the response.

Note: I am using HttpClient and the JTidy API.



Regards,
Gaurav

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import net.sf.saxon.Configuration;
import net.sf.saxon.dom.DocumentWrapper;
import net.sf.saxon.query.DynamicQueryContext;
import net.sf.saxon.query.StaticQueryContext;
import net.sf.saxon.query.XQueryExpression;
import net.sf.saxon.trans.XPathException;

import org.apache.commons.httpclient.HttpException;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.w3c.dom.Document;
import org.w3c.tidy.Tidy;

public class HttpclientTutorial1 {

    private static String uri = "https://superseeker.super.ato.gov.au/SuperSeekerWeb/default.aspx?pid=71";
    private static String url="https://superseeker.super.ato.gov.au";
    private String restURI=null;
    private static String query = "for $x in  //input \n" +
                                  "return $x \n";

    public static void main(String[] args) {

        HttpclientTutorial1 hct = new HttpclientTutorial1();
        try {
            hct.getScrapedData();
        } catch (ClientProtocolException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

    private void getScrapedData() throws ClientProtocolException, IOException {

        // Create an instance of HttpClient.
        DefaultHttpClient httpclient = new DefaultHttpClient();

        HttpGet httpget = new HttpGet(uri);

        try {

            HttpResponse responseBody = httpclient.execute(httpget);
            Pattern p = Pattern.compile("window.location.*pid=71");
            Matcher m = p.matcher(EntityUtils.toString(responseBody.getEntity()));
            boolean found = false;

            while (m.find()) {
                System.out.println("I found the text " + m.group() + " starting at index " + m.start() + " and ending at index " + m.end());
                restURI=org.apache.commons.lang.StringUtils.removeStart(m.group(), "window.location = '");
                System.out.println(restURI);
                found = true;
            }
            if(!found){
                System.out.println("No match found");
            }
            HttpGet httpget1=new HttpGet(url+restURI);
            responseBody = httpclient.execute(httpget1);
            List result=getElementFromResponse(responseBody,query);
           
        } catch (HttpException e) {
            System.err.println("Fatal protocol violation: " + e.getMessage());
            e.printStackTrace();
        } catch (IOException e) {
            System.err.println("Fatal transport error: " + e.getMessage());
            e.printStackTrace();
        } catch (XPathException e) {
            e.printStackTrace();
        } finally {
            // Release the connection.
            httpclient.getConnectionManager().shutdown();
        }

    }


    private List getElementFromResponse(
            HttpResponse responseBody,String query) throws IllegalStateException, IOException, XPathException {
        HttpEntity entity = responseBody.getEntity();
        List result=null;
        if (entity != null) {
            InputStream responseBodyStream = entity.getContent();
            // Convert the response into document object
            Document tidyDOM = ConvertResponseIntoDomObject(responseBodyStream);
            // get the element using Xquery and extract the input
            // attributes.
            result = retriveDomElementFromDocumentObject(
                    query, url+restURI, tidyDOM);
            System.out.print("size:" +result.size());

        }

        return result;
    }

   
    private List retriveDomElementFromDocumentObject(
            String query, String url, Document doc) throws XPathException {
        Configuration c = new Configuration();
        StaticQueryContext qp = c.newStaticQueryContext();
        XQueryExpression xe = qp.compileQuery(query);
        DynamicQueryContext dqc = new DynamicQueryContext(c);
        dqc.setContextItem(new DocumentWrapper(doc, url, c));
        List domElement = xe.evaluate(dqc);
        return domElement;
    }

    private Document ConvertResponseIntoDomObject(InputStream responseBody) {

        Tidy tidy = new Tidy();
        //tidy.setXHTML(true);
        tidy.setQuiet(false);
        tidy.setShowWarnings(true);
        OutputStream o=System.out;
        Document dom = tidy.parseDOM(responseBody, o);
       
       
        return dom;
    }


}
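
The redirect-extraction step in getScrapedData() above can be sketched more tightly with a capturing group. This is only an illustration under assumptions: the class name RedirectExtractor, the helper extractTarget, and the sample response body are all made up for the example, and the real page may format its script differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RedirectExtractor {

    // Escape the literal dot and capture only the quoted target path,
    // so no separate removeStart() call is needed afterwards.
    private static final Pattern REDIRECT =
        Pattern.compile("window\\.location\\s*=\\s*'([^']*pid=71)");

    // Returns the redirect target, or null when the page has no such script.
    static String extractTarget(String html) {
        Matcher m = REDIRECT.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical response body, invented for illustration.
        String html = "<script>window.location = '/SuperSeekerWeb/default.aspx?pid=71';</script>";
        System.out.println(extractTarget(html));            // prints /SuperSeekerWeb/default.aspx?pid=71
        System.out.println(extractTarget("<html></html>")); // prints null
    }
}
```

Returning null when nothing matches lets the caller skip the second request instead of fetching url + "null".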


On Fri, Nov 20, 2009 at 7:32 AM, Michael Kay <mike@saxonica.com> wrote:
You haven't shown your source document, so I can't tell why this query retrieves nothing. My guess would be that the <input> elements are in a namespace, whereas your query is only selecting <input> elements that are in no namespace.
 
Incidentally, the query
 
   for $x in //input return $x
 
can be abbreviated to
 
   //input
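
The namespace effect described above can be reproduced without Saxon. The sketch below is an illustration only: it swaps in the JDK's built-in XPath 1.0 engine rather than Saxon's XQuery API, and the sample page, the class name NamespaceDemo, and the helper count are invented for the example. Whether the JTidy output in this thread really carries the XHTML namespace is not confirmed here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceDemo {

    // Hypothetical sample page: every element is in the XHTML namespace.
    static final String XHTML =
        "<html xmlns='http://www.w3.org/1999/xhtml'><body><form>"
        + "<input name='a'/><input name='b'/>"
        + "</form></body></html>";

    // Counts the nodes that an XPath expression selects in the sample page.
    static int count(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // keep namespace information in the DOM
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XHTML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // An unprefixed name test only matches elements in no namespace.
        System.out.println(count("//input"));                   // prints 0
        // Matching on the local name ignores the namespace.
        System.out.println(count("//*[local-name()='input']")); // prints 2
    }
}
```

In XQuery the corresponding options would be the wildcard form //*:input, or a prolog declaration such as

   declare default element namespace "http://www.w3.org/1999/xhtml";

followed by //input.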

From: Gaurav sharma [mailto:sham.gaurav@gmail.com]
Sent: 19 November 2009 10:10
Subject: [saxon] issue while compile and execute an XQuery expression with Saxon.


Hi All,

I am a Saxon user, and I am trying to compile and execute an XQuery expression with Saxon.

Objective: execute the XQuery on a Document object and get the list of DOM elements.


I am using the code below for that.
 
Configuration c = new Configuration();
StaticQueryContext qp = new StaticQueryContext(c);
XQueryExpression xe = qp.compileQuery(query);
DynamicQueryContext dqc = new DynamicQueryContext(c);
dqc.setContextNode(new DocumentWrapper(dom, url, c));
List result = xe.evaluate(dqc);

 

 

Here query = "for $x in  //input \n" + "return $x \n";

and dom is an object of type org.w3c.dom.Document.

The Document object contains input elements, but when I evaluate the XQuery expression the size of the list is zero. I am also not getting any exception.

I tried the same code with another URL, where I do get results. So what is the problem with this one?

 

Can anyone please answer the following questions?

- Why is the list size zero if the Document object contains input elements?

- How can I find the base URI (variable name: url)?

- Is there any way to trace the log, warnings, or errors?

 

Thanks,

Gaurav



_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/saxon-help



--
Regards,
Gaurav Sharma
HCL Tech , Gurgaon
Mobile +919818305458