I have been trying to use the XQJ API to screen-scrape an HTML page that has
been loaded into a DOM via HTML Tidy. The query I am trying to run is:
for $x in //div
When I run the query, the result is just an empty <run></run>. However, if I change
for $x in .//div to
for $x in // the query returns the html between the
I have checked my original query using the XQuisitor GUI tool and it works there,
just not when I try to run it through XQJ.
The XQJ code I am using is:
Document doc = fetchPage(); // fetches the page, runs HTML Tidy, and returns a W3C DOM
SaxonXQDataSource ds = new SaxonXQDataSource(config);
XQConnection con = ds.getConnection();
XQItem item = con.createItemFromNode(doc.getChildNodes().item(1),
XQPreparedExpression xpres = con.prepareExpression(queryabove);
XQResultSequence seq = xpres.executeQuery();
I'm new to XQuery and XQJ, so I'm not sure whether the problem is with my XQJ code
or with the XQuery I'm trying to run.
If you look more closely at your source XML you will almost certainly find
that the elements are in a namespace, probably
http://www.w3.org/1999/xhtml. So if you want
to select elements from this namespace, you will need to start your query with
declare default element namespace "http://www.w3.org/1999/xhtml";
Unfortunately this will have the side-effect of putting your output elements
(run and race) in this namespace as well, which is probably not what you want.
The workaround is to bind a specific prefix:
declare namespace h = "http://www.w3.org/1999/xhtml";
for $x in //h:div[...] return ...
This is a weakness in the design of the XQuery language.
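To see the namespace effect in isolation, here is a minimal sketch that uses only the JDK's own javax.xml.xpath API (XPath 1.0) rather than XQJ or Saxon, so it illustrates the name-matching rule and not the XQJ setup itself. The sample markup, the class name NamespaceDemo, and the helpers parseSample and count are all made up for illustration. It assumes the DOM was built namespace-aware, as a tidied XHTML document usually would be.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceDemo {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Parse a small, made-up XHTML sample into a namespace-aware W3C DOM.
    static Document parseSample() throws Exception {
        String xml = "<html xmlns='" + XHTML + "'>"
                   + "<body><div>a</div><div>b</div></body></html>";
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, namespaces are ignored
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Count the nodes an XPath expression selects, with prefix "h" bound to XHTML.
    static int count(Document doc, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "h".equals(prefix) ? XHTML : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String uri) {
                return XHTML.equals(uri) ? "h" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String uri) {
                return XHTML.equals(uri)
                        ? Collections.singletonList("h").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseSample();
        // The root element reports the namespace it lives in.
        System.out.println("root namespace: " + doc.getDocumentElement().getNamespaceURI());
        // An unprefixed name test only matches elements in no namespace.
        System.out.println("//div   -> " + count(doc, "//div"));
        // A prefix bound to the XHTML namespace matches the real elements.
        System.out.println("//h:div -> " + count(doc, "//h:div"));
    }
}
```

On this sample, the unprefixed //div selects nothing while //h:div selects both div elements, which matches the empty <run></run> seen in the question. Calling getNamespaceURI() on the root element of your own tidied document is a quick way to confirm whether the same thing is happening there.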
Please note that this forum isn't really intended for general XQuery coding
help that's independent of the Saxon product. You should try the talk @
x-query.com mailing list, or stackoverflow.com.