From: Adam R. <ada...@de...> - 2007-04-10 09:42:21
Andrew,

I took your XQuery code and tested it here and it works perfectly; I get the full and expected response.

Things for you to check -

1) Do you have the NekoHTML jar in your lib/optional folder?
2) Do you need to go through a proxy server on your network for HTTP? If so, have you set up Java to use the proxy server, either through runtime or environment options?

Thanks Adam.

On Tue, 2007-04-10 at 15:36 +1000, Andrew Lonie wrote:
> Hi Adam I have a quick question. I've built and installed the html
> module from eXist's subversion, but when I invoke html:doc(URL) all I
> seem to get back is an extremely basic html document with no content.
> For instance, using the xquery sandbox:
>
> declare namespace html="http://exist-db.org/xquery/html";
> let $document := html:doc("http://www.yahoo.com")
> return
> $document
>
> returns
>
> <HTML>
> <HEAD> </HEAD>
> </HTML>
>
> html:doc("http://google.com") returns
>
> <HTML>
> <HEAD>
> <META http-equiv ="content-type" content ="text/html; charset=ISO-8859-1"/>
> </HEAD>
> </HTML>
>
> The logs seem to indicate that the html module is working OK:
>
> -------------------- Extract from exist.log ----------------------
>
> let $document := html:doc("http://www.yahoo.com")
> return
> $document
> 2007-04-10 15:31:24,375 [http-80-Processor24] DEBUG (HTTPUtils.java
> [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2007-04-10 15:31:26,140 [http-80-Processor25] DEBUG (NativeBroker.java
> [getXMLResource]:1541) - document '/db' not found!
> 2007-04-10 15:31:26,156 [http-80-Processor25] DEBUG
> (XQueryContext.java [getStaticallyKnownDocuments]:639) - reading
> collection /db
> 2007-04-10 15:31:26,156 [http-80-Processor25] DEBUG (XQuery.java
> [compile]:156) - Compilation took 0
> 2007-04-10 15:31:27,140 [http-80-Processor25] DEBUG (DocFunction.java
> [getHTMLDocument]:122) - Converting HTML to XML using NekoHTML parser
> for: http://www.yahoo.com
> 2007-04-10 15:31:27,187 [http-80-Processor25] DEBUG (HTTPUtils.java
> [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2007-04-10 15:31:27,234 [http-80-Processor24] DEBUG
> (NativeSerializer.java [serializeToReceiver]:129) - serializing
> document 34 (/db/sandbox/xml-highlight.xsl) to SAX took 16
> 2007-04-10 15:31:27,234 [http-80-Processor24] DEBUG (HTTPUtils.java
> [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> 2007-04-10 15:31:32,406 [http-80-Processor25] DEBUG (Compile.java
> [eval]:50) - eval: declare namespace
> html="http://exist-db.org/xquery/html";
>
> let $document := html:doc("http://www.yahoo.com")
> return
> $document
> 2007-04-10 15:31:32,406 [http-80-Processor25] DEBUG (HTTPUtils.java
> [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
>
> ----------------------------------------
>
> Any ideas?
>
> Andrew
>
> On 09/04/07, Adam Retter <ada...@de...> wrote:
> > The HTML module is already in eXist's subversion. If you want to get the
> > code for it directly, take a look here -
> > http://exist.svn.sourceforge.net/viewvc/exist/trunk/eXist/extensions/modules/src/org/exist/xquery/modules/html/
> >
> > Andrew Tsui - would you be interested in sharing/committing such a
> > module into the eXist code base?
> >
> > Thanks Adam.
> >
> > On Mon, 2007-04-09 at 15:59 +1000, Andrew Lonie wrote:
> > > Thanks Adam.
> > > Andrew Tsui has just sent me a web services module he
> > > developed for eXist which works well - it seems sensible to look at
> > > combining/adapting this into a generic HTTP client module (if Andrew
> > > is OK with that and copyright permits - Andrew?). I'd certainly like
> > > to contribute though my Java skills are moderate at best. I'm also
> > > happy to hear that there is an HTML to XML tidying module - I'm
> > > currently doing it through a Tagsoup-based servlet - so I'd be very
> > > interested in looking at your HTML module anyway Adam.
> > >
> > > Andrew
> > >
> > > On 09/04/07, Adam Retter <ada...@de...> wrote:
> > > > hmmm...
> > > >
> > > > Interesting, very interesting. Unfortunately Dannes is correct, Web
> > > > Services and eXist pique my interest ;-)
> > > >
> > > > Taking a look at the Data-Direct approach I think perhaps they are a
> > > > little too strict in their approach although the general concept is a
> > > > good one.
> > > >
> > > > All a Web Service request really is, is an HTTP POST (or sometimes an HTTP
> > > > GET), now eXist allows an HTTP GET, by using the doc() function. So what
> > > > we really need is a method for allowing an HTTP POST. I recently added an
> > > > HTML extension module to eXist that allows you to HTTP GET an HTML
> > > > document and have it `tidied` into an (X)HTML/XML type document. Perhaps
> > > > this should be reorganised into a more generic HTTP module that allows
> > > > for HTTP operations of various types, web services, html, etc. I'm not
> > > > sure that a strict web services module is required as eXist/XQuery has
> > > > all the necessary constructs for dealing with XML, it's just a matter of
> > > > getting that XML.
> > > >
> > > > So, Andrew, How are your Java skills? Would you be interested in
> > > > creating a more generic HTTP extension module for eXist?
> > > > If you know a
> > > > little Java, I can promise you it's not very difficult and I will of
> > > > course help, but I am a bit short of time, otherwise I would do this
> > > > myself.
> > > >
> > > > Thanks Adam.
> > > >
> > > > On Tue, 2007-04-03 at 08:13 +0200, Dannes Wessels wrote:
> > > > > Hi,
> > > > >
> > > > > On 4/3/07, Andrew Lonie <and...@gm...> wrote:
> > > > > > Hi. I've spent a fair bit of time searching for an xquery extension
> > > > > > module to allow SOAP calls from eXist or Saxon directly, similar to
> > > > > > the ws:call() function provided by the DataDirect xquery
> > > > > > implementation - is anyone working on this, or know of such a module?
> > > > >
> > > > > I guess you refer to
> > > > > http://www.datadirect.com/products/xquery/data-integration/index.ssp ?
> > > > >
> > > > > As far as I know there are no activities for such a feature........ Adam?
> > > > >
> > > > > D.
> > > > >
> > > > --
> > > > Adam Retter
> > > >
> > > > Principal Developer
> > > > Devon Portal Project
> > > > Room 310
> > > > County Hall
> > > > Topsham Road
> > > > Exeter
> > > > EX2 4QD
> > > >
> > > > t: 01392 38 3683
> > > > f: 01392 38 2966
> > > > e: ada...@de...
> > > > w: www.devonline.gov.uk
> > > >
> > --
> > Adam Retter
> >
> > Principal Developer
> > Devon Portal Project
> > Room 310
> > County Hall
> > Topsham Road
> > Exeter
> > EX2 4QD
> >
> > t: 01392 38 3683
> > f: 01392 38 2966
> > e: ada...@de...
> > w: www.devonline.gov.uk
> >

--
Adam Retter

Principal Developer
Devon Portal Project
Room 310
County Hall
Topsham Road
Exeter
EX2 4QD

t: 01392 38 3683
f: 01392 38 2966
e: ada...@de...
w: www.devonline.gov.uk
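
Regarding check 2) at the top of this message (setting up Java to use a proxy server "through runtime" options), here is a minimal sketch, not taken from the thread. It sets the standard JVM properties http.proxyHost and http.proxyPort and then asks the JVM's default ProxySelector which proxy it would pick for a URL. The class name ProxyCheck and the proxy address proxy.example.com:8080 are hypothetical placeholders. Whether the html module's HTTP code actually consults these properties is an assumption; the sketch only shows what the JVM itself reports.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxyCheck {
    // Hypothetical proxy host and port; substitute your network's values.
    static final String PROXY_HOST = "proxy.example.com";
    static final String PROXY_PORT = "8080";

    /** Sets the standard JVM HTTP proxy properties at runtime. */
    static void configureProxy(String host, String port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", port);
    }

    /** Asks the JVM's default ProxySelector which proxies it would use for the URL. */
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        configureProxy(PROXY_HOST, PROXY_PORT);
        // Print the proxy type and address the JVM would use for this request.
        for (Proxy p : proxiesFor("http://www.yahoo.com")) {
            System.out.println(p.type() + " " + p.address());
        }
    }
}
```

The same two properties can be passed on the command line that starts the JVM instead, as -Dhttp.proxyHost=... and -Dhttp.proxyPort=..., which avoids changing any code.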