From: Adam R. <ada...@de...> - 2007-04-10 12:50:10
Okay, well, from the log file it looks like the NekoHTML library is being
picked up correctly. If you don't have a proxy server then that is not the
problem...

I'm not quite sure what the problem is, to be honest, as I know of other
people using this module successfully...

Do you have access to Ethereal or Wireshark? If so, you could use that to
check that the HTTP request and response are as expected.

If that is all okay then maybe a little debugging with Eclipse/NetBeans is
in order...

Thanks Adam.

On Tue, 2007-04-10 at 13:43 +0100, Andrew Tsui wrote:
> Hi Adam,
>
> I am using the released version of the war distribution which is
> available on the eXist main web page (I think build 4311).
>
> 1) I have "nekohtml-0.9.5.jar" under exist/WEB-INF
>
> 2) We don't have a proxy server here, so I did not test it with any
> proxy settings
>
> Thanks
> Andrew
>
> ==============================
>
> On 10/04/07, Adam Retter <ada...@de...> wrote:
>     Andrew,
>
>     I took your XQuery code and tested it here and it works perfectly,
>     I get the full and expected response. Things for you to check -
>
>     1) Do you have the NekoHTML jar in your lib/optional folder?
>     2) Do you need to go through a proxy server on your network for
>     http? If so, have you set up Java to use the proxy server, either
>     through runtime or environment options?
>
>     Thanks Adam.
>
>     On Tue, 2007-04-10 at 15:36 +1000, Andrew Lonie wrote:
> > Hi Adam, I have a quick question. I've built and installed the html
> > module from eXist's subversion, but when I invoke html:doc(URL) all I
> > seem to get back is an extremely basic html document with no content.
> > For instance, using the xquery sandbox:
> >
> > declare namespace html="http://exist-db.org/xquery/html";
> > let $document := html:doc("http://www.yahoo.com")
> > return
> > $document
> >
> > returns
> >
> > <HTML>
> > <HEAD> </HEAD>
> > </HTML>
> >
> > html:doc("http://google.com") returns
> >
> > <HTML>
> > <HEAD>
> > <META http-equiv="content-type" content="text/html; charset=ISO-8859-1"/>
> > </HEAD>
> > </HTML>
> >
> > The logs seem to indicate that the html module is working OK:
> >
> > -------------------- Extract from exist.log ----------------------
> >
> > let $document := html:doc("http://www.yahoo.com")
> > return
> > $document
> > 2007-04-10 15:31:24,375 [http-80-Processor24] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> > 2007-04-10 15:31:26,140 [http-80-Processor25] DEBUG (NativeBroker.java [getXMLResource]:1541) - document '/db' not found!
> > 2007-04-10 15:31:26,156 [http-80-Processor25] DEBUG (XQueryContext.java [getStaticallyKnownDocuments]:639) - reading collection /db
> > 2007-04-10 15:31:26,156 [http-80-Processor25] DEBUG (XQuery.java [compile]:156) - Compilation took 0
> > 2007-04-10 15:31:27,140 [http-80-Processor25] DEBUG (DocFunction.java [getHTMLDocument]:122) - Converting HTML to XML using NekoHTML parser for: http://www.yahoo.com
> > 2007-04-10 15:31:27,187 [http-80-Processor25] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> > 2007-04-10 15:31:27,234 [http-80-Processor24] DEBUG (NativeSerializer.java [serializeToReceiver]:129) - serializing document 34 (/db/sandbox/xml-highlight.xsl) to SAX took 16
> > 2007-04-10 15:31:27,234 [http-80-Processor24] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> > 2007-04-10 15:31:32,406 [http-80-Processor25] DEBUG (Compile.java [eval]:50) - eval: declare namespace html="http://exist-db.org/xquery/html";
> >
> > let $document := html:doc("http://www.yahoo.com")
> > return
> > $document
> > 2007-04-10 15:31:32,406 [http-80-Processor25] DEBUG (HTTPUtils.java [addLastModifiedHeader]:61) - mostRecentDocumentTime: 0
> >
> > ----------------------------------------
> >
> > Any ideas?
> >
> > Andrew
> >
> > On 09/04/07, Adam Retter <ada...@de...> wrote:
> > > The HTML module is already in eXist's subversion. If you want to
> > > get the code for it directly, take a look here -
> > > http://exist.svn.sourceforge.net/viewvc/exist/trunk/eXist/extensions/modules/src/org/exist/xquery/modules/html/
> > >
> > > Andrew Tsui - would you be interested in sharing/committing such a
> > > module into the eXist code base?
> > >
> > > Thanks Adam.
> > >
> > > On Mon, 2007-04-09 at 15:59 +1000, Andrew Lonie wrote:
> > > > Thanks Adam. Andrew Tsui has just sent me a web services module
> > > > he developed for eXist which works well - it seems sensible to
> > > > look at combining/adapting this into a generic HTTP client module
> > > > (if Andrew is OK with that and copyright permits - Andrew?). I'd
> > > > certainly like to contribute though my Java skills are moderate
> > > > at best. I'm also happy to hear that there is an HTML to XML
> > > > tidying module - I'm currently doing it through a Tagsoup-based
> > > > servlet - so I'd be very interested in looking at your HTML
> > > > module anyway Adam.
> > > >
> > > > Andrew
> > > >
> > > > On 09/04/07, Adam Retter <ada...@de...> wrote:
> > > > > hmmm...
> > > > >
> > > > > Interesting, very interesting. Unfortunately Dannes is correct,
> > > > > Web Services and eXist pique my interest ;-)
> > > > >
> > > > > Taking a look at the Data-Direct approach I think perhaps they
> > > > > are a little too strict in their approach although the general
> > > > > concept is a good one.
> > > > >
> > > > > All a Web Service request really is, is a HTTP POST (or
> > > > > sometimes a HTTP GET). Now eXist allows a HTTP GET, by using
> > > > > the doc() function. So what we really need is a method for
> > > > > allowing a HTTP POST. I recently added a HTML extension module
> > > > > to eXist that allows you to HTTP GET a HTML document and have
> > > > > it `tidied` into an (X)HTML/XML type document. Perhaps this
> > > > > should be reorganised into a more generic HTTP module that
> > > > > allows for HTTP operations of various types, web services,
> > > > > html, etc. I'm not sure that a strict web services module is
> > > > > required as eXist/XQuery has all the necessary constructs for
> > > > > dealing with XML, it's just a matter of getting that XML.
> > > > >
> > > > > So, Andrew, how are your Java skills? Would you be interested
> > > > > in creating a more generic HTTP extension module for eXist? If
> > > > > you know a little Java, I can promise you it's not very
> > > > > difficult and I will of course help, but I am a bit short of
> > > > > time, otherwise I would do this myself.
> > > > >
> > > > > Thanks Adam.
> > > > >
> > > > > On Tue, 2007-04-03 at 08:13 +0200, Dannes Wessels wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On 4/3/07, Andrew Lonie <and...@gm...> wrote:
> > > > > > > Hi. I've spent a fair bit of time searching for an xquery
> > > > > > > extension module to allow SOAP calls from eXist or Saxon
> > > > > > > directly, similar to the ws:call() function provided by the
> > > > > > > DataDirect xquery implementation - is anyone working on
> > > > > > > this, or know of such a module?
> > > > > >
> > > > > > I guess you refer to
> > > > > > http://www.datadirect.com/products/xquery/data-integration/index.ssp ?
> > > > > >
> > > > > > As far as I know there are no activities for such a
> > > > > > feature........ Adam?
> > > > > >
> > > > > > D.
> > > > > --
> > > > > Adam Retter
> > > > >
> > > > > Principal Developer
> > > > > Devon Portal Project
> > > > > Room 310
> > > > > County Hall
> > > > > Topsham Road
> > > > > Exeter
> > > > > EX2 4QD
> > > > >
> > > > > t: 01392 38 3683
> > > > > f: 01392 38 2966
> > > > > e: ada...@de...
> > > > > w: www.devonline.gov.uk
>
> _______________________________________________
> Exist-open mailing list
> Exi...@li...
> https://lists.sourceforge.net/lists/listinfo/exist-open

--
Adam Retter
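Editorial sketch, not part of the original thread: the two checks suggested above (is the JVM configured to use a proxy, and does the HTTP response look as expected) can be approximated from plain Java without a packet sniffer. The class name `HttpCheck` and its methods are hypothetical names chosen for illustration. `http.proxyHost` and `http.proxyPort` are the standard JVM networking properties. This says nothing about how eXist's html module performs its own request, so treat it only as a way to rule out network-level causes.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of eXist.
final class HttpCheck {

    /** Describes the JVM-wide HTTP proxy settings, if any are set. */
    static String describeProxy() {
        String host = System.getProperty("http.proxyHost");
        if (host == null || host.isEmpty()) {
            return "no proxy configured";
        }
        // The JVM itself falls back to port 80 when http.proxyPort is unset.
        String port = System.getProperty("http.proxyPort", "80");
        return "proxy " + host + ":" + port;
    }

    /** Issues a GET and prints the status and all response headers. */
    static void dump(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        // Do not follow redirects, so a 301/302 answer is visible as such.
        conn.setInstanceFollowRedirects(false);
        try {
            System.out.println(conn.getResponseCode() + " " + conn.getResponseMessage());
            // The entry with a null key holds the raw status line.
            for (Map.Entry<String, List<String>> h : conn.getHeaderFields().entrySet()) {
                System.out.println(h.getKey() + ": " + h.getValue());
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describeProxy());
        // Only touches the network for URLs given on the command line.
        for (String address : args) {
            dump(address);
        }
    }
}
```

Run it with the URL to test as an argument, for example `java HttpCheck http://www.yahoo.com`. On a network that does need a proxy, the same properties can be passed at startup as `-Dhttp.proxyHost=... -Dhttp.proxyPort=...`, with the host and port for that network filled in.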