From: Pope, J. <Jac...@bl...> - 2008-01-11 11:10:43
Hiya Erik,

Thanks for your help. I've now got an ArcProxy running (by copying wayback.war to arc-proxy.war, deploying the proxy and changing its config so only the ArcProxy/LocationDB section is not commented out), but I still can't see the files. I've run curl as you suggested and it appears to work (URLs munged below):

    curl http://www.example.com:8080/arc-proxy/locationDB -d operation=add -d name=/wap/filestore/2523139/arcs/IAH-20070920101741-00000-wap300.bl.uk.arc.gz -d url=http://www.example.com:8080/arc-proxy/arcs/IAH-20070920101741-00000-wap300.bl.uk.arc.gz

    OK added url http://www.example.com:8080/arc-proxy/arcs/IAH-20070920101741-00000-wap300.bl.uk.arc.gz for /wap/filestore/2523139/arcs/IAH-20070920101741-00000-wap300.bl.uk.arc.gz

Yet when I try to browse the wayback for a URL in that arc I get an error saying the resource is unavailable, with the following error in catalina.out:

    INFO: initialized org.archive.wayback.resourcestore.http.FileLocationDB
    com.sleepycat.je.DatabaseException: Unable to locate(IAH-20070920101741-00000-wap300.bl.uk.arc.gz)
        at org.archive.wayback.resourcestore.http.ArcProxyServlet.doGet(ArcProxyServlet.java:90)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:690)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)

It's not a problem with the arc file, as this was fine when I was using a local ARC store. Any ideas?

Cheers,
Jack

Jackson Pope
Technical Lead
Web Archiving Team
The British Library
+44 (0)1937 54 6942

-----Original Message-----
From: Erik Hetzner [mailto:eri...@uc...]
Sent: 11 January 2008 01:43
To: arc...@li...
Cc: Pope, Jackson
Subject: Re: [Archive-access-discuss] FW: Wayback 0.8.0 and ArcProxy

Hi Jackson,

At Thu, 10 Jan 2008 13:29:05 -0000, "Pope, Jackson" <Jac...@bl...> wrote:
>
> Hiya,
>
> Are there any instructions on how to setup an ArcProxy for use with
> Wayback 0.8.0?

Unfortunately the docs on the web site are for the 1.0 series of Wayback. It's been a while since I set up an arc proxy for the 0.8 series but I'll do my best.

> I've got wayback installed and working with Nutchwax, and
> now I'm trying to get the arcs proxied rather than use the
> LocalRestoreStore. I want the ArcProxy setup on the same machine, and
> the files presented via HTTP on the same machine too (though they are
> stored on an NFS server).

If you only have one set of files served via HTTP, I am not sure you need the proxy. You should be able to get away with just an HTTP resource store; the proxy is only necessary if you have many HTTP servers serving ARC content, and you need a central location to keep track of where ARCs are located.
> I've uncommented the appropriate section of the web.xml, and tried
> running location-client, but I'm not sure I've got an ArcProxy running
> (is it a separate download or part of Wayback?), and I don't know what
> setting to use in the location-client calls or the Remote HTTP1.1
> Resource Store, to get this setup correctly. Is there a document kicking
> around that explains this?

If you have got an arc proxy working correctly, and you have added to it with the location client, you should be able to do a GET on

    http://example.org/proxy-prefix/IAH-20070705232355-00000-example.org.arc.gz

for a known ARC to get it back.

I find it easier to use the following curl command to add ARCs to the arc proxy than using the location-client:

    curl ${LOCATIONDB_URL} -d operation=add -d name=${F} -d url=${BASE_URL}${F}

where LOCATIONDB_URL is the arc proxy URL, F is the name of the arc file, and BASE_URL is the base URL of the HTTP server where you are serving arc files from.

Hope that helps.

best,
Erik Hetzner
;; Erik Hetzner, California Digital Library
;; gnupg key id: 1024D/01DB07E3
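The curl recipe quoted above could be wrapped in a small shell function, sketched here under some assumptions. The function name register_arc and the DRY_RUN switch are hypothetical and not part of Wayback; LOCATIONDB_URL and BASE_URL are the variables from Erik's command. The sketch registers each ARC under its bare file name rather than its full path. That choice is only a guess based on the "Unable to locate(IAH-...arc.gz)" message, which names the file without a directory; nothing in this thread confirms that it is the cause of the error.

```shell
#!/bin/sh
# Hypothetical helper (not part of Wayback): add one ARC file to an
# ArcProxy location DB using the curl call quoted in this thread.
# Expects LOCATIONDB_URL and BASE_URL to be set by the caller.
# If DRY_RUN is non-empty, the command is printed instead of executed.

register_arc() {
    arc_path=$1

    # Assumption: the proxy looks ARCs up by bare file name, so the
    # directory part of the path is stripped before registering.
    arc_name=$(basename "$arc_path")

    if [ -n "${DRY_RUN:-}" ]; then
        printf 'curl %s -d operation=add -d name=%s -d url=%s%s\n' \
            "$LOCATIONDB_URL" "$arc_name" "$BASE_URL" "$arc_name"
    else
        curl "$LOCATIONDB_URL" \
            -d operation=add \
            -d "name=$arc_name" \
            -d "url=$BASE_URL$arc_name"
    fi
}
```

With DRY_RUN=1 set, calling register_arc on a path prints the curl command it would run, which makes it easy to compare the registered name against the name the proxy reports in catalina.out before touching the location DB.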