From: Sverre B. <sv...@us...> - 2005-10-20 13:57:00
Update of /cvsroot/archive-access/archive-access/projects/wera/src/articles
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8085

Modified Files:
    what-is-wera.xml
Log Message:
Updated after comments from JohnErik

Index: what-is-wera.xml
===================================================================
RCS file: /cvsroot/archive-access/archive-access/projects/wera/src/articles/what-is-wera.xml,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** what-is-wera.xml 20 Oct 2005 13:28:01 -0000 1.1
--- what-is-wera.xml 20 Oct 2005 13:56:49 -0000 1.2
***************
*** 142,177 ****
  <title>Practical use</title>

- <para>The figure below shows a more likely setup of Wera and it's
- surroundings,- Web Archive, NutchWax and Wera are located on different
- machines. All the Arc files recides at host A1 to An, and all these has
- on beforehand been indexed by NutchWax (see NutchWax documentation for
- details on indexing). </para>
-
- <figure>
- <title>Wera interfacing several archive nodes - Currently
- unsupported</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="images/wera2.png" />
- </imageobject>
- </mediaobject>
- </figure>
-
- <para>So how do we make Wera aware which Arc Retriever to fetch a given
- resource from prior to displaying it in the timeline view? Each resource
- in a ARC collection will have to be marked with collection name in the
- index. E.g in the example figure all resources in ARC files on A1 would
- be tagged A1, resource on A2 is tagged A2 etc. In the Wera configuration
- each collection has to be mapped to a given arc retriever.</para>
-
- <note>
- <para>Currently Wera does not support the direct mapping between
- collection and retriever. Such mapping will be added in a later
- release. However, it does support mapping between collection and other
- Wera installations. See below for background and details on
- this.</para>
- </note>
-
  <para>The original vision for the NwaToolset (the predecessor of Wera)
  was to enable search across the different Nordic Web Archives and
--- 142,145 ----
***************
*** 180,202 ****
  url="http://fastsearch.com/">Fast Search & Transfer</ulink>'s multi
  node architecture. To enable Wera to retrieve a particular document with
! a given aid from the right archive the collection field was introduced.
! The Wera config file would hold the mapping from collection to archive
! (or rather Wera installation).</para>

  <para>Another reason to include the collection field was to ensure that
  the actual link rewriting was done by the owner of the document. Each
! archive holder would have to set up their own NwaToolset Access Module.
! When one Access module was requesting a document from a remote archive
! the remote Access module should make the necessary changes to the
! document before delivering it to the calling Access Module. The reason
! for this was to make sure that the owner had full control over what was
! delivered to the calling site, thus being able to threat the document in
! accordance with local policies rather than the policies of the caller
! site. The figure below illustrates the currently supported use of
! mapping between collection and archive nodes.</para>

  <figure>
! <title>Wera interfacing several archive nodes - Currently
! supported</title>

  <mediaobject>
--- 148,170 ----
  url="http://fastsearch.com/">Fast Search & Transfer</ulink>'s multi
  node architecture. To enable Wera to retrieve a particular document with
! a given aid from the right archive the collection field was introduced
! in the index (also present in the NutxhWax index). The Wera config file
! holds the mapping from collection to archive (or rather Wera
! installation).</para>

  <para>Another reason to include the collection field was to ensure that
  the actual link rewriting was done by the owner of the document. Each
! archive holder would have to set up their own Wera installation. When
! one Wera was requesting a document from a remote archive the remote Wera
! should make the necessary changes to the document before delivering it
! to the calling Wera. The reason for this was to make sure that the owner
! had full control over what was delivered to the calling site, thus being
! able to threat the document in accordance with local policies rather
! than the policies of the caller site. The figure below illustrates the
! currently supported use of mapping between collection and archive
! nodes.</para>

  <figure>
! <title>Wera interfacing several archive nodes</title>

  <mediaobject>
***************
*** 218,226 ****
  the links point to itself rather than to W2's Wera.</para>

! <para>Of course, all the Wera might recide on the same host, which in
! effect will be the same solution as described in the previous example
! (direct mapping between collection and retriever). It will however
! introduce some extra overhead, with unnecessary http traffic and file
! parsing.</para>
  </section>
  </section>
--- 186,195 ----
  the links point to itself rather than to W2's Wera.</para>

! <para>In a real-life large scale Web Archive where the ARC files are
! distributed across tens or hundreds of hosts it will not be practical to
! set up one Wera installation for each of these. A better solution will
! be to introduce communication between the different retrievers or have
! one front-end retriever interfacing all the other retrievers within one
! archive. This has to be added in a later release of Wera.</para>
  </section>
  </section>