From: Sverre B. <sv...@us...> - 2005-10-21 07:33:50
Update of /cvsroot/archive-access/archive-access/projects/wera/src/articles
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv22634

Modified Files:
    what-is-wera.xml
Log Message:
minor change

Index: what-is-wera.xml
===================================================================
RCS file: /cvsroot/archive-access/archive-access/projects/wera/src/articles/what-is-wera.xml,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** what-is-wera.xml 20 Oct 2005 16:34:11 -0000 1.3
--- what-is-wera.xml 21 Oct 2005 07:33:39 -0000 1.4
***************
*** 70,74 ****
  </figure>
! <para>Explanation of figure 1:</para>
  <itemizedlist>
--- 70,74 ----
  </figure>
! <para>Explanation of above figure:</para>
  <itemizedlist>
***************
*** 113,140 ****
  to the timeline view script (1,2). For that particular version Wera constructs a request to the <emphasis>arcretriever</emphasis>
! containing the name
! of the ARC file where the version resides as well as the offset
! within that file where the version is stored (the ARC name and
! offset are stored in the index). Wera now requests, and receives
! an archived resource (3, 4) from the
! <emphasis>arcretriever</emphasis> (request
! example: <literal>http://localhost:8082/arcretriever/arcretriever?reqtype=getfile&aid=5902508/IAH-20051004171809-00000-test</literal>).
! If the resource is of type
! <literal>text/html</literal> (information in result set
! from NutchWax), a javascript link rewriter is inserted in the
! resource to ensure that links point to Wera rather than out to the
! internet. Before Wera delivers the resource to the users browser,
! header information on content type and encoding is set according
! to values received in the NutchWax result set. This is done to
! ensure that the users browser renders the resource
! correctly.</para>
  <note>
  <para>A resource of type <literal>text/html</literal> will often
! contain inline
! references to images etc. Provided the javascript link rewriter
! does its job on these, the step above will be repeated for each
! of these.</para>
  </note>
  </listitem>
--- 113,136 ----
  to the timeline view script (1,2). For that particular version Wera constructs a request to the <emphasis>arcretriever</emphasis>
! containing the name of the ARC file where the version resides as
! well as the offset within that file where the version is stored
! (the ARC name and offset are stored in the index). Wera now
! requests, and receives an archived resource (3, 4) from the
! <emphasis>arcretriever</emphasis> (request example: <literal>http://localhost:8082/arcretriever/arcretriever?reqtype=getfile&aid=5902508/IAH-20051004171809-00000-test</literal>).
! If the resource is of type <literal>text/html</literal>
! (information in result set from NutchWax), a javascript link
! rewriter is inserted in the resource to ensure that links point to
! Wera rather than out to the internet. Before Wera delivers the
! resource to the users browser, header information on content type
! and encoding is set according to values received in the NutchWax
! result set. This is done to ensure that the users browser renders
! the resource correctly.</para>
  <note>
  <para>A resource of type <literal>text/html</literal> will often
! contain inline references to images etc. Provided the javascript
! link rewriter does its job on these, the step above will be
! repeated for each of these.</para>
  </note>
  </listitem>
***************
*** 146,173 ****
  <title>Practical use</title>
! <para>The original vision for the
! <ulink url="http://nwa.nb.no">NwaToolset</ulink> (the predecessor of Wera)
! was to enable search across the different Nordic Web Archives and
! provide seamless navigation within the different archives. The ability
! to search across the different indexes was solved by the using <ulink url="http://fastsearch.com/">Fast Search & Transfer</ulink>'s multi node architecture. To enable Wera to retrieve a particular document with a given <literal>aid</literal> (Archive ID) from the right archive the
! collection field was introduced
! in the index (also present in the NutchWax index). The Wera config file
! holds the mapping from collection to archive (or rather Wera
! installation).</para>
  <para>Another reason to include the collection field was to ensure that the actual link rewriting was done by the owner of the document. Each archive holder would have to set up their own Wera installation. When
! one Wera was requesting a document from a remote archive, the remote Wera
! should make the necessary changes to the document before delivering it
! to the calling Wera. The reason for this was to make sure that the owner
! had full control over what was delivered to the calling site, thus being
! able to threat the document in accordance with local policies rather
! than the policies of the caller site. The figure below illustrates the
! currently supported use of mapping between collection and archive
! nodes.</para>
  <figure>
--- 142,168 ----
  <title>Practical use</title>
! <para>The original vision for the <ulink
! url="http://nwa.nb.no">NwaToolset</ulink> (the predecessor of Wera) was
! to enable search across the different Nordic Web Archives and provide
! seamless navigation within the different archives. The ability to search
! across the different indexes was solved by the using <ulink url="http://fastsearch.com/">Fast Search & Transfer</ulink>'s multi node architecture. To enable Wera to retrieve a particular document with a given <literal>aid</literal> (Archive ID) from the right archive the
! collection field was introduced in the index (also present in the
! NutchWax index). The Wera config file holds the mapping from collection
! to archive (or rather Wera installation).</para>
  <para>Another reason to include the collection field was to ensure that the actual link rewriting was done by the owner of the document. Each archive holder would have to set up their own Wera installation. When
! one Wera was requesting a document from a remote archive, the remote
! Wera should make the necessary changes to the document before delivering
! it to the calling Wera. The reason for this was to make sure that the
! owner had full control over what was delivered to the calling site, thus
! being able to threat the document in accordance with local policies
! rather than the policies of the caller site. The figure below
! illustrates the currently supported use of mapping between collection
! and archive nodes.</para>
  <figure>
***************
*** 181,195 ****
  </figure>
! <para>In the Wera installation of
! <emphasis>W1</emphasis> the different collections indexed
! in NutchWax are mapped to corresponding Wera installations of
! <emphasis>W2- Wn</emphasis>.
! When the timeline view on W1 encounters a resource located on a
! different node (e.g. the collection mapping points to the Wera
! installation of <emphasis>W2</emphasis>) it requests that resource from
! the Wera installation at <literal>W2</literal>. Wera at
! <literal>W2</literal> fetches the resource from its Retriever and does
! the necessary changes to the file before delivering it to Wera at
! <literal>W1</literal> (e.g. inserts javascript link rewriter or rewrites it server side). When Wera at <literal>W1</literal> receives this file it does an additional
--- 176,188 ----
  </figure>
! <para>In the Wera installation of <emphasis>W1</emphasis> the different
! collections indexed in NutchWax are mapped to corresponding Wera
! installations of <emphasis>W2- Wn</emphasis>. When the timeline view on
! W1 encounters a resource located on a different node (e.g. the
! collection mapping points to the Wera installation of
! <emphasis>W2</emphasis>) it requests that resource from the Wera
! installation at <literal>W2</literal>. Wera at <literal>W2</literal>
! fetches the resource from its Retriever and does the necessary changes
! to the file before delivering it to Wera at <literal>W1</literal> (e.g. inserts javascript link rewriter or rewrites it server side). When Wera at <literal>W1</literal> receives this file it does an additional
***************
*** 205,207 ****
  </section>
  </section>
! </article>
--- 198,200 ----
  </section>
  </section>
! </article>
\ No newline at end of file