From: Michael S. <sta...@us...> - 2005-10-22 01:27:11
Update of /cvsroot/archive-access/archive-access/projects/nutch/src/articles
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19865/src/articles

Modified Files:
	releasenotes.xml
Log Message:
* src/articles/releasenotes.xml
    Add fixes and adds.

Index: releasenotes.xml
===================================================================
RCS file: /cvsroot/archive-access/archive-access/projects/nutch/src/articles/releasenotes.xml,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** releasenotes.xml 18 Oct 2005 23:21:11 -0000 1.2
--- releasenotes.xml 22 Oct 2005 01:27:03 -0000 1.3
***************
*** 13,20 ****
  </articleinfo>

! <sect1 id="1_6_0">
! <title>Release 0.4.0 - NOT YET RELEASED</title>
  <abstract>
! <para>TODO</para>
  </abstract>
  <para>NutchWAX has been built against Nutch 0.7.0 (There seem to be issues
--- 13,20 ----
  </articleinfo>

! <sect1 id="0_4_0">
! <title>Release 0.4.0 - 10/10/21</title>
  <abstract>
! <para>Bug fixes.</para>
  </abstract>
  <para>NutchWAX has been built against Nutch 0.7.0 (There seem to be issues
***************
*** 25,32 ****
  <sect2 id="0_4_0_limitations">
  <title>Known Limitations/Issues</title>
! <sect3 id="bdb_nfs"><title>java.io.IOException: No locks available</title>
! <para>Bdb will complain 'No locks available' when crawler is being
! built/run on an NFS mount. Workaround is not run on an NFS-mounted
! volume.
  </para>
  </sect3>
--- 25,36 ----
  <sect2 id="0_4_0_limitations">
  <title>Known Limitations/Issues</title>
! <para>General limitations of the current platform are listed in
! Section 7. <emphasis>Observations</emphasis> on Page 9 of
! <ulink url="http://archive-access.sourceforge.net/projects/nutch/iwaw/iwaw-wacsearch.pdf">Full Text Search of Web Archive Collections</ulink>.
! </para>
! <sect3 id="pdf"><title>PDFs</title>
! <para>PDFs whose size is greater than 10 MB are skipped completely.
! Legitimate PDFs whose HTTP Content-Length does not strictly agree with
! the ARC length are also skipped.
  </para>
  </sect3>
***************
*** 34,56 ****
  <sect2 id="0_4_0_changes">
  <title>Changes</title>
! <sect3 id="postselector">
! <title>Postselector</title>
! <para>The Postselector has been refactored out of existence.
! Its responsibilities have been parcelled out to two new Processors:
! LinksScoper and FrontierScheduler. LinksScoper is responsible for
! scope checking of extracted links. FrontierScheduler does the
! scheduling of URIs with the Frontier.
! </para>
! <para>This change was done to allow introduction of processors
! between scope checking and Frontier scheduling steps.
! </para>
! <para>Because of this change, order files from 1.4.0 Heritrix or
! before will need to be updated -- Postselector references replaced
! by LinkScoper and FrontierScheduler references -- before they
! can be used with Heritrix 1.6.0 (Referencing a non-existent
! Postselector in an order file usually shows as -50 fetch status
! in crawl.log).
! </para>
! </sect3>
  </sect2>
  </sect1>
--- 38,74 ----
  <sect2 id="0_4_0_changes">
  <title>Changes</title>
! <para><table>
! <title>Bugs/Features</title>
! <tgroup cols="6">
! <thead>
! <row>
! <entry>ID</entry>
! <entry>Type</entry>
! <entry>Summary</entry>
! <entry>Open Date</entry>
! <entry>By</entry>
! <entry>Filer</entry>
! </row>
! </thead>
!
! <tbody>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681140&aid=1313214">1313214</ulink></entry><entry>Add</entry><entry>Dedup'ing that considers collection field.</entry><entry>2005-10-04 12:46</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681140&aid=1309781">1309781</ulink></entry><entry>Add</entry><entry>Add in skipping certain types if > size</entry><entry>2005-09-30 14:01</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681140&aid=1244843">1244843</ulink></entry><entry>Add</entry><entry>Allow querying on mime primary and sub type</entry><entry>2005-07-25 16:13</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681140&aid=1280825">1280825</ulink></entry><entry>Add</entry><entry>Make nutch merge segment work against nutchwax segments</entry><entry>2005-09-02 10:00</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
!
!
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1247571">1247571</ulink></entry><entry>Fix</entry><entry>Items not getting indexed</entry><entry>2005-07-29 09:55</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1312212">1312212</ulink></entry><entry>Fix</entry><entry>bad xml chars in search results</entry><entry>2005-10-03 12:11</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1244894">1244894</ulink></entry><entry>Fix</entry><entry>Cannot query for non-ISO8859 characters</entry><entry>2005-07-25 18:38</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1312208">1312208</ulink></entry><entry>Fix</entry><entry>Query time encoding issues</entry><entry>2005-10-03 12:11</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1312217">1312217</ulink></entry><entry>Fix</entry><entry>Not indexing images</entry><entry>2005-10-03 12:18</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1244875">1244875</ulink></entry><entry>Fix</entry><entry>exacturl encoding not working</entry><entry>2005-07-25 17:21</entry><entry>stack-sf</entry><entry>stack-sf</entry></row>
! <row><entry><ulink url="http://sourceforge.net/tracker/index.php?func=detail&group_id=118427&atid=681137&aid=1281697">1281697</ulink></entry><entry>Fix</entry><entry>searching czech words not working</entry><entry>2005-09-04 10:36</entry><entry>stack-sf</entry><entry>kranach</entry></row>
!
! </tbody>
! </tgroup>
! </table></para>
!
  </sect2>
  </sect1>