From: Michael S. <sta...@us...> - 2005-10-17 20:57:14
Update of /cvsroot/archive-access/archive-access/projects/nutch/src/articles
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv22684/articles

Added Files:
    releasenotes.xml
Log Message:
* articles/releasenotes.xml
    Startup some release notes.

--- NEW FILE: releasenotes.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
    "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
  <title>Nutchwax Release Notes</title>
  <articleinfo>
    <date>$Date: 2005/10/17 20:57:03 $</date>
    <authorgroup>
      <corpauthor>Internet Archive</corpauthor>
    </authorgroup>
  </articleinfo>
  <sect1 id="1_6_0">
    <title>Release 0.4.0 - NOT YET RELEASED</title>
    <abstract>
      <para>TODO</para>
    </abstract>
    <sect2 id="0_4_0_limitations">
      <title>Known Limitations/Issues</title>
      <sect3 id="bdb_nfs"><title>java.io.IOException: No locks available</title>
        <para>Bdb will complain 'No locks available' when the crawler is
        built or run on an NFS mount. The workaround is to not run on an
        NFS-mounted volume.
        </para>
      </sect3>
    </sect2>
    <sect2 id="0_4_0_changes">
      <title>Changes</title>
      <sect3 id="postselector">
        <title>Postselector</title>
        <para>The Postselector has been refactored out of existence. Its
        responsibilities have been parcelled out to two new Processors:
        LinksScoper and FrontierScheduler. LinksScoper is responsible for
        scope checking of extracted links. FrontierScheduler does the
        scheduling of URIs with the Frontier.
        </para>
        <para>This change was made to allow the introduction of processors
        between the scope-checking and Frontier-scheduling steps.
        </para>
        <para>Because of this change, order files from Heritrix 1.4.0 or
        earlier will need to be updated -- Postselector references replaced
        by LinksScoper and FrontierScheduler references -- before they can
        be used with Heritrix 1.6.0. (Referencing a non-existent
        Postselector in an order file usually shows as a -50 fetch status
        in crawl.log.)
        </para>
      </sect3>
    </sect2>
  </sect1>
</article>