Update of /cvsroot/archive-access/archive-access/projects/nutch/src/articles
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv22684/articles
Added Files:
releasenotes.xml
Log Message:
* articles/releasenotes.xml
Start some release notes.
--- NEW FILE: releasenotes.xml ---
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
<title>Nutchwax Release Notes</title>
<articleinfo>
<date>$Date: 2005/10/17 20:57:03 $</date>
<authorgroup>
<corpauthor>Internet Archive</corpauthor>
</authorgroup>
</articleinfo>
<sect1 id="1_6_0">
<title>Release 0.4.0 - NOT YET RELEASED</title>
<abstract>
<para>TODO</para>
</abstract>
<sect2 id="0_4_0_limitations">
<title>Known Limitations/Issues</title>
<sect3 id="bdb_nfs"><title>java.io.IOException: No locks available</title>
<para>Bdb reports 'No locks available' when the crawler is built
or run on an NFS mount. The workaround is to avoid building or running
on an NFS-mounted volume.
</para>
</sect3>
</sect2>
<sect2 id="0_4_0_changes">
<title>Changes</title>
<sect3 id="postselector">
<title>Postselector</title>
<para>The Postselector has been refactored out of existence.
Its responsibilities have been parcelled out to two new Processors:
LinksScoper and FrontierScheduler. LinksScoper is responsible for
scope checking of extracted links. FrontierScheduler does the
scheduling of URIs with the Frontier.
</para>
<para>This change was made so that processors can be introduced
between the scope-checking and Frontier-scheduling steps.
</para>
<para>Because of this change, order files from Heritrix 1.4.0 or
earlier must be updated before they can be used with Heritrix 1.6.0:
replace each Postselector reference with LinksScoper and
FrontierScheduler references. (A reference to the non-existent
Postselector in an order file usually shows up as a -50 fetch status
in crawl.log.)
</para>
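<para>As a rough sketch of the order-file update, the fragment below
shows a Postselector entry replaced by LinksScoper and
FrontierScheduler entries. The surrounding map name, the
<literal>name</literal> attribute values, the
<literal>org.archive.crawler.postprocessor</literal> class names and
the <literal>enabled</literal> setting are assumptions based on a
typical Heritrix order file, not taken from these notes; check them
against your own order file before editing.
</para>
<programlisting><![CDATA[<!-- Before (Heritrix 1.4.0 or earlier), assumed layout:
<newObject name="Postselector"
    class="org.archive.crawler.postprocessor.Postselector">
  <boolean name="enabled">true</boolean>
</newObject>
-->

<!-- After (Heritrix 1.6.0), assumed layout: scope checking first,
     then scheduling with the Frontier. -->
<map name="post-processors">
  <newObject name="LinksScoper"
      class="org.archive.crawler.postprocessor.LinksScoper">
    <boolean name="enabled">true</boolean>
  </newObject>
  <!-- Any processor that must run between scope checking and
       scheduling would be placed here. -->
  <newObject name="Scheduler"
      class="org.archive.crawler.postprocessor.FrontierScheduler">
    <boolean name="enabled">true</boolean>
  </newObject>
</map>]]></programlisting>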
</sect3>
</sect2>
</sect1>
</article>