From: Kaisa K. <kau...@cc...> - 2006-12-20 11:02:12
Thanks for the new nutchwax release 0.8.0. I haven't yet studied it in depth, only test-indexed one collection.

I had a problem with PDF files because a script 'parse-pdf' is missing; I can't find it in nutchwax-0.8.0/bin. Yes, I have xpdf installed in my path, but I guess this script is needed to launch it? Quote from logs =>

'External command /bin/bash ./bin/parse-pdf.sh failed with error: /bin/bash: ./bin/parse-pdf.sh: No such file or directory..'

Otherwise, it's very useful to now have incremental indexing and multiple collections in a single index.

Best,
Kaisa

---------- Forwarded message ----------
Date: Tue, 12 Dec 2006 17:45:20 -0800
From: Michael Stack <st...@ar...>
To: arc...@li...
Subject: [Archive-access-discuss] [ANN] nutchwax-0.8.0 released

This note is to announce the release of NutchWAX 0.8.0. It is available for download from sourceforge at http://sourceforge.net/project/showfiles.php?group_id=118427&package_id=128933&release_id=470852.

NutchWAX 0.8.0 is built against Nutch 0.8.1, released 09/24/2006. A version of this software was recently used to make an index of more than 400 million documents.

See Release Notes [http://archive-access.sourceforge.net/projects/nutch/articles/releasenotes.html] for significant changes and fixes since NutchWAX 0.6.0. The site documentation has also been significantly revised.

Yours,
Internet Archive Webteam