Update of /cvsroot/archive-access/archive-access/projects/nutch/src/java/org/archive/access/nutch
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32095/src/java/org/archive/access/nutch
Modified Files:
Arc2Segment.java
Log Message:
Committing last changes for the nutchwax point release. The changes below are
untested; committing now so I can test later tonight from home.
* conf/nutch-site.xml.template
Option to enable indexing of redirects.
* src/java/org/archive/access/nutch/Arc2Segment.java
If enabled, index redirects too.
* src/web/search.jsp
Add Google-like paging (listing of page numbers). Doesn't recover
like Google's when we go past the actual count of hits, but for most usage
it's fine. Can improve upon it later.
Index: Arc2Segment.java
===================================================================
RCS file: /cvsroot/archive-access/archive-access/projects/nutch/src/java/org/archive/access/nutch/Arc2Segment.java,v
retrieving revision 1.31
retrieving revision 1.32
diff -C2 -d -r1.31 -r1.32
*** Arc2Segment.java 21 Oct 2005 01:29:29 -0000 1.31
--- Arc2Segment.java 23 Nov 2005 23:56:16 -0000 1.32
***************
*** 98,101 ****
--- 98,108 ----
}
}
+ private static boolean indexRedirects = false;
+ static {
+ String tmp = NutchConf.get().get("archive.index.redirects");
+ if (tmp != null && tmp.toLowerCase().equals("true")) {
+ indexRedirects = true;
+ }
+ }
/** Get the MimeTypes resolver instance. */
***************
*** 151,155 ****
for (Iterator i = arc.iterator(); i.hasNext();) {
ARCRecord rec = (ARCRecord)i.next();
! if (rec.getStatusCode() != 200) {
continue;
}
--- 158,165 ----
for (Iterator i = arc.iterator(); i.hasNext();) {
ARCRecord rec = (ARCRecord)i.next();
! if (rec.getStatusCode() != 200 ||
! (this.indexRedirects &&
! rec.getStatusCode() >= 300 &&
! rec.getStatusCode() < 400)) {
continue;
}
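
Note on the hunk above: as written, the new condition still skips every 3xx record, because `rec.getStatusCode() != 200` is already true for a redirect and the `||` short-circuits. The following is a minimal sketch of a predicate that would keep 200s always and 3xx records only when the option is on. The class name `RedirectFilter` and the helpers `parseFlag` and `shouldSkip` are hypothetical, introduced here only for illustration; they are not part of Arc2Segment.java or of this commit.

```java
public class RedirectFilter {
    /**
     * Mirrors the static initializer in the diff: only the string "true"
     * (in any letter case) enables the option; null or anything else leaves it off.
     */
    static boolean parseFlag(String value) {
        return value != null && value.equalsIgnoreCase("true");
    }

    /**
     * Returns true when a record with the given HTTP status should be skipped.
     * A record is kept if it is a 200, or if it is a 3xx and redirects are enabled.
     */
    static boolean shouldSkip(int status, boolean indexRedirects) {
        boolean ok = status == 200;
        boolean redirect = status >= 300 && status < 400;
        return !(ok || (indexRedirects && redirect));
    }

    public static void main(String[] args) {
        boolean enabled = parseFlag("true");
        int[] statuses = {200, 301, 302, 404};
        for (int status : statuses) {
            System.out.println(status + " skipped with redirects enabled: "
                + shouldSkip(status, enabled));
        }
    }
}
```

In the loop over ARC records, the equivalent check would be along the lines of `if (shouldSkip(rec.getStatusCode(), indexRedirects)) continue;`, assuming the rest of the loop body can handle records with a 3xx status.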