Two related problems, both stemming from the current
textarea-based seed submission in the config screens
combined with giant seed lists:
(1) The text area has a maximum effective contents
size; a giant seed list can exceed this. As a result,
the seeds displayed from a running crawl (for example,
one started from the command line) can be truncated.
Upon making any configuration change, the seeds are
reread from the textarea, and every seed that was cut
off by the truncation is lost. If the scope is derived
from the seeds, URLs that should be in scope are now
ruled out of scope, and thus discarded (with -5000
status) upon recheck.
(2) Alternatively, attempting to submit a too-long seed
list sometimes generates an
ArrayIndexOutOfBoundsException (often with a '200000'
index) inside Jetty code. This is intermittent:
sometimes the exception is thrown, sometimes it is not
(in which case the content is perhaps being silently
truncated).
In general, the textarea-based way of entering seeds,
and having that textarea input clobber anything in a
preexisting file, is seriously flawed and needs to be
fixed.
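One possible direction, as a rough sketch only: refuse to overwrite the seeds file from the form when the file is too large to have been shown in full. Everything here is hypothetical (the class name SeedEditGuard, the method saveSeedsFromForm, and the 64 KB limit are assumptions for illustration, not existing crawler or Jetty code).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical guard; not part of any existing crawler API. */
final class SeedEditGuard {
    /** Assumed cap on what the textarea can safely round-trip. */
    static final long MAX_EDITABLE_BYTES = 64 * 1024;

    /** True if the seeds file is small enough to be shown in full. */
    static boolean isEditableInTextArea(Path seedsFile) throws IOException {
        return !Files.exists(seedsFile)
                || Files.size(seedsFile) <= MAX_EDITABLE_BYTES;
    }

    /**
     * Writes the submitted textarea contents to the seeds file, but only
     * when the existing file could have been displayed untruncated.
     * Returns false (leaving the file untouched) otherwise.
     */
    static boolean saveSeedsFromForm(Path seedsFile, String submitted)
            throws IOException {
        if (!isEditableInTextArea(seedsFile)) {
            return false; // file too big: the form never held the full list
        }
        Files.write(seedsFile, submitted.getBytes(StandardCharsets.UTF_8));
        return true;
    }
}
```

With a check like this, an unrelated configuration change would leave a giant seeds file alone instead of replacing it with whatever the textarea happened to hold.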
Gordon Mohr
Date: 2007-03-14 00:20
Date: 2005-03-02 19:51
Date: 2005-01-17 13:22
| Field | Old Value | Date | By |
|---|---|---|---|
| status_id | Open | 2005-03-02 19:51 | gojomo |
| resolution_id | None | 2005-03-02 19:51 | gojomo |
| close_date | - | 2005-03-02 19:51 | gojomo |
| priority | 9 | 2005-01-17 13:22 | gojomo |