Running a broad, polite, site-first crawl with a
DiskIncludedFrontier and 150 toethreads on a red box
(labcrawl02):
(1) Starting the crawl took many minutes while the
disk hashtable was zeroed. During this time the job
appeared as neither pending nor in-progress, which was
confusing.
(2) Once the crawl started, performance never exceeded 10
URIs/second, and was more often below 5 URIs/second.
DiskIncludedFrontier's current weaknesses should be
documented, and improved where possible, perhaps with a
completely different approach.
Gordon Mohr
Frontier
None
Public
Date: 2007-03-14 00:14
Date: 2004-10-27 16:11 Logged In: YES
Date: 2004-10-21 18:20 Logged In: YES
Date: 2004-10-21 00:37 Logged In: YES
Date: 2004-10-20 21:43 Logged In: YES
| Field | Old Value | Date | By |
|---|---|---|---|
| status_id | Open | 2004-10-27 16:11 | gojomo |
| resolution_id | None | 2004-10-27 16:11 | gojomo |
| close_date | - | 2004-10-27 16:11 | gojomo |
| priority | 6 | 2004-10-20 21:43 | gojomo |
| assigned_to | nobody | 2004-10-20 21:43 | gojomo |
| priority | 5 | 2004-09-01 21:57 | gojomo |