If the FTP URI included an invalid host (or username@host
misinterpreted as host), bug #935352 was causing the DNS
precondition to be retried, up to the max-retries, for
the FTP CrawlURI (rather than just failing after one try
confirmed no DNS record existed). After max-retries, the
FTP CrawlURI, still with the DEFERRED status code from its
DNS preconditions, would wind up in the crawl-log with an
indication that multiple tries had occurred. That's been
fixed as part of that bug.

If the FTP URI included a valid host for which DNS lookup
succeeded, the FTP CrawlURI would be set to DEFERRED in
order to let the first DNS attempt occur. However, on
its second try through, its fetch-status would remain
DEFERRED: there was no reset of fetch-status upon a new try.
This had been working OK for HTTP and DNS URIs because, even
with the old/stale fetchStatus, a new trip through usually meant
a newly set status. For FTP URIs, this meant that even though a
new precondition wasn't scheduled, the Frontier would see it
as DEFERRED and let it try again, up to the max retries.
Changing CrawlURI.processingCleanup() to reset fetchStatus
to UNATTEMPTED resolves the problem; FTP URIs whose DNS
succeeds then simply fall through as unattempted, the proper
behavior if no FetchFTP processor is present.
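
The effect of the reset can be illustrated with a small, self-contained
sketch. This is not the actual Heritrix code: the class
FetchStatusResetSketch, its nested Uri class, the countTries() helper,
and the numeric status values are all hypothetical stand-ins, and the
loop is only a rough model of a URI's repeated trips through the
processors and the Frontier when no FetchFTP processor is present.

```java
// Hypothetical model of the stale-fetchStatus problem; not Heritrix source.
final class FetchStatusResetSketch {
    // Placeholder status codes, assumed for illustration only.
    static final int S_UNATTEMPTED = 0;
    static final int S_DEFERRED = -50;

    // Minimal stand-in for a CrawlURI.
    static final class Uri {
        int fetchStatus = S_UNATTEMPTED;
        boolean dnsResolved = false;
        final boolean resetOnCleanup;

        Uri(boolean resetOnCleanup) {
            this.resetOnCleanup = resetOnCleanup;
        }

        // With the fix, cleanup resets the status before the next try.
        void processingCleanup() {
            if (resetOnCleanup) {
                fetchStatus = S_UNATTEMPTED;
            }
        }
    }

    // Counts how many tries an FTP URI with a resolvable host receives.
    static int countTries(boolean resetOnCleanup, int maxRetries) {
        Uri uri = new Uri(resetOnCleanup);
        int tries = 0;
        while (true) {
            tries++;
            // Precondition step: the first trip defers the URI so the
            // DNS lookup can run; the lookup is assumed to succeed.
            if (!uri.dnsResolved) {
                uri.fetchStatus = S_DEFERRED;
                uri.dnsResolved = true;
            }
            // No FTP fetch processor exists, so nothing else sets the status.
            // Frontier step: a DEFERRED URI is retried until max-retries.
            boolean retry = uri.fetchStatus == S_DEFERRED && tries < maxRetries;
            uri.processingCleanup();
            if (!retry) {
                return tries;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + countTries(false, 30));
        System.out.println("with reset:    " + countTries(true, 30));
    }
}
```

In this model, with a max-retries of 30, the URI without the reset is
retried all 30 times because the stale DEFERRED status is never
overwritten, while the URI with the reset takes two trips: one deferred
for DNS, and one that falls through as unattempted.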