Sorry, I don't fully understand the Crawler > Crawl process tab.
Specifically, I'm confused by "Number of URLs to crawl" and "Maximum number of URLs per host",
and by "RunOnce" versus "RunForever".
My site has some 8,000+ URLs (mostly webshop items from a database).
I want OSS to:
1. check for new pages
2. delete gone pages
3. update page information
I want this once a day around midnight.
What would the optimal settings for the above parameters be?
RunOnce or RunForever?
Number of URLs to crawl... e.g. 10,000?
Max number of URLs per host... also 10,000, as I have only one host?
Fetch interval: once per day... but at which time?
Can anyone clarify this?
How would I crawl some 1,000 pages?
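In case the built-in scheduler can't pin the start time, one workaround I'd consider is triggering the crawl from the operating system's cron instead. This is only a sketch: the `start-oss-crawl.sh` wrapper script and its path are hypothetical, and it assumes OSS exposes some command-line or REST way to kick off a crawl run.

```shell
# Crontab entry: run the crawl once a day at midnight.
# Fields: minute hour day-of-month month day-of-week command
# /opt/oss/start-oss-crawl.sh is a hypothetical wrapper that would
# need to call whatever start-crawl mechanism OSS actually provides.
0 0 * * * /opt/oss/start-oss-crawl.sh >> /var/log/oss-crawl.log 2>&1
```

With RunOnce semantics, each midnight invocation would then do a single pass over the site and exit, rather than crawling continuously.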