First of all I would like to thank you and your team for this great open
source search server.
Currently our OSS instance crawls an old webserver with more than 100,000
To obtain a current search index, we have made the following settings in the web-
Fetch interval between re-fetches (days): 1
Number of simultaneous threads: 20
Number of URLs to crawl: 100
Maximum number of URLs per host: 1000000
Delay between each successive access, in seconds: 1
Unfortunately, with this config only one or two threads are used to crawl.
Is there a way to use more threads to crawl?
Thank you for your support!
Currently, to avoid uncontrolled spam, OSS uses one thread per hostname. To use
the 20 threads, you should crawl at least 20 distinct hostnames.
To speed up indexing, you can also remove the delay by entering 0
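To illustrate why a single old webserver keeps most of the 20 threads idle, here is a rough Python sketch of the one-thread-per-hostname rule. It is not OSS code: the function names, the example hostname, and the timing model (each hostname crawled strictly one page after another, with the delay between accesses) are assumptions made only for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_hostname(urls):
    """Group URLs by hostname (hypothetical helper, not part of OSS)."""
    groups = defaultdict(list)
    for url in urls:
        groups[urlsplit(url).hostname].append(url)
    return dict(groups)


def effective_threads(urls, configured_threads):
    """With one thread per hostname, at most one thread per distinct
    hostname can be busy, whatever the configured thread count is."""
    return min(configured_threads, len(group_by_hostname(urls)))


def estimated_seconds(urls, delay_seconds, fetch_seconds=0.0):
    """Rough lower bound on crawl time, assuming each hostname is
    crawled sequentially and the busiest hostname dominates."""
    groups = group_by_hostname(urls)
    busiest = max(len(group) for group in groups.values())
    return busiest * (delay_seconds + fetch_seconds)


# Hypothetical batch: 100 URLs, all on the same (made-up) hostname.
urls = [f"http://old.example.com/page{i}" for i in range(100)]

busy = effective_threads(urls, configured_threads=20)
with_delay = estimated_seconds(urls, delay_seconds=1)
without_delay = estimated_seconds(urls, delay_seconds=0)
print(busy, with_delay, without_delay)
```

Under these assumptions a batch from a single hostname keeps only one thread busy, and the delay setting, rather than the thread count, determines how long the batch takes.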
Thanks for your quick reply.
The value of 0 for the delay works great.
Is there a way to run the crawling and the optimization of the index in
The optimization of the index requires a lot of time.