It would be a nice addition if OSS could crawl only the pages listed in a specific XML/RSS file (like the RSS Crawler in Searchblox).
We already had one website where we only needed to crawl/index a few specific pages: put them into an XML file, point OSS to this file, and only those pages get indexed (no further crawling).
The XML file should be configurable, as sitemap.xml most likely contains all pages of the website.
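
To illustrate what I mean, here is a rough sketch in Python of how such a URL list could be read. This is not OSS code. The function name `extract_urls`, the sample feed, and the example.com URLs are made up for illustration, and I am assuming the file is either RSS 2.0 style (`<item><link>`) or sitemap style (`<url><loc>`). The crawler would then fetch exactly the returned URLs and follow no links from them.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix, e.g. '{http://...}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def extract_urls(xml_text):
    """Return the page URLs listed in an RSS- or sitemap-style XML document.

    Only <link> inside <item> (RSS) and <loc> inside <url> (sitemap) are
    collected, so the channel-level <link> of a feed is ignored.
    """
    root = ET.fromstring(xml_text)
    urls = []
    for entry in root.iter():
        if local_name(entry.tag) not in ("item", "url"):
            continue
        for child in entry:
            if local_name(child.tag) in ("link", "loc"):
                url = (child.text or "").strip()
                if url and url not in urls:  # skip empty and duplicate entries
                    urls.append(url)
    return urls


# Hypothetical feed listing the only two pages that should be indexed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Pages to index</title>
    <link>https://www.example.com/</link>
    <item><link>https://www.example.com/products.html</link></item>
    <item><link>https://www.example.com/contact.html</link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for url in extract_urls(SAMPLE_RSS):
        print(url)
```

The path to the file would be a setting of the crawl, so it can point to a hand-made list instead of the full sitemap.xml.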