Restrict link finding to some URLs
Status: Beta
Brought to you by:
huni
It would be nice to be able to crawl only specific parts of a site, by searching for links only on some of its pages.
This could be implemented with methods that work the same way as the existing filter methods:
$crawler->addURLFilterRule()
$crawler->addURLFollowRule()
Unlike the filter rules, these would not ignore the matching URLs completely: the crawler would still request and process them, but would not look for any links on those pages.
The methods could be named like this, for example:
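For reference, here is roughly how the existing filter methods are set up today (a minimal sketch; the include path, the example URL and the regex patterns are just placeholders):

<?php
// Minimal sketch using the existing PHPCrawler filter methods.
include("libs/PHPCrawler.class.php"); // path is an assumption, adjust to your setup

$crawler = new PHPCrawler();
$crawler->setURL("http://www.example.com/");

// URLs matching a filter rule are never requested at all:
$crawler->addURLFilterRule("#\.(jpg|jpeg|gif|png|css)$#i");

// If follow rules are set, only matching URLs are followed:
$crawler->addURLFollowRule("#^http://www\.example\.com/#i");

$crawler->go();
?>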
$crawler->addLinkSearchURLFilterRule()
$crawler->addLinkSearchURLFollowRule()
They would be part of the "Linkfinding settings" section of the PHPCrawler class.
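A hypothetical usage sketch of the proposed methods (they do not exist in the current PHPCrawler API; the names are taken from this request and the patterns are made up):

<?php
// Hypothetical: these two methods are only proposed here, not implemented.
include("libs/PHPCrawler.class.php"); // path is an assumption

$crawler = new PHPCrawler();
$crawler->setURL("http://www.example.com/");

// Pages matching this rule would still be requested and passed to
// handleDocumentInfo(), but no links would be extracted from them:
$crawler->addLinkSearchURLFilterRule("#/print/#i");

// Inverse variant: only extract links from pages matching this rule:
$crawler->addLinkSearchURLFollowRule("#^http://www\.example\.com/sitemap#i");

$crawler->go();
?>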
Anonymous
As a note, we can already do this by returning early from handleDocumentInfo() when the referer_url matches certain conditions.
But having those filters would speed up the process, since those URLs wouldn't even be parsed for links.
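A rough sketch of that workaround, assuming the usual pattern of subclassing PHPCrawler (the regex, the include path and the URL are placeholders):

<?php
// Workaround: skip the handling of documents whose referer_url matches a condition.
// The page is still downloaded and scanned for links, it is just not processed here.
include("libs/PHPCrawler.class.php");

class MyCrawler extends PHPCrawler
{
  function handleDocumentInfo($DocInfo) // $DocInfo is a PHPCrawlerDocumentInfo object
  {
    // Example condition: ignore documents reached from "print" pages.
    if (preg_match("#/print/#", $DocInfo->referer_url))
    {
      return; // leave the callback early, do nothing with this document
    }

    echo $DocInfo->url . "\n";
  }
}

$crawler = new MyCrawler();
$crawler->setURL("http://www.example.com/");
$crawler->go();
?>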