How do I crawl a page, get the links off that page and only crawl those pages and then stop?
I am using:
$crawler->setFollowMode(0); // follow links to any domain
and haven't defined $crawler->setPageLimit() at all. // that setting appears to stop after an aggregate total number of pages has been crawled - not what I want.
Is there a way to say "crawl one page, get its links, follow one level deep for each of those links, and then stop"?
Thanks.
This can't be done directly with phpcrawl (yet), but you can use the referer information inside the handlePageData() method to accomplish it:
// ...
function handlePageData(&$page_data)
{
    // Let the crawler stop if the referer isn't the entry-page anymore
    if ($page_data["referer_url"] != "" && $page_data["referer_url"] != "http://www.urltopage.de/")
        return -1;
}
// ...
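For context, here is a minimal sketch of how that override might be wired up, following phpcrawl's usual pattern of subclassing PHPCrawler. The class name is made up, and http://www.urltopage.de/ is just the placeholder entry URL from the snippet:

class OneLevelCrawler extends PHPCrawler
{
    function handlePageData(&$page_data)
    {
        // Stop the crawl once we reach pages that were not
        // linked directly from the entry page
        if ($page_data["referer_url"] != "" &&
            $page_data["referer_url"] != "http://www.urltopage.de/")
            return -1;
    }
}

$crawler = new OneLevelCrawler();
$crawler->setURL("http://www.urltopage.de/");
$crawler->setFollowMode(0); // follow links to any domain
$crawler->go();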
Could you please give an example for second-level depth? Thanks.
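The same referer trick can be stretched to two levels by remembering which URLs were linked from the entry page. This is only a sketch under assumptions: the class name TwoLevelCrawler, the $first_level_urls array, and the entry URL are all illustrative, and it relies on phpcrawl working through pages roughly breadth-first, since return -1 aborts the whole crawl. (Newer phpcrawl releases also ship a built-in depth limit, setCrawlingDepthLimit(); if your version has it, that is the simpler route.)

// Illustrative sketch only -- names and URL are placeholders
class TwoLevelCrawler extends PHPCrawler
{
    var $entry_url = "http://www.urltopage.de/"; // placeholder entry page
    var $first_level_urls = array();             // URLs linked from the entry page

    function handlePageData(&$page_data)
    {
        if ($page_data["referer_url"] == "" ||
            $page_data["referer_url"] == $this->entry_url)
        {
            // Entry page or level 1: remember the URL so that pages
            // linked from it pass the level-2 check below
            $this->first_level_urls[] = $page_data["url"];
            return;
        }

        if (in_array($page_data["referer_url"], $this->first_level_urls))
        {
            // Level 2: referer is a first-level page -- crawl it, but no deeper
            return;
        }

        // Level 3 or deeper: stop the crawler
        return -1;
    }
}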