Limit pages crawled per domain?

Help
gyuri79
2014-04-26
2014-04-29
  • gyuri79

    gyuri79 - 2014-04-26

    I would like to crawl the web and limit the crawl to x pages per domain. How can I do that with phpcrawl? It seems I can only set a global page limit: http://phpcrawl.cuab.de/classreferences/PHPCrawler/method_detail_tpl_method_setPageLimit.htm

    Last edit: gyuri79 2014-04-26
  • Anonymous

    Anonymous - 2014-04-29

    Yes, you are right, there is no option to limit the number of crawled pages per domain.

    Maybe you could crawl a single domain at a time with a page limit, collect all links to external domains from it (and store them in a database or similar), and then start a new instance of the crawler for each of these domains in a loop?
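
    A rough sketch of that loop, written in plain Python rather than against the phpcrawl API, might look like the following. All names here (`crawl_domain`, `crawl_web`, `fetch_links`, `max_domains`) are made up for illustration, and `fetch_links` stands in for whatever actually downloads a page and returns the absolute URLs found on it. With phpcrawl, each `crawl_domain` call would roughly correspond to one crawler instance with its own page limit.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_domain(start_url, fetch_links, page_limit):
    """Crawl one domain up to page_limit pages.

    fetch_links is a hypothetical callback: url -> iterable of absolute URLs.
    Returns (visited pages, links pointing to other domains).
    """
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited, external = [], []
    while queue and len(visited) < page_limit:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if urlparse(link).netloc == domain:
                if link not in seen:  # queue each internal page only once
                    seen.add(link)
                    queue.append(link)
            else:
                external.append(link)  # remember for later per-domain crawls
    return visited, external


def crawl_web(seed_url, fetch_links, page_limit, max_domains):
    """Run one limited crawl per domain, following external links to new domains."""
    domain_queue = deque([seed_url])
    known_domains = {urlparse(seed_url).netloc}
    pages_by_domain = {}
    while domain_queue and len(pages_by_domain) < max_domains:
        start = domain_queue.popleft()
        visited, external = crawl_domain(start, fetch_links, page_limit)
        pages_by_domain[urlparse(start).netloc] = visited
        for link in external:
            domain = urlparse(link).netloc
            if domain and domain not in known_domains:
                known_domains.add(domain)
                domain_queue.append(link)  # first link seen becomes the start URL
    return pages_by_domain
```

    The in-memory queue and set play the role of the database mentioned above. For a large crawl you would probably want to persist them, so that a crashed run can be resumed.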

    But feel free to add a request for a "per-domain page limit" to the list of feature requests! (http://sourceforge.net/p/phpcrawl/feature-requests/?source=navbar)
