Limit pages crawled per domain?

Help
gyuri79
2014-04-26
2014-04-29
  • gyuri79

    gyuri79 - 2014-04-26

    I would like to crawl the web and limit the crawl to x pages per domain. How can I do that with phpcrawl? As far as I can tell, I can only set a global page limit: http://phpcrawl.cuab.de/classreferences/PHPCrawler/method_detail_tpl_method_setPageLimit.htm

     

    Last edit: gyuri79 2014-04-26
  • Anonymous

    Anonymous - 2014-04-29

    Yes, you are right, there is no option to limit the number of crawled pages per domain.

    Maybe you could crawl a single domain at a time with a page limit, collect all links to external domains from it (and store them in a database or something similar), and then start a new instance of the crawler for each of these domains in a loop?

    But feel free to add a request for a "per-domain page limit" to the list of feature requests! (http://sourceforge.net/p/phpcrawl/feature-requests/?source=navbar)
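
The workaround described above (crawl one domain at a time with a page limit, collect the external domains it links to, then start a fresh crawler for each) could be sketched roughly as follows. This is only a sketch under assumptions, not part of phpcrawl: the function and class names (`externalHosts`, `crawlPerDomain`, `LinkCollectingCrawler`, `crawlHostWithPhpcrawl`) and the include path are made up for illustration. It also assumes that `links_found` entries carry a `url_rebuild` key, as described in the PHPCrawlerDocumentInfo class reference, so please check that against your phpcrawl version. The queue is kept in memory here; a database table could take its place.

```php
<?php
// include "libs/PHPCrawler.class.php"; // assumed install path, adjust to your setup

// Returns the distinct lower-cased hosts found in $urls, excluding $ownHost.
function externalHosts(array $urls, $ownHost)
{
    $hosts = array();
    foreach ($urls as $url) {
        $host = parse_url($url, PHP_URL_HOST); // null/false if there is no host part
        if (is_string($host) && $host !== '' && strcasecmp($host, $ownHost) !== 0) {
            $hosts[] = strtolower($host);
        }
    }
    return array_values(array_unique($hosts));
}

// Crawls hosts one at a time. $crawlHost is any callable that takes
// (host, pageLimit) and returns the list of link URLs it found.
// $maxDomains caps the number of hosts so the loop always terminates.
function crawlPerDomain(array $seedHosts, $pagesPerDomain, $maxDomains, $crawlHost)
{
    $queue = array_map('strtolower', $seedHosts);
    $seen = array_fill_keys($queue, true);
    $crawled = array();
    while ($queue && count($crawled) < $maxDomains) {
        $host = array_shift($queue);
        $links = call_user_func($crawlHost, $host, $pagesPerDomain);
        $crawled[] = $host;
        foreach (externalHosts($links, $host) as $newHost) {
            if (!isset($seen[$newHost])) { // queue every host only once
                $seen[$newHost] = true;
                $queue[] = $newHost;
            }
        }
    }
    return $crawled;
}

// Hypothetical phpcrawl wiring; only declared when the library is loaded.
if (class_exists('PHPCrawler')) {
    class LinkCollectingCrawler extends PHPCrawler
    {
        public $collectedLinks = array();

        public function handleDocumentInfo($DocInfo)
        {
            // Assumption: each entry of links_found has a 'url_rebuild' key.
            foreach ($DocInfo->links_found as $link) {
                $this->collectedLinks[] = $link['url_rebuild'];
            }
        }
    }
}

// One crawler instance per host, each with its own page limit.
function crawlHostWithPhpcrawl($host, $pageLimit)
{
    $crawler = new LinkCollectingCrawler();
    $crawler->setURL($host);
    $crawler->setPageLimit($pageLimit); // the limit applies to this instance only
    $crawler->addContentTypeReceiveRule('#text/html#');
    $crawler->go();
    return $crawler->collectedLinks;
}
```

With the library included, something like `crawlPerDomain(array('www.example.com'), 50, 20, 'crawlHostWithPhpcrawl');` would crawl at most 50 pages on each of at most 20 hosts (`www.example.com` is a placeholder). This relies on the crawler staying on the start host, which as far as I know is the default follow mode. Passing the per-host crawl in as a callable keeps the queue logic independent of phpcrawl, so it can be tried out with a stub first.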

     
