
Is it possible to randomly choose a proxy from proxy pool for each page?

Help
Anonymous
2015-05-25
2017-10-18
  • Anonymous

    Anonymous - 2015-05-25

    I was able to specify a proxy for each child process, but I am wondering whether the crawler can be modified so that it uses a different proxy for each page rather than one proxy for the entire child process. The idea is to have some kind of proxy pool and, on each page request, take a random proxy from it.

    Also, is there any way to handle the exception that is thrown when the crawler cannot connect to the proxy? As far as I can see, if a child cannot connect to the proxy for any page, it throws an exception and aborts. I would like to control this behavior so that the child either skips that page and moves on to the next one, or loads the page without using the proxy.

     

    Last edit: Anonymous 2015-05-25
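
For illustration only, the behaviour asked for above could look roughly like this. It is a minimal sketch in Python using only the standard library, not PHPCrawl's API; the names `PROXY_POOL`, `open_via`, `fetch_page` and the `on_failure` switch are all made up for the example, and the proxy addresses are placeholders.

```python
import random
import urllib.request

# Hypothetical pool; the addresses are documentation placeholders, not real proxies.
PROXY_POOL = ["http://203.0.113.10:3128", "http://203.0.113.11:3128"]


def open_via(url, proxy, timeout=10):
    """Fetch url through the given proxy, or directly when proxy is None."""
    # An empty mapping tells ProxyHandler to use no proxy at all.
    proxies = {"http": proxy, "https": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def fetch_page(url, pool=PROXY_POOL, fetch=open_via, on_failure="direct", rng=random):
    """Pick a random proxy for this one page.

    If the request fails, either retry once without a proxy ("direct")
    or give up on the page and return None ("skip").
    """
    proxy = rng.choice(pool)
    try:
        return fetch(url, proxy)
    except OSError:
        # Broad on purpose: URLError (which also covers HTTP error statuses)
        # and socket errors are all OSError subclasses.
        if on_failure == "skip":
            return None
        return fetch(url, None)
```

Calling `fetch_page(url)` once per page gives each page its own random proxy, and a dead proxy no longer aborts the whole run. Whether PHPCrawl exposes a hook where equivalent logic could be placed is not something this thread confirms.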
  • Anonymous

    Anonymous - 2015-07-30

    I use haproxy, and load-balance connections coming in from phpcrawl across my proxies that way. Maybe you'd have luck with that method.
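
A setup along those lines might look roughly like the fragment below. This is only a sketch under assumptions: the section names, the listening port and the upstream proxy addresses are placeholders, and the timeouts are arbitrary. The crawler would be pointed at 127.0.0.1:3128 as its single proxy, and HAProxy would spread incoming connections across the pool.

```
defaults
    mode tcp
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend proxy_in
    # The crawler uses this address as its one and only proxy.
    bind 127.0.0.1:3128
    default_backend proxy_pool

backend proxy_pool
    # Rotate across the upstream proxies; "check" skips ones that are down.
    balance roundrobin
    server proxy1 203.0.113.10:3128 check
    server proxy2 203.0.113.11:3128 check
```

In TCP mode the balancing happens per connection, so pages only get different proxies if the crawler opens a new connection for each page request.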

     
  • Anonymous

    Anonymous - 2017-10-18

    This is a good feature that I'm also looking for.

     
