Having a problem with some links on websites...

Help
2012-12-12
2013-04-09
  • Devender Bindal

    Devender Bindal - 2012-12-12

    Very nice tool!

    When I was trying to get data from https://www.onlinesbi.com with
    setPageLimit(100);
    the crawler would die.

    If I set setPageLimit(13);
    it works, so I think a problem occurs when the crawler reaches this link:
    "https://www.onlinesbi.com/personal/osbi_etdr_estdr_faq.html"
    Please reply as soon as possible.

     
  • Nobody/Anonymous

    ($status == true)
    This condition never becomes true. I don't know why; maybe there is something required that I haven't done, or something else.

    But I worked around it by using a time function:
    record the time initially, and in the loop compute the time consumed; if that time is greater than $this->socketReadTimeout, abort:
    if ($status == true || ($new_time >= $this->socketReadTimeout))

    Thanks
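    The workaround described above can be sketched as follows. This is a minimal, self-contained outline, not the actual library patch: the timeout is held in a plain variable instead of the crawler's $this->socketReadTimeout property, and the socket read that would set $status is only indicated by a comment.

    ```php
    <?php
    // Sketch of the elapsed-time workaround (assumption: timeout in seconds
    // lives in a plain variable rather than $this->socketReadTimeout).
    $socketReadTimeout = 5;          // hypothetical timeout value in seconds
    $start_time = microtime(true);   // record the time initially

    $status = false;
    while (true) {
        // ... read from the socket here; a successful read sets $status = true ...

        $elapsed = microtime(true) - $start_time;  // time consumed so far
        if ($status == true || $elapsed >= $socketReadTimeout) {
            break;  // leave the loop on success OR once the timeout is exceeded
        }
        usleep(100000);  // avoid busy-waiting between read attempts
    }
    ?>
    ```

    The point of the extra condition is simply that the loop can no longer hang forever when $status never becomes true.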

     
  • Nobody/Anonymous

    Hey,

    don't $crawler->setStreamTimeout(100) and $crawler->setConnectionTimeout(100) simply do the trick?
    (Just set them to very high values.)
    If you have a timeout problem with that site, this should help.
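    For reference, a minimal sketch of that suggestion in context. The include path, the class name MyCrawler, and the handleDocumentInfo() body are illustrative; the setter calls are the ones discussed in this thread.

    ```php
    <?php
    // Sketch of the suggested setup; the library include path is an assumption.
    require_once("libs/PHPCrawler.class.php");

    class MyCrawler extends PHPCrawler
    {
        // Called by PHPCrawl for every document the crawler processes.
        function handleDocumentInfo(PHPCrawlerDocumentInfo $DocInfo)
        {
            echo $DocInfo->url . "\n";
        }
    }

    $crawler = new MyCrawler();
    $crawler->setURL("https://www.onlinesbi.com");
    $crawler->setPageLimit(100);          // limit from the original post
    $crawler->setStreamTimeout(100);      // generous read timeout (seconds)
    $crawler->setConnectionTimeout(100);  // generous connect timeout (seconds)
    $crawler->go();
    ?>
    ```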

     
  • Nobody/Anonymous

    Sorry for the late reply.

    I want to crawl thousands of websites and generate the data in as little time as possible, so setting $crawler->setStreamTimeout(100) will not do the trick.

     
  • Nobody/Anonymous

    setStreamTimeout(100) just means that the crawler waits up to 100 seconds for the reply of a website before it aborts
    the request.

    It has NOTHING to do with the speed of the crawler. If a website/webserver is slow, the crawler can't do anything about it.
    It's just a timeout setting, not a "make it slow" setting ;)

     
