Content not received

Moin Haque
2012-12-13
2016-07-22
  • Moin Haque

    Moin Haque - 2012-12-13

    Hi all

    I am encountering a problem using the crawler where for some sites, I get loads of the 'content not received' error message. Here are a couple of examples and some stats:

    http://store.makro.co.uk/
    Links followed: 4683, Content not received: 3823

    http://www.brakesce.co.uk/
    Links followed: 2429, Content not received: 2164

    I was wondering if anyone has any idea how to stop this from happening. I have crawled other sites using the same script and server, and it's been fine. Also, if you could tell me what usually triggers the message, i.e. which features of the script, server, or website cause it, that would be great too.

    Thanks

     
  • Nobody/Anonymous

    Hi!

    Did you try increasing the stream-timeout and connection-timeout? Some slow sites (or servers) don't respond within the default timeout settings, so maybe that's the reason.

    And did you take a look at the error-code ($DocInfo->error)?

    Also take a look at the FAQs (http://phpcrawl.cuab.de/faq.html, first point).

     
  • Nobody/Anonymous

    … sorry, it's $DocInfo->error_string, not $DocInfo->error.
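
For anyone landing here later, the two suggestions above (raise both timeouts, and print $DocInfo->error_string for documents that were not received) might look roughly like the sketch below. It is only a sketch under assumptions: the include path, the class name TimeoutAwareCrawler, the example URL, and the 20-second values are placeholders that do not come from this thread, so tune them for your own setup.

```php
<?php
// Assumption: PHPCrawl is unpacked in ./libs next to this script; adjust the path.
include("libs/PHPCrawler.class.php");

// Hypothetical subclass name, used only for this illustration.
class TimeoutAwareCrawler extends PHPCrawler
{
    // PHPCrawl calls this method for every document it requests.
    function handleDocumentInfo($DocInfo)
    {
        // Report only the documents whose content was not received,
        // together with the error description PHPCrawl recorded.
        if ($DocInfo->received == false) {
            echo $DocInfo->url . " not received: " . $DocInfo->error_string . "\n";
        }
    }
}

$crawler = new TimeoutAwareCrawler();
$crawler->setURL("http://www.example.com/"); // placeholder URL

// Timeouts in seconds. 20 is an arbitrary illustrative value, not one
// recommended anywhere in this thread.
$crawler->setConnectionTimeout(20);
$crawler->setStreamTimeout(20);

$crawler->go();
```

If the printed error strings still point at timeouts after raising both values, the target server may simply be slower than the limits you chose.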

     
  • Moin Haque

    Moin Haque - 2012-12-17

    "Did you try increasing the stream-timeout and connection-timeout?"

    Thanks, that did the trick.

     
  • Nobody/Anonymous

    Good to hear.

    Maybe the default stream- and connection-timeouts should be increased in the next version.

     
  • Anonymous

    Anonymous - 2015-07-15

    Do you remember how much you increased it?

     
  • Anonymous

    Anonymous - 2015-09-19

    Where can I find the crawled data on my system once crawling has finished?

     
  • Anonymous

    Anonymous - 2016-07-22

    I also can't find where the files are once the crawl is done. Kindly help me with this.
    Thank you.

     
