Content not received

Help
Moin Haque
2012-12-13
2016-07-22
  • Moin Haque

    Moin Haque - 2012-12-13

    Hi all

    I am encountering a problem using the crawler where for some sites, I get loads of the 'content not received' error message. Here are a couple of examples and some stats:

    http://store.makro.co.uk/
    Links followed: 4683, Content not received: 3823

    http://www.brakesce.co.uk/
    Links followed: 2429, Content not received: 2164

    I was wondering if anyone has any idea how to stop this from happening. I have crawled other sites using the same script/server, and it's been fine. Also, if you could tell me what usually triggers the message, i.e. which features of the script / server / website cause it, that would be great too.

    Thanks

     
  • Nobody/Anonymous

    Hi!

    Did you try increasing the stream-timeout and connection-timeout? Some slow sites (or servers) don't respond within the default timeout settings; maybe that's the reason.

    And did you take a look at the error code ($DocInfo->error)?

    Also take a look at the FAQs (http://phpcrawl.cuab.de/faq.html, first point).

     
  • Nobody/Anonymous

    … sorry, it's $DocInfo->error_string, not $DocInfo->error.

     
  • Moin Haque

    Moin Haque - 2012-12-17

    "Did you try to increase the stream-timeout and connection-timeout?"

    Thanks, that did the trick.
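
    For anyone landing on this thread later, here is a minimal sketch of what the suggestion above can look like in a PHPCrawl script. The include path, the class name `TimeoutAwareCrawler`, the start URL, and the 20-second values are illustrative assumptions, not the settings actually used in this thread; the right timeouts depend on how slow the target server is.

    ```php
    <?php
    // Assumed path: adjust to wherever PHPCrawl is unpacked on your system.
    include "libs/PHPCrawler.class.php";

    // Hypothetical subclass name, used only for this sketch.
    class TimeoutAwareCrawler extends PHPCrawler
    {
        // PHPCrawl calls this method once for every requested document.
        function handleDocumentInfo($DocInfo)
        {
            if ($DocInfo->received == false) {
                // error_string carries the reason reported by the crawler
                // (for example, a timeout), as mentioned earlier in the thread.
                echo $DocInfo->url . " - content not received: "
                    . $DocInfo->error_string . "\n";
            }
        }
    }

    $crawler = new TimeoutAwareCrawler();
    $crawler->setURL("http://www.example.com/"); // placeholder start URL

    // Illustrative values in seconds; raise them for slow sites.
    $crawler->setConnectionTimeout(20);
    $crawler->setStreamTimeout(20);

    $crawler->go();
    ```

    Printing `error_string` for each failed document makes it easier to tell whether the failures really are timeouts before settling on a value.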

     
  • Nobody/Anonymous

    Good to hear.

    Maybe the default stream and connection timeouts should be increased in the next version.

     
  • Anonymous

    Anonymous - 2015-07-15

    Do you remember how much you increased it by?

     
  • Anonymous

    Anonymous - 2015-09-19

    Where can I find the crawled data on my system once crawling has finished?

     
  • Anonymous

    Anonymous - 2016-07-22

    I also can't find where the files are once the crawl is done. Kindly help me with this.
    Thank you.

     

