Not all pages are crawled and/or not all links are found

Help
Anonymous
2015-03-26
2015-03-27
  • Anonymous

    Anonymous - 2015-03-26

    Hi!
    I would like to use your library to build a broken link checker. But when I collect all links for http://nvvk.eu/, I end up with only 625 links to check, whereas deadlinkchecker.com found 1747 links for the same website.

     
  • Anonymous

    Anonymous - 2015-03-26

    Hi!

    Do you have an example of a URL that phpcrawl didn't find and deadlinkchecker did?

    Maybe deadlinkchecker also looks for image links and CSS and JS URLs, and you excluded them in phpcrawl?

    Just guessing.

     
  • Anonymous

    Anonymous - 2015-03-26

    I don't exclude images, if you mean via $crawler->addURLFilterRule(). My app definitely checks images.

     
  • Uwe Hunfeld

    Uwe Hunfeld - 2015-03-26

    What about external URLs?

    How do you check the links?

     
  • Anonymous

    Anonymous - 2015-03-26

    I get $DocInfo->links_found for every page and collect all links in a global array.
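
A minimal sketch of that collection step, for readers following along. It assumes each entry of $DocInfo->links_found is an array with a "url_rebuild" key (as in PHPCrawl 0.8x). The helper name collectLinks() and the sample data are made up for illustration and are not part of the library.

```php
<?php
// Merge the links of one crawled page into a global, de-duplicated list.
// $linksFound mimics $DocInfo->links_found (assumed shape: list of arrays
// that each carry a 'url_rebuild' key with the absolute URL).
function collectLinks(array $linksFound, array &$allLinks)
{
    foreach ($linksFound as $link) {
        // Using the URL as the array key removes duplicates for free.
        $allLinks[$link['url_rebuild']] = true;
    }
}

// Hypothetical sample data standing in for two crawled pages.
$page1 = array(
    array('url_rebuild' => 'http://example.com/a'),
    array('url_rebuild' => 'http://example.com/b'),
);
$page2 = array(
    array('url_rebuild' => 'http://example.com/b'),
    array('url_rebuild' => 'http://example.com/c'),
);

$allLinks = array();
collectLinks($page1, $allLinks);
collectLinks($page2, $allLinks);

echo count($allLinks), "\n"; // prints 3
```

In a real crawler, collectLinks($DocInfo->links_found, $allLinks) would be called from handleDocumentInfo(). Counting unique URLs this way also makes the total easier to compare with what another checker reports.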

     
  • Anonymous

    Anonymous - 2015-03-26

    OK, sounds right, strange.

    I'll give it a test tomorrow (or the day after).

    Again: do you know of any URL that deadlinkchecker finds on that site and phpcrawl doesn't?
    Without one, it gets difficult to track this down.

     
  • Anonymous

    Anonymous - 2015-03-26

    When I run /test_interface/index.php, all URLs get requested.
    When I run example.php, only some of the URLs get requested.

    I tried to work out which settings differ, but had no success.
    Any ideas?

     
  • Anonymous

    Anonymous - 2015-03-26

    For example, the test interface finds this link: http://www.nvvk.eu/schuld-hulpverlening/ledenoverzicht/lid/27, but example.php doesn't.
    I believe I am doing something wrong with the settings, but I can't find the mistake by myself.

     
  • Anonymous

    Anonymous - 2015-03-27

    It seems solved.
    Great library. Thank you very much!

     
  • Uwe Hunfeld

    Uwe Hunfeld - 2015-03-27

    Ok, great.

    What was the problem?

     
  • Anonymous

    Anonymous - 2015-03-27

    It's very simple. I just removed the traffic limit.
    Thanks again.
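
For anyone landing here with the same symptom, a rough sketch of what "removing the traffic limit" could look like. It assumes PHPCrawl 0.8x unpacked in ./libs/ (the layout the bundled example.php uses) and that example.php caps the crawl through a setTrafficLimit() call, which is the likely reason it stopped early. The class name LinkCollector and the helper stopReason() are hypothetical; the two report flags are assumed to be the ones PHPCrawl's process report exposes.

```php
<?php
// Sketch only, under the assumptions stated above.

// Turn the limit flags of a process report into a readable message.
function stopReason($report)
{
    if ($report->traffic_limit_reached) {
        return 'traffic limit reached, crawl incomplete';
    }
    if ($report->file_limit_reached) {
        return 'page limit reached, crawl incomplete';
    }
    return 'no limit reached';
}

// Run the crawl only when the library is actually present.
if (is_file('libs/PHPCrawler.class.php')) {
    require_once 'libs/PHPCrawler.class.php';

    class LinkCollector extends PHPCrawler
    {
        public $collectedLinks = array();

        // Called by PHPCrawl once for every document it receives.
        function handleDocumentInfo($DocInfo)
        {
            foreach ($DocInfo->links_found as $link) {
                $this->collectedLinks[$link['url_rebuild']] = true;
            }
        }
    }

    $crawler = new LinkCollector();
    $crawler->setURL('http://nvvk.eu/');
    $crawler->addContentTypeReceiveRule('#text/html#');
    // Deliberately no $crawler->setTrafficLimit(...) call here, so the
    // crawl is not stopped after a fixed number of bytes.
    $crawler->go();

    echo count($crawler->collectedLinks), " unique links\n";
    echo stopReason($crawler->getProcessReport()), "\n";
}
```

Checking the process report after go() is a cheap way to notice that a crawl was cut short by a limit rather than finishing on its own.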

     

    Last edit: Anonymous 2015-03-27
