I want to find all broken links (404s) on a site. With setFollowMode(0), PHPCrawl can find broken links to external sites, but it will then also crawl those external sites. With a follow mode greater than 0, it will not find broken links to external sites.
How can I get PHPCrawl to find all 404s, including broken links to external sites, but not crawl through external sites?
Last edit: Anonymous 2016-11-11
I think you're looking for the status code.
$DocInfo->http_status_code
It is used in the basic example page:
http://phpcrawl.cuab.de/example.html
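To connect that status code to the original question, here is a rough sketch, not a tested solution. The idea is to keep the crawl on the starting host with setFollowMode(2), record every crawled page whose $DocInfo->http_status_code is 404, and check external links separately with a headers-only cURL request instead of letting PHPCrawl fetch them. The helpers isExternalUrl() and fetchStatusCode(), the class name BrokenLinkCrawler, its properties, the include path, and www.example.com are all made up for illustration. The use of $DocInfo->links_found_url_descriptors and its url_rebuild field is from memory of the PHPCrawl class reference, so please verify it against your version.

```php
<?php
// Hypothetical include path; adjust it to wherever PHPCrawl is installed.
// require_once 'libs/PHPCrawler.class.php';

// Hypothetical helper: true if $url has a host and it differs from $baseHost.
function isExternalUrl(string $url, string $baseHost): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && strcasecmp($host, $baseHost) !== 0;
}

// Hypothetical helper: requests only the headers of $url and returns the
// HTTP status code, or 0 if the request failed entirely.
function fetchStatusCode(string $url): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD-style request, no body download
        CURLOPT_FOLLOWLOCATION => true,  // report the status after redirects
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Only define and run the crawler if PHPCrawl has actually been loaded.
if (class_exists('PHPCrawler')) {
    class BrokenLinkCrawler extends PHPCrawler
    {
        public $baseHost = '';   // host of the site being checked
        public $broken = [];     // broken URL => page that linked to it
        private $checked = [];   // external URLs already tested

        public function handleDocumentInfo($DocInfo)
        {
            // Internal pages: PHPCrawl requested them, so the code is known.
            if ((int) $DocInfo->http_status_code === 404) {
                $this->broken[$DocInfo->url] = $DocInfo->referer_url;
            }

            // External links: test each one once, without crawling it.
            // Assumption: links_found_url_descriptors holds objects with url_rebuild.
            foreach ($DocInfo->links_found_url_descriptors as $link) {
                $url = $link->url_rebuild;
                if (isset($this->checked[$url]) || !isExternalUrl($url, $this->baseHost)) {
                    continue;
                }
                $this->checked[$url] = true;
                if (fetchStatusCode($url) === 404) {
                    $this->broken[$url] = $DocInfo->url;
                }
            }
        }
    }

    $crawler = new BrokenLinkCrawler();
    $crawler->baseHost = 'www.example.com';
    $crawler->setURL('http://www.example.com/');
    $crawler->setFollowMode(2);  // follow only links on the same host
    $crawler->go();
    print_r($crawler->broken);
}
```

Some servers answer headers-only requests differently from normal GET requests, so a 404 reported this way may be worth re-checking with a full request before treating the link as broken.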