When I set setUrlCacheType(PHPCrawlerUrlCacheTypes::URLCACHE_SQLITE); I get:
Abort reason: 1
Links followed: 2
Documents received: 1
on the server. On localhost it works fine (abort reason 3).
I added die() to handleDocumentInfo, and then the SQLite files are created in the working directory (without die() they get removed).
What am I doing wrong?
Next I removed die() from handleDocumentInfo and commented out these lines:
// Free/unlock caches
// $this->CookieCache->cleanup();
// $this->LinkCache->cleanup();
// Delete working-dir
// PHPCrawlerUtils::rmDir($this->working_directory);
Then I looked into urlcache.db3 and found this:
"1","0","1","0","c92ece30d757716614bc3fd52d746923",,,,,"http://www.mysite/wtf","","0"
"2","0","1","0","2272a3dc1e4910e50f62bde5872cc0c4","http://www.mysite/wtf/",,,"http://www.mysite/wtf","http://www.mysite/wtf/","1","0"
(is_redirect_url = 1 for id 2)
So I replaced
$crawler->setURL("www.mysite/wtf");
with
$crawler->setURL("www.mysite/wtf/");
and it's magic! I got the normal "abort reason 3"... I will try the next steps. Thank you for this wonderful library!
Last edit: Anonymous 2015-08-19
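The fix described above (giving setURL() the start URL with its trailing slash, so the first request is not answered with a redirect) could be automated with a small helper. This is only a sketch: withTrailingSlash() is a hypothetical function, not part of PHPCrawl, and it assumes that a last path segment without a dot is a directory, which will not hold for every site.

```php
<?php
// Hypothetical helper (not part of PHPCrawl): appends a trailing slash to a
// start URL whose last path segment looks like a directory (contains no dot).
function withTrailingSlash(string $url): string
{
    // Leave URLs with a query string or fragment untouched.
    if (strpbrk($url, '?#') !== false) {
        return $url;
    }

    // Drop the scheme, if any, so "//" is not mistaken for a path separator.
    $rest = preg_replace('#^[a-z][a-z0-9+.-]*://#i', '', $url);

    // No slash after the host means a bare host name: add the root slash.
    $slashPos = strpos($rest, '/');
    if ($slashPos === false) {
        return $url . '/';
    }

    $path = substr($rest, $slashPos);
    if (substr($path, -1) === '/') {
        return $url; // already ends with a slash
    }

    // Assumption: a dot in the last segment means a file such as page.html.
    $lastSegment = substr($path, strrpos($path, '/') + 1);
    if (strpos($lastSegment, '.') !== false) {
        return $url;
    }

    return $url . '/';
}
```

The result would then be passed on as before, for example $crawler->setURL(withTrailingSlash("www.mysite/wtf"));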
Now it gets 100 pages (maybe it's ) and says "abort reason 1" again.
So I added a counter and call die() every 20 pages... now I get 192 pages and abort reason 1 again.
Something is wrong.
Hi!
Sorry for my late answer!
Did you find the problem (or a solution for the problem) in the meantime?
Could you please post the URL you are trying to crawl and where this problem occurs?