setUrlCacheType(PHPCrawlerUrlCacheTypes::URLCACHE_SQLITE);

Help
Anonymous
2015-08-18
2015-08-29
  • Anonymous

    Anonymous - 2015-08-18

    When I call setUrlCacheType(PHPCrawlerUrlCacheTypes::URLCACHE_SQLITE); I get:
    Abort reason: 1
    Links followed: 2
    Documents received: 1
    on the server; on localhost it works fine (abort reason 3).
    I added die() to handleDocumentInfo, and the SQLite files are created in the working directory (without die() they get removed).
    What am I doing wrong?
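For readers following the numbers in the report above, here is a minimal sketch that turns an abort-reason code into readable text. The code-to-meaning table is written from memory of the library's PHPCrawlerAbortReasons constants and is an assumption to verify against the class file in your PHPCrawl copy; ABORT_REASONS and describeAbortReason() are hypothetical names, not part of the library.

```php
<?php
// Abort-reason codes as recalled from PHPCrawlerAbortReasons.
// ASSUMPTION: check these values against your PHPCrawl version.
const ABORT_REASONS = [
    1 => 'passed through: no more links left to follow',
    2 => 'traffic limit reached',
    3 => 'file (page) limit reached',
    4 => 'aborted by user: handleDocumentInfo() returned a negative value',
];

// Hypothetical helper (not part of PHPCrawl): map a numeric code to text.
function describeAbortReason(int $code): string
{
    return ABORT_REASONS[$code] ?? "unknown abort reason ($code)";
}
```

If the table holds, "abort reason 1" would mean the crawler ran out of links rather than hit an error, which may matter when comparing the server and localhost runs.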

     
  • Anonymous

    Anonymous - 2015-08-19

    Next I removed die() from handleDocumentInfo and commented out these lines:
    // Free/unlock caches
    // $this->CookieCache->cleanup();
    // $this->LinkCache->cleanup();

    // Delete working-dir
    // PHPCrawlerUtils::rmDir($this->working_directory);

    Then I looked into urlcache.db3 and found this:
    "1","0","1","0","c92ece30d757716614bc3fd52d746923",,,,,"http://www.mysite/wtf","","0"
    "2","0","1","0","2272a3dc1e4910e50f62bde5872cc0c4","http://www.mysite/wtf/",,,"http://www.mysite/wtf","http://www.mysite/wtf/","1","0"
    (is_redirect_url = 1 for id 2)
    So I replaced
    $crawler->setURL("www.mysite/wtf");
    with
    $crawler->setURL("www.mysite/wtf/");

    and it's magic! Now I get the normal "abort reason 3"... I will try the next steps. Thank you for this wonderful library!

     

    Last edit: Anonymous 2015-08-19
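The fix described in this post (starting the crawl at the URL with a trailing slash so the first request is not answered with a redirect) could be automated with a small helper. This is only a sketch: ensureTrailingSlash() is a hypothetical function, not part of PHPCrawl, and its rule for what counts as a directory (last path segment contains no dot) is an assumption that will not fit every site.

```php
<?php
// Hypothetical helper (not part of PHPCrawl): append a trailing slash to a
// start URL whose path looks like a directory.
function ensureTrailingSlash(string $url): string
{
    // Leave URLs with a query string or fragment untouched.
    if (strpbrk($url, '?#') !== false) {
        return $url;
    }
    // Strip an optional scheme so only host and path are examined.
    $rest = preg_replace('#^[a-z][a-z0-9+.-]*://#i', '', $url);
    if (strpos($rest, '/') === false) {
        return $url . '/'; // host only, e.g. "www.mysite"
    }
    $lastSegment = substr($rest, strrpos($rest, '/') + 1);
    // Already ends with "/" (empty segment) or looks like a file (has a dot).
    if ($lastSegment === '' || strpos($lastSegment, '.') !== false) {
        return $url;
    }
    return $url . '/';
}
```

It would be used as $crawler->setURL(ensureTrailingSlash("www.mysite/wtf")); where $crawler is the crawler instance from the post above.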
  • Anonymous

    Anonymous - 2015-08-19

    It gets 100 pages (maybe it's ) and says "abort reason 1" again.
    So I added a counter and call die() every 20 pages... Now I get 192 pages and abort reason 1 again.
    Something is wrong.

     
  • Uwe Hunfeld

    Uwe Hunfeld - 2015-08-29

    Hi!

    Sorry for my late answer!

    Did you find the problem (or a solution for the problem) in the meantime?

    Could you please post the URL you are trying to crawl and where this problem occurs?

     
