
Multiprocess doesn't work

Help
Anonymous
2018-02-19
2018-03-04
  • Anonymous

    Anonymous - 2018-02-19

    This is my code. It doesn't work: when I run it, it returns nothing, no results and no errors.
    I can't find my mistake.
    Please help me.

    <?php
    include("libs/PHPCrawler.class.php");
    class MyCrawler extends PHPCrawler
    {
      function handleDocumentInfo($DocInfo)
      {
        if (PHP_SAPI == "cli") $lb = "\n";
        else $lb = "<br />";
        // Print the URL and the HTTP-status-Code
        echo "Page requested: ".$DocInfo->url." (".$DocInfo->http_status_code.")".$lb;
    // Print the referring URL
        echo "Referer-page: ".$DocInfo->referer_url.$lb;
    // Print whether the content of the document was received or not
        if ($DocInfo->received == true)
          echo "Content received: ".$DocInfo->bytes_received." bytes".$lb;
        else
          echo "Content not received".$lb;
        echo $lb;
        flush();
      }
    }
    $crawler = new MyCrawler();
    // URL to crawl (the entry page of the target site)
    $crawler->setURL("https://tangailpratidin.com/");
    // Only receive content of documents with content-type "text/html"
    $crawler->addReceiveContentType("#text/html#");
    // Ignore links to pictures, css-documents etc (prefilter)
    $crawler->addURLFilterRule("#\.(jpg|gif|png|pdf|jpeg|css|js)$# i");
    // That's it, start crawling using 5 processes
    $crawler->goMultiProcessed(5);
    // At the end, after the process is finished, we print a short
    // report (see method getReport() for more information)
    $report = $crawler->getProcessReport();
    if (PHP_SAPI == "cli") $lb = "\n";
    else $lb = "<br />";
    echo "Summary:".$lb;
    echo "Links followed: ".$report->links_followed.$lb;
    echo "Documents received: ".$report->files_received.$lb;
    echo "Bytes received: ".$report->bytes_received." bytes".$lb;
    echo "Process runtime: ".$report->process_runtime." sec".$lb;
    ?>
    

    Last edit: Anonymous 2018-02-19
  • Anonymous

    Anonymous - 2018-02-19

    Have you tried it without multiprocess?

    • Anonymous

      Anonymous - 2018-02-20

      Yes, but it takes a long time.

  • Anonymous

    Anonymous - 2018-02-20

    Do the multiprocess docs apply to you?

    1. Some PHP extensions are required to successfully run phpcrawl in multi-process mode (the PCNTL, SEMAPHORE and PDO extensions). For more details see the requirements page.
    2. The multi-process mode only works on unix/linux-based systems.
    3. Scripts using phpcrawl with multiple processes have to be run from the command line (PHP CLI).
    4. Increasing the number of processes to very high values doesn't automatically mean that the crawling process will go faster! The ideal number of processes depends on a lot of circumstances, such as the available bandwidth, the local technical environment (CPU), the delivery rate and data rate of the server hosting the target website, and so on.
      Something between 3 and 10 processes is a good value to start from.
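    Points 1 and 3 are easy to rule out from a shell. A minimal check, assuming the `php` CLI binary is on your PATH ("sysvsem" is the module name the SEMAPHORE extension reports in `php -m`):

    ```shell
    # List the loaded PHP CLI extensions and look for the ones
    # phpcrawl's multi-process mode needs.
    for ext in pcntl sysvsem pdo; do
      if php -m 2>/dev/null | grep -iq "^${ext}$"; then
        echo "${ext}: loaded"
      else
        echo "${ext}: missing"
      fi
    done

    # Multi-process mode also has to be started from the command
    # line, not through a web server (script name assumed):
    # php mycrawler.php
    ```

    If any of the three lines prints "missing", goMultiProcessed() cannot work on that machine.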
    • Anonymous

      Anonymous - 2018-03-04

      I have enabled the PDO extension in my cPanel, but I couldn't find the PCNTL and SEMAPHORE extensions. Please see this image.
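      For reference, shared cPanel hosts usually don't offer PCNTL at all, and even if they did, the script would still have to be started over SSH with the PHP CLI (point 3 of the docs quoted above). A quick way to check from a shell, assuming SSH access and `php` on the PATH (`pcntl_fork()` and `sem_get()` are the core functions those two extensions provide):

      ```shell
      # Prints bool(true)/bool(false) for the process-control and
      # semaphore functions that multi-process mode relies on.
      php -r 'var_dump(function_exists("pcntl_fork"), function_exists("sem_get"));'
      ```

      If either prints bool(false), multi-process mode won't run on that host.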


      Last edit: Anonymous 2018-03-05
