This is my code, but it does not work. When I run it, it returns nothing: no results and no errors.
I don't know where my mistake is.
Please help me.
<?php
include("libs/PHPCrawler.class.php");

class MyCrawler extends PHPCrawler
{
    function handleDocumentInfo($DocInfo)
    {
        if (PHP_SAPI == "cli") $lb = "\n";
        else $lb = "<br />";

        // Print the URL and the HTTP status code
        echo "Page requested: ".$DocInfo->url." (".$DocInfo->http_status_code.")".$lb;

        // Print the referring URL
        echo "Referer-page: ".$DocInfo->referer_url.$lb;

        // Print whether the content of the document was received or not
        if ($DocInfo->received == true)
            echo "Content received: ".$DocInfo->bytes_received." bytes".$lb;
        else
            echo "Content not received".$lb;

        echo $lb;
        flush();
    }
}

$crawler = new MyCrawler();

// URL to crawl (the entry page of the target website)
$crawler->setURL("https://tangailpratidin.com/");

// Only receive content of documents with content-type "text/html"
$crawler->addReceiveContentType("#text/html#");

// Ignore links to pictures, css-documents etc. (prefilter)
$crawler->addURLFilterRule("#\.(jpg|gif|png|pdf|jpeg|css|js)$# i");

// That's it, start crawling using 5 processes
$crawler->goMultiProcessed(5);

// At the end, after the process is finished, we print a short
// report (see method getProcessReport() for more information)
$report = $crawler->getProcessReport();

if (PHP_SAPI == "cli") $lb = "\n";
else $lb = "<br />";

echo "Summary:".$lb;
echo "Links followed: ".$report->links_followed.$lb;
echo "Documents received: ".$report->files_received.$lb;
echo "Bytes received: ".$report->bytes_received." bytes".$lb;
echo "Process runtime: ".$report->process_runtime." sec".$lb;
?>
Last edit: Anonymous 2018-02-19
Some PHP extensions are required to successfully run phpcrawl in multi-process mode (the PCNTL, SEMAPHORE and PDO extensions). For more details see the requirements page.
The multi-process mode only works on unix/linux-based systems.
Scripts using phpcrawl with multiple processes have to be run from the command line (PHP CLI).
Increasing the number of processes to very high values doesn't automatically mean that the crawling process will go faster! The ideal number of processes depends on a lot of circumstances, like the available bandwidth, the local technical environment (CPU), the delivery rate and data rate of the server hosting the target website, and so on.
Using something between 3 and 10 processes should be a good value to start from.
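Before switching to multi-process mode, it is worth verifying from the PHP CLI that the required extensions are actually loaded. A minimal sketch, assuming the usual PHP module names ("pcntl", "sysvsem" for the SEMAPHORE extension, and "pdo"):

```php
<?php
// Check that the extensions phpcrawl's multi-process mode needs are loaded.
// Note: "sysvsem" is the module name commonly used for the SEMAPHORE extension.
$required = array("pcntl", "sysvsem", "pdo");

foreach ($required as $ext) {
    if (extension_loaded($ext)) {
        echo $ext . ": loaded\n";
    } else {
        echo $ext . ": MISSING\n";
    }
}
?>
```

Run this with `php check_extensions.php` from the same CLI you intend to run the crawler with, since a webserver's PHP and the CLI PHP can use different configurations.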
Have you tried it without multiprocess?
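For comparison, a single-process run uses go() instead of goMultiProcessed(), which works without the PCNTL and SEMAPHORE extensions. A sketch, assuming the MyCrawler class from the code above is already defined:

```php
<?php
include("libs/PHPCrawler.class.php");

// The MyCrawler class from the original post is assumed to be defined here.
$crawler = new MyCrawler();
$crawler->setURL("https://tangailpratidin.com/");
$crawler->addReceiveContentType("#text/html#");

// go() runs the crawler in a single process; no PCNTL/SEMAPHORE needed,
// and it also works when run from a webserver rather than the CLI.
$crawler->go();

$report = $crawler->getProcessReport();
echo "Links followed: ".$report->links_followed."\n";
?>
```

If this single-process version produces output but goMultiProcessed() stays silent, the problem is most likely a missing extension or running the script outside the CLI.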
Yes, but it takes a long time.
Do the multiprocess docs apply to you?
Using something between 3 and 10 processes should be a good value to start from.
I have enabled the PDO extension in my cPanel, but I didn't find the PCNTL and SEMAPHORE extensions. Please see this image.
Last edit: Anonymous 2018-03-05