#115 Problem scraping web pages with a loop

closed
None
2018-12-07
2012-09-18
No

Hi, I have a problem trying to scrape a website that uses pagination.
for ($i = 20; $i <= 60; $i += 20) {
    $html = file_get_html('http://tesis.pucp.edu.pe/repositorio/browse?order=ASC&rpp=20&sort_by=2&etal=-1&offset='.$i.'&type=dateissued');
    echo $i;
    echo "<BR>------------------------<BR>";
    echo $html->plaintext;
    echo "<BR>------------------------<BR>";
    unset($html);
}

As you can see, $i changes the URL on each iteration, so the plaintext should change too. However, every iteration shows the same first page even though the URL has changed. Why?

Discussion

  • John Schlick - 2012-12-30

    Without looking into this much...

    Please try:
    for...
    $url = "http..." . $i . ...
    echo $url;
    $html = file_get_html($url);

    This way we can SEE the EXACT URL that is causing you problems, and deal with THAT URL.

    Also, if you download from the svn repository, there is an examples/scraping directory containing an example_scraping_general.php file. It lets you run simple_html_dom.php on its own, away from any other code you may have, to see whether it works and parses the specific page that is giving you trouble.
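
The suggestion above (build the URL in a variable, print it, then fetch it) might look roughly like the sketch below. The helper names build_page_url and build_page_urls are hypothetical, not part of simple_html_dom. The fetch step runs only if simple_html_dom.php has already been included. The md5 comparison is just one assumed way to tell whether each offset really returns different content, not a diagnosis of the original problem.

```php
<?php
// Hypothetical helper: builds the browse URL for one page offset.
// The query parameters are copied from the URL in the original post.
function build_page_url(int $offset): string
{
    $base = 'http://tesis.pucp.edu.pe/repositorio/browse';
    $params = [
        'order'   => 'ASC',
        'rpp'     => 20,
        'sort_by' => 2,
        'etal'    => -1,
        'offset'  => $offset,
        'type'    => 'dateissued',
    ];
    // Pass '&' explicitly so the separator does not depend on php.ini.
    return $base . '?' . http_build_query($params, '', '&');
}

// Hypothetical helper: one URL per offset (20, 40, 60 by default).
function build_page_urls(int $first = 20, int $last = 60, int $step = 20): array
{
    $urls = [];
    for ($i = $first; $i <= $last; $i += $step) {
        $urls[] = build_page_url($i);
    }
    return $urls;
}

foreach (build_page_urls() as $url) {
    // Print the exact URL first so it can be checked in a browser.
    echo $url, "\n";

    // Fetch only when simple_html_dom.php has been loaded beforehand.
    if (function_exists('file_get_html')) {
        $html = file_get_html($url);
        if ($html === false) {
            echo "fetch failed\n";
            continue;
        }
        // Identical hashes across offsets would mean the same content came back.
        echo md5($html->plaintext), "\n";
        $html->clear();
        unset($html);
    }
}
```

If the printed URLs differ but the hashes match, opening each printed URL in a browser is a quick way to check whether the site itself returns the same page for every offset.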

     
  • LogMANOriginal - 2018-12-07
    • status: open --> closed
    • assigned_to: LogMANOriginal
     
  • LogMANOriginal - 2018-12-07

    Closing because there was no reply from the author.

     