#8 Very weird problem


I'm using simple_html_dom to parse HTML pages whose URLs I found in an RSS feed.
I'm using SimplePie to parse the RSS.
So I'm using the method

$item_url = $item->get_permalink();

from the SimplePie API to get the page URL, and then calling

$remote_html = new simple_html_dom();
$remote_html->load_file( $item_url );

simple_html_dom gets the HTML from the page, but some tags are missing.
When I enter the URL hardcoded, I don't have this problem.
I tried printing the URL and it is identical in both situations.
I also tried the non-object-oriented way, but the result is still the same.
I also tried fetching the page with cURL and then parsing the HTML string, but that didn't help either.
Is this a simple_html_dom problem or PHP's?


  • John Schlick

    I can't tell if this is a simple_html_dom issue or not.
    Please give me the URL of the page in question.

    There are times when file_get_contents fails, especially on some URLs with redirects. That is not a simple_html_dom problem.

    Also, what do you mean by "some tags are missing"? It's also common for some file_get_contents calls to fail if you don't include POST data so that the site knows exactly who you are or what you want. That would also not be simple_html_dom's problem.

    The first thing to do is to verify that file_get_contents on its own gets you the file with all the tags you think it should have (as in, make that call and then see what the results are before trying to create the DOM).
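
    A rough sketch of that check, in case it helps. The two helper functions are made up for illustration and are not part of simple_html_dom or SimplePie, and the URL in the usage comments is a placeholder. The first counts a tag in the raw HTML before any DOM is built. The second dumps a URL's bytes, on the assumption that two URLs which print identically might still differ in something invisible, such as trailing whitespace or "&amp;" in place of "&".

```php
<?php
// Hypothetical helper (not part of simple_html_dom): counts opening tags
// named $tag in a raw HTML string, case-insensitively.
function count_open_tags(string $html, string $tag): int
{
    return (int) preg_match_all('/<' . preg_quote($tag, '/') . '\b/i', $html);
}

// Hypothetical helper: reports a URL's length and its bytes in hex so that
// invisible differences between two URLs become visible.
function describe_url(string $url): string
{
    return strlen($url) . ' bytes: ' . bin2hex($url);
}

// Usage sketch. $item_url would come from $item->get_permalink() as in the
// question; the hardcoded URL below is only a placeholder.
// $raw = file_get_contents($item_url);
// var_dump($raw === false ? 'fetch failed' : count_open_tags($raw, 'div'));
// echo describe_url($item_url), "\n";
// echo describe_url('http://example.com/page'), "\n";
```

    If the tag counts differ between the hardcoded URL and the one from the feed, the difference is in the fetch (or in the URL itself) and the parser is probably not at fault.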

    Also, I will note that you are calling load_file, which I personally don't do (not that it's invalid). I use file_get_html or str_get_html as appropriate.