#88 HTML parsing fails "failed to open stream:"..

closed
2018-12-08
2011-11-07
jobe lawn
No

Hi there. I'm using the simplehtmldom_1_5-1.zip package, which reports @version 1.11 ($Rev: 184 $). Isn't it supposed to be 1.5?

I'm having trouble opening a certain URL that opens in every browser. Other sites open correctly.

include_once('simple_html_dom.php');
$html = file_get_html("www.telvis.fi/lite/?vw=channel&sh=new&ch=tv2");
foreach($html->find('table tr [class=zeb]') as $d){
echo $d->plaintext;
return;
}

Also, PHP 5.3 for Windows ("VC9 x86 Thread Safe (2011-Aug-23 12:01:10)") on Apache 2.2x complains that "mb_detect_encoding" is undefined.

Discussion

  • jobe lawn

    jobe lawn - 2011-11-07

    This works ok.

    $html = file_get_html("http://m.yle.fi/w/etusivu");

    foreach($html->find(' div[id=page-content]') as $d){
    echo $d->plaintext;
    return;
    }

     
  • jobe lawn

    jobe lawn - 2011-11-07

    When I write the site's source into an HTML file and then open it the same way, $html = file_get_html("file.html"), parsing works. The error when opening the site over HTTP is: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in C:\simple_html_dom.php on line 39

     
  • jobe lawn

    jobe lawn - 2011-11-09

    I got this to work. I have to download the page contents myself first:

    $opts = array('http' => array(
        'method' => "GET",
        'header' => "Accept-language: en\r\n" . "User-Agent: not for you\r\n",
    ));
    $context = stream_context_create($opts);
    $url = "http://www.telvis.fi/lite/?vw=channel&sh=new&ch=tv1";
    $file = file_get_contents($url, false, $context);

     
  • LogMANOriginal

    LogMANOriginal - 2018-12-08

    Thanks for opening this issue!

    file_get_html is most useful for local files. Remote files should be loaded manually and provided to str_get_html instead.

    That being said, file_get_html can handle URLs only if they start with "http" or "https" (as you have figured out yourself). You can check whether a URL has the scheme defined using parse_url.

    Note: There is no way for simple_html_dom to clearly distinguish between local files and remote URLs, so this has to be done by the caller.

     
  • LogMANOriginal

    LogMANOriginal - 2018-12-08
    • status: open --> closed
    • assigned_to: S. C. Chen --> LogMANOriginal
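
The advice in the last comment (check the scheme with parse_url, fetch remote pages yourself, then hand the string to str_get_html) could look roughly like the sketch below. The helper names has_http_scheme and fetch_remote and the "example-client" User-Agent value are assumptions made for illustration; they are not part of simple_html_dom. The request headers follow the pattern from the 2011-11-09 comment.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): returns true only when
// the URL carries an explicit http or https scheme.
function has_http_scheme($url)
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    // parse_url() yields null when the component is missing and false on
    // seriously malformed input; neither is a usable scheme.
    if (!is_string($scheme)) {
        return false;
    }
    return in_array(strtolower($scheme), array('http', 'https'), true);
}

// Hypothetical helper: fetches a remote page as a string, or returns false
// when the URL has no http/https scheme or the request fails.
function fetch_remote($url)
{
    if (!has_http_scheme($url)) {
        return false;
    }
    $context = stream_context_create(array('http' => array(
        'method' => 'GET',
        // The User-Agent value is a placeholder chosen for this sketch.
        'header' => "Accept-language: en\r\nUser-Agent: example-client\r\n",
    )));
    // @ suppresses the warning; failure is reported through the return value.
    return @file_get_contents($url, false, $context);
}

// Usage, assuming simple_html_dom.php has been included:
// $source = fetch_remote('http://www.telvis.fi/lite/?vw=channel&sh=new&ch=tv1');
// if ($source !== false) {
//     $html = str_get_html($source);
// }
```

With this check, the URL from the original report ("www.telvis.fi/lite/?vw=channel&sh=new&ch=tv2", no scheme) is rejected before any request is made, so the caller can decide whether to prepend "http://" or treat the string as a local path.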
     
