#17 can I fail gracefully on file_get_html

v1.0_(example)
closed
None
1
2019-04-20
2012-11-08
Ken Irwin
No

Hi folks,

I've been using the Simple HTML DOM Parser a lot lately and it's great. But I have a problem I don't know how to solve. Sometimes when I go to file_get_html() a file, the whole script hangs and ends abruptly with no error messages/codes. I'd like to figure out how to fail more gracefully.

Here's an example:
http://www6.wittenberg.edu/lib/ken/c4l/domfail/?url=good
http://www6.wittenberg.edu/lib/ken/c4l/domfail/?url=bad

On the "good" file (22k), it gets the html just fine and reports that it has done a successful $html->find() operation.

On the "bad" file (280k), it just dies, without reporting even that it was unable to complete the file_get_html.

======
$html = file_get_html($url);

if ($html) {
    print "File gotten\n";
    // ($holdings = $html->find($innreach['holdings_selector'])) || $size = -1;
    // Quote the array key: a bare holdings_selector is treated as a constant.
    $holdings = $html->find($innreach['holdings_selector']);
    if ($holdings) {
        print "find() successful\n";
    } else {
        print "find() command failed\n";
    }
} else {
    print "can't get url\n";
}
====

If you go to the URL, you can actually download the whole bit of code, the two files, and the current version of the simple_html_dom.php that I'm using.

I just want to know: if file_get_html() fails, is there a way to say "it failed! do something else!"

I think this should be simple, but I don't know how.

Thanks
Ken
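
One possible way to get an "it failed! do something else!" branch is to fetch the raw HTML first and only hand it to the parser when it is non-empty and under a size budget. The following is a minimal sketch, not the library's own method: parse_if_small() and $maxBytes are hypothetical names, and the parser is passed in as a callable so the guard itself does not depend on simple_html_dom.php. Byte size is only a rough proxy for how much memory parsing will need.

```php
<?php
// Sketch only: parse_if_small() and $maxBytes are hypothetical, not part of
// Simple HTML DOM. The parser is injected so the guard works on its own.

/**
 * Parse $raw only when it is non-empty and no larger than $maxBytes.
 * Returns the parser's result, or false so the caller can fall back.
 */
function parse_if_small($raw, int $maxBytes, callable $parser)
{
    // file_get_contents() returns false on failure; treat empty input the same.
    if ($raw === false || $raw === '') {
        return false;
    }
    // Refuse oversized documents instead of risking a fatal error mid-parse.
    if (strlen($raw) > $maxBytes) {
        return false;
    }
    $dom = $parser($raw);
    // Normalise any falsy parser result to false for a single failure check.
    return $dom ? $dom : false;
}

// Possible usage with the library loaded (the 100000-byte budget is an
// arbitrary illustration, not a recommended value):
// $raw  = @file_get_contents($url);
// $html = parse_if_small($raw, 100000, 'str_get_html');
// if ($html === false) { print "can't parse url, doing something else\n"; }
```

With this shape, the existing if ($html) ... else ... structure keeps working, and the else branch is reached for unreachable, empty, or oversized pages.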

Discussion

  • Ken Irwin

    Ken Irwin - 2012-11-09

    I previously and incorrectly said that this resulted in failure without an error. It turns out that when I switched from E_WARNING to E_ALL I get this:
    Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 16 bytes) in /var/www/docs/lib/ken/c4l/domfail/simple_html_dom.php on line 1551

    How can I avoid this? Thanks!
    Ken
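
A fatal memory error like the one quoted above cannot be handled with an ordinary if/else, but a shutdown function still runs after it, so the script can at least print a friendly message. The following is a sketch under that assumption; is_memory_exhaustion() is a hypothetical helper name, and the wording of the printed message is made up for illustration.

```php
<?php
// Sketch only: is_memory_exhaustion() is a hypothetical helper name.

/** True when an error_get_last() array describes an exhausted memory limit. */
function is_memory_exhaustion(?array $error): bool
{
    return $error !== null
        && $error['type'] === E_ERROR
        && strpos($error['message'], 'Allowed memory size') === 0;
}

// A fatal error skips the rest of the script, but shutdown functions still run.
register_shutdown_function(function () {
    $error = error_get_last();
    if (is_memory_exhaustion($error)) {
        // Keep this branch cheap: very little memory is left at this point.
        print "Could not parse the page: PHP ran out of memory.\n";
    }
});
```

Register the handler before calling file_get_html(). It reports the failure rather than recovering from it, so any real fallback work should stay small.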

     
  • LogMANOriginal

    LogMANOriginal - 2019-04-20
    • status: open --> closed
    • assigned_to: LogMANOriginal
    • discussion: enabled --> disabled
     
  • LogMANOriginal

    LogMANOriginal - 2019-04-20

    Closing for two reasons

    1) As mentioned above, the root issue was clear after changing error logging to E_ALL, resulting in "Allowed memory size of 33554432 bytes exhausted (tried to allocate 16 bytes) in /var/www/docs/lib/ken/c4l/domfail/simple_html_dom.php on line 1551".

    This can be fixed by increasing the amount of memory for PHP.

    Please note that the script can still fail at any time if the source data is too complex and requires more memory than is available.

    2) Performance and memory usage have been improved since this ticket was opened, and limits were put in place to somewhat reduce the chance of hangs.

    Please don't hesitate to open a new ticket for further discussion.
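
As an illustration of the first point, the 33554432 bytes in the error message correspond to a memory_limit of 32M. A minimal sketch of raising the limit at runtime follows; shorthand_to_bytes() and raise_memory_limit() are hypothetical helper names, the 128M target is an arbitrary example, and some hosts may not allow the setting to be changed from a script.

```php
<?php
// Sketch only: both helpers are hypothetical names, not part of any library.

/** Convert a php.ini size such as "32M" to bytes; "-1" (no limit) stays -1. */
function shorthand_to_bytes(string $value): int
{
    $value  = trim($value);
    $number = (int) $value;                      // "32M" casts to 32
    $unit   = strtoupper(substr($value, -1));    // trailing K, M or G, if any
    switch ($unit) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
        default:  return $number;
    }
}

/** Raise memory_limit to $target only when the current limit is lower. */
function raise_memory_limit(string $target): bool
{
    $current = shorthand_to_bytes((string) ini_get('memory_limit'));
    if ($current === -1 || $current >= shorthand_to_bytes($target)) {
        return true; // already unlimited or high enough
    }
    // ini_set() returns false when the setting could not be changed.
    return ini_set('memory_limit', $target) !== false;
}

// Possible usage before parsing a large page:
// raise_memory_limit('128M');
```

Raising the limit only moves the ceiling, so a sufficiently large or complex page can still exhaust it.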

     