I am writing this down so that new users can easily test the Slashdot example.
It is not exactly the same as the previous example, but it works.
It prints the titles of the articles on the Slashdot front page.
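include 'simple_html_dom.php'; // assumed location of the library file
$url = 'https://slashdot.org/'; // assumed value; the original post does not show how $url is set
$html = file_get_html($url);
// Find 'a' tags with no class attribute inside 'span' tags with class 'story-title'
$articles = $html->find('span.story-title a[!class]'); // result is an array of node objects
foreach ($articles as $key => $article) {
    echo $key;
    echo $article->innertext . PHP_EOL; // echo inner text
}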
Here is the piece of HTML that it matches.
The outer 'span' tag has class 'story-title' and contains two 'a' tags. I do not want the second 'a' tag, which has class 'story-sourcelnk'.
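First 'a' tag (no class, matched), link text: The Disastrous Voyage of Satoshi, the World's First Cryptocurrency Cruise Ship
Second 'a' tag (class 'story-sourcelnk', skipped), target="_blank", href: https://www.theguardian.com/news/2021/sep/07/disastrous-voyage-satoshi-cryptocurrency-cruise-ship-seassteading, link text: (theguardian.com)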
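For anyone who wants to try the selection without fetching the live page, here is a rough sketch using PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM. It is only an illustration under assumptions: the fragment is a hypothetical reconstruction of the structure described above (both href values are placeholders), storyTitles() is a made-up helper name, and the XPath expression is an assumed equivalent of 'span.story-title a[!class]'.

```php
<?php
// Hypothetical fragment: a 'span.story-title' holding two links.
// Both href values are placeholders, not the real Slashdot markup.
$fragment = <<<'HTML'
<span class="story-title">
  <a href="#">The Disastrous Voyage of Satoshi, the World's First Cryptocurrency Cruise Ship</a>
  <a class="story-sourcelnk" href="https://www.theguardian.com/" target="_blank"> (theguardian.com)</a>
</span>
HTML;

// Returns the text of every 'a' without a class attribute
// inside a 'span' whose class list contains 'story-title'.
function storyTitles(string $html): array
{
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    // Assumed XPath equivalent of the selector 'span.story-title a[!class]'
    $query = '//span[contains(concat(" ", normalize-space(@class), " "), " story-title ")]'
           . '//a[not(@class)]';

    $titles = [];
    foreach ($xpath->query($query) as $node) {
        $titles[] = trim($node->textContent);
    }
    return $titles;
}

foreach (storyTitles($fragment) as $key => $title) {
    echo $key, ' ', $title, PHP_EOL;
}
```

Only the first link should be listed, because the second one carries the class 'story-sourcelnk' and is excluded by not(@class).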
Last edit: ignasi tort 2021-09-08
Thanks for the feedback!
The example in 1.9 is probably not functional anymore, but there is an updated version in the current master that still works. Here is the link for future reference:
https://sourceforge.net/p/simplehtmldom/repository/ci/master/tree/example/scraping/example_scraping_slashdot.php