
#79 tbody doesn't work in find()

closed
None
2020-05-21
2011-08-26
Marc Gagnon
No

I was running tests and ran into problems using the tbody element as a selector in find(). Looking through the code, find() has a specific check for tbody. I can comment out that line, but I'm wondering why there is a specific check for that one element.

Discussion

  • Anonymous

    Anonymous - 2011-09-01

    I can confirm this bug. It leads to unexpected behavior, for example when selecting TR tags inside the TBODY section: TRs from THEAD also get included (which I did not want, and before finding this bug report I could not understand why it was happening). Commenting out the mentioned line fixed the problem for me (thank you, gagnon).

    Please fix this bug.

     
  • Anonymous

    Anonymous - 2011-09-28

    Steps to reproduce the bug:

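    <?php
    include 'simple_html_dom.php';

    // The table markup was stripped from this comment; the rows below are
    // reconstructed from the thread (a THEAD row followed by a TBODY row).
    $html = str_get_html('
    <table>
      <thead>
        <tr><td>THIS IS THE WRONG ONE</td></tr>
      </thead>
      <tbody>
        <tr><td>THIS ONE IS CORRECT</td></tr>
      </tbody>
    </table>
    ');

    echo $html->find('table tbody tr', 0)->innertext;
    ?>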

    system$ php domtest.php
    THIS IS THE WRONG ONE

     
  • Anonymous

    Anonymous - 2011-09-28

    Some remarks:

    I can confirm it has something to do with line 651: if ($m[1]==='tbody') continue;
    Commenting this line out solves my problem.

    This line of code first appears in r174, but I cannot find anything related in the changelog: http://simplehtmldom.svn.sourceforge.net/viewvc/simplehtmldom/trunk/change_log.txt?r1=173&r2=174

    The comment on that line says: // for browser generated xpath
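The effect of skipping the tbody step can be sketched without the library. This is only an illustration, assuming PHP's bundled DOM extension (DOMDocument and DOMXPath) as a stand-in for find(); the sample table and variable names are made up for the example. Dropping the tbody step turns "rows inside tbody" into "any row in the table", so the THEAD row comes first.

```php
<?php
// Illustrative table: one row in THEAD, one row in TBODY.
$markup = '<html><body><table>'
    . '<thead><tr><td>THIS IS THE WRONG ONE</td></tr></thead>'
    . '<tbody><tr><td>THIS ONE IS CORRECT</td></tr></tbody>'
    . '</table></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Selector as written by the user: rows below a tbody inside a table.
$withTbody = $xpath->query('//table//tbody//tr');

// Same selector with the tbody step skipped: any row inside a table.
$withoutTbody = $xpath->query('//table//tr');

$firstWith = trim($withTbody->item(0)->textContent);
$firstWithout = trim($withoutTbody->item(0)->textContent);

echo "with tbody step:    $firstWith\n";
echo "without tbody step: $firstWithout\n";
```

With the step skipped, index 0 points at the THEAD row, which matches the symptom reported above.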

     

    Last edit: Anonymous 2015-08-15
  • Meglio

    Meglio - 2014-07-10

    I can also confirm this bug:

    It leads to unexpected behavior, for example when selecting TR tags inside the TBODY section: TRs from THEAD also get included (which I did not want, and before finding this bug report I could not understand why it was happening).

     
  • Albirew

    Albirew - 2016-05-19

    bug still present in 2016

     
  • LogMANOriginal

    LogMANOriginal - 2018-12-15
    • status: open --> closed
    • assigned_to: LogMANOriginal
     
  • LogMANOriginal

    LogMANOriginal - 2018-12-15

    Thanks for reporting this issue!

    You are right, this is incorrect behavior for CSS selectors, yet it was necessary for XPath selectors generated by browsers (although handling those is not the parser's responsibility). The reason is that browsers add a 'tbody' element to tables that don't have one, while the raw document, which is what you'd normally pass to the parser, stays untouched.

    This is fixed in [f24dd8] by removing the offending line. Existing selectors that relied on the old behavior must be updated (remove 'tbody') to keep returning the same results. Otherwise results may change (e.g. an element is no longer found, or an index suddenly points to the wrong item).
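As a rough illustration of why a browser-generated path needs 'tbody' removed when it is applied to the raw document, here is a sketch using PHP's bundled DOM extension rather than this library. It assumes that DOMDocument::loadHTML, like this parser, does not insert a missing tbody; the markup and the path are invented for the example.

```php
<?php
// Raw document as served: the table has no tbody element.
$raw = '<html><body><table><tr><td>cell</td></tr></table></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($raw);
$xpath = new DOMXPath($doc);

// Path as a browser's "copy XPath" would report it, tbody included,
// because the browser adds tbody to its own DOM.
$browserPath = '/html/body/table/tbody/tr/td';

// The same path with the tbody step removed, matching the raw markup.
$rawPath = str_replace('/tbody', '', $browserPath);

$browserHits = $xpath->query($browserPath)->length;
$rawHits = $xpath->query($rawPath)->length;

printf("%s: %d match(es)\n", $browserPath, $browserHits);
printf("%s: %d match(es)\n", $rawPath, $rawHits);
```

The same adjustment applies to selectors passed to find() after the fix: a selector copied from a browser should drop 'tbody' unless the raw document actually contains it.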

     

    Related

    Commit: [f24dd8]

  • Einat Dagan

    Einat Dagan - 2020-03-20

    This bug is still with us :-( in the Corona days of 2020.

     
    • LogMANOriginal

      LogMANOriginal - 2020-05-21

      Can you give me an example that fails for you?

       
