
match node starting from a given string index

Help
moeing
2005-10-13
2013-04-27
  • moeing

    moeing - 2005-10-13

    Hi,

    Is it possible to start searching for nodes in an HTML file from a given string position?

    Say I know the HTML code and text around the part of the file where the text of interest appears, and that this string is unique. Is there a way to start looking forward or backward from that string index for a given node type, rather than looping over all nodes of that type and checking each for a given string (which may also appear elsewhere)?

    thanks.

     
    • Derrick Oswald

      Derrick Oswald - 2005-10-14

      You could use a stateful filter.
      The filter would subclass StringFilter or RegexFilter and add a state flag that is set when the string is first encountered.  This only works forward from that point, though.

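      import org.htmlparser.Node;
      import org.htmlparser.filters.StringFilter;

      public class StatefulStringFilter extends StringFilter
      {
        boolean mTriggered; // goes true when pattern seen

        // ... usual constructors delegating to superclass

        // must be public: it overrides the public accept() of NodeFilter
        public boolean accept (Node node)
        {
          boolean ret;
          // return true if pattern already seen or node matches
          ret = mTriggered || super.accept (node);
          mTriggered = ret; // latch: once true, stays true
          return (ret);
        }
      }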

      Then you can use this filter ANDed with your 'other' filter, the one that should only take effect after the string has been found:

      parser.extractAllNodesThatMatch (
        new AndFilter (
          new StatefulStringFilter ("<pattern>"),
          new DoThisAfterSeeingPatternFilter (yadda)));
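
To see the latching behaviour in isolation, here is a minimal sketch that does not use HTMLParser at all. It assumes plain strings in place of nodes and java.util.function.Predicate in place of NodeFilter; the names LatchingTextFilter and LatchDemo are made up for illustration, and the substring match is a simplification of whatever StringFilter really does. The latch is evaluated first so that it sees every item, which is presumably why the stateful filter is the first operand of the AndFilter above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Stand-in for StatefulStringFilter (hypothetical): latches once the pattern is seen. */
final class LatchingTextFilter implements Predicate<String> {
    private final String pattern;
    private boolean triggered; // goes true when the pattern is first seen

    LatchingTextFilter(String pattern) {
        this.pattern = pattern;
    }

    @Override
    public boolean test(String text) {
        // true if the pattern was already seen or this text contains it
        triggered = triggered || text.contains(pattern);
        return triggered;
    }
}

final class LatchDemo {
    /** Keeps items that pass both filters, visiting them in document order. */
    static List<String> select(List<String> items,
                               Predicate<String> after,
                               Predicate<String> wanted) {
        List<String> kept = new ArrayList<>();
        for (String item : items) {
            // The latch is tested first so it is consulted for every item.
            if (after.test(item) && wanted.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> items =
            List.of("link: a", "Section marker", "text", "link: b", "link: c");
        // Only links that come after the marker survive.
        System.out.println(
            select(items, new LatchingTextFilter("marker"), s -> s.startsWith("link:")));
        // prints [link: b, link: c]
    }
}
```

Note that the item containing the marker itself also passes the latch, so it is kept whenever the second filter accepts it too.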

       
    • moeing

      moeing - 2005-10-15

      Hi,

      In the example you gave, the pattern (as in new StatefulStringFilter ("<pattern>")) can only be text (as in the text of a text node) and cannot be made up of partial HTML code plus text, right? It extends StringFilter, which only searches text.

      By the way, I think HtmlParser is great! I've looked at other parsers and found this one easiest to use.

      thanks.

       
      • Derrick Oswald

        Derrick Oswald - 2005-10-15

        True, the example uses text matching.
        One generalization would be for the stateful filter to wrap any other filter and, once that subordinate filter 'trips', always return true.

        import org.htmlparser.Node;
        import org.htmlparser.NodeFilter;

        public class StatefulFilter implements NodeFilter
        {
          NodeFilter mSubordinate; // sub filter to check
          boolean mTriggered; // goes true when subordinate filter goes true

          public StatefulFilter (NodeFilter subordinate)
          {
            mTriggered = false;
            mSubordinate = subordinate;
          }

          // must be public: it implements the public accept() of NodeFilter
          public boolean accept (Node node)
          {
            boolean ret;
            // return true if triggered or subordinate filter matches
            ret = mTriggered || mSubordinate.accept (node);
            mTriggered = ret; // latch: once true, stays true
            return (ret);
          }
        }

        Then you could use any other filter matching various text and HTML as the trigger:

        parser.extractAllNodesThatMatch (
          new AndFilter (
            new StatefulFilter (
              <complex filter>
            ),
            new DoThisAfterTriggeringFilter (yadda)));

        You can use the FilterBuilder application to build the complex filter, but the StatefulFilter obviously won't be available within that program.
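
One thing to watch with any filter of this kind: the flag survives between passes, so a reused instance stays triggered. The sketch below illustrates that with a generic latch over java.util.function.Predicate rather than the HTMLParser types. Latch, LatchReuseDemo and the reset() method are assumptions made for this example and are not part of the library or of the StatefulFilter above.

```java
import java.util.function.Predicate;

/** Generic latch (hypothetical): stays true once the wrapped predicate has matched. */
final class Latch<T> implements Predicate<T> {
    private final Predicate<T> subordinate; // predicate that trips the latch
    private boolean triggered;              // goes true when subordinate matches

    Latch(Predicate<T> subordinate) {
        this.subordinate = subordinate;
    }

    @Override
    public boolean test(T item) {
        // true if already triggered or the subordinate predicate matches
        triggered = triggered || subordinate.test(item);
        return triggered;
    }

    /** Assumed helper: clear the flag before starting another pass. */
    void reset() {
        triggered = false;
    }
}

final class LatchReuseDemo {
    /** Counts how many items the filter accepts in one pass. */
    static int countAccepted(Predicate<String> filter, String... items) {
        int count = 0;
        for (String item : items) {
            if (filter.test(item)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Latch<String> latch = new Latch<>(s -> s.equals("<hr>"));
        String[] doc = {"a", "<hr>", "b"};

        System.out.println(countAccepted(latch, doc)); // prints 2: "<hr>" and "b"
        System.out.println(countAccepted(latch, doc)); // prints 3: flag is still set
        latch.reset();
        System.out.println(countAccepted(latch, doc)); // prints 2 again
    }
}
```

With the real filter, creating a fresh StatefulFilter for each parse would have the same effect as the reset() shown here.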

         
