The parser doesn't really deal in lines of text, since most HTML
disregards linebreaks (the <pre> tag is the only exception I can think of).
What you probably want is the subsequent nodes. For this, use the children
of the parent of the node you have.
Two methods were recently added to AbstractNode (which TextNode
inherits from) to handle this: getPreviousSibling() and getNextSibling().
These are only available in the latest Integration Build.
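To illustrate the idea of sibling navigation through the parent's children, here is a minimal sketch in plain Java. SketchNode and all of its members are hypothetical stand-ins invented for this example; they are not the HTMLParser classes, and the library's own implementation may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed node; NOT the HTMLParser Node class.
class SketchNode {
    final String text;
    SketchNode parent;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String text) {
        this.text = text;
    }

    // Append a child and return this node so calls can be chained.
    SketchNode add(SketchNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    // Next sibling, found by locating this node in its parent's child list.
    SketchNode nextSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this);
        return i + 1 < parent.children.size() ? parent.children.get(i + 1) : null;
    }

    // Previous sibling, found the same way.
    SketchNode previousSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this);
        return i > 0 ? parent.children.get(i - 1) : null;
    }

    // Every node that follows this one under the same parent, in order.
    List<SketchNode> followingSiblings() {
        List<SketchNode> out = new ArrayList<>();
        for (SketchNode n = nextSibling(); n != null; n = n.nextSibling()) {
            out.add(n);
        }
        return out;
    }
}
```

Note that in the page quoted below the matched text sits inside <H2> and <CENTER>, so you would likely have to climb to an ancestor with getParent() before the following siblings contain anything useful.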
If you really want lines of text, the Page object available from the
parser can be asked to fetch a line with getLine().
This method has two overloads: one takes a Cursor argument, the other an
integer position.
The position is available from the node you have, via getStartPosition()
or getEndPosition().
That gets you the contents of the line in the HTML stream for the node
you have.
Subsequent lines are a little tougher to get hold of.
The line information is held in a PageIndex object, which the Page
doesn't expose, though it could if you added an accessor method.
With access to the PageIndex you could step through the lines of the file.
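As a rough sketch of what stepping through lines from a character position involves, here is a small plain-Java line index. LineIndexSketch is a hypothetical class written for this example only; it is not PageIndex and makes no claim about how PageIndex works internally. It assumes you already have the page text as a String and a character offset such as the one a node reports.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of HTMLParser: records where each line starts.
class LineIndexSketch {
    private final String text;
    private final List<Integer> starts = new ArrayList<>();

    LineIndexSketch(String text) {
        this.text = text;
        starts.add(0);
        for (int i = 0; i < text.length(); i++) {
            // A new line starts after each '\n', unless it ends the text.
            if (text.charAt(i) == '\n' && i + 1 < text.length()) {
                starts.add(i + 1);
            }
        }
    }

    int lineCount() {
        return starts.size();
    }

    // 0-based row containing the given character offset (binary search).
    int rowOf(int position) {
        int lo = 0;
        int hi = starts.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts.get(mid) <= position) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    // Contents of a row with its trailing line terminator removed.
    String line(int row) {
        int begin = starts.get(row);
        int end = row + 1 < starts.size() ? starts.get(row + 1) : text.length();
        String s = text.substring(begin, end);
        while (s.endsWith("\n") || s.endsWith("\r")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    // Up to 'count' lines following the line that contains 'position'.
    List<String> linesAfter(int position, int count) {
        List<String> out = new ArrayList<>();
        for (int r = rowOf(position) + 1; r < starts.size() && out.size() < count; r++) {
            out.add(line(r));
        }
        return out;
    }
}
```

With something like this, the offset from getEndPosition() on the matched node could be passed to linesAfter() to collect the next few lines of the raw HTML.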
Derrick
quanta veloce wrote:
> Hi,
>
> Can HTMLParser allow one to extract into an array lines before or
> after a search string?
>
> For instance:
>
> <CENTER>
> <TABLE ALIGN="CENTER" BORDER=5>
> <TR>
> <TD width=150 align=center><B>Area</B></TD>
> <TD width=120 align=center><B>Instantaneous Load</B></TD>
> </TR>
> <TR>
> <TD>PJM MID ATLANTIC REGION</TD>
> <TD align=right>33929</TD>
> </TR>
> <TR>
> <TD>PJM WESTERN REGION</TD>
> <TD align=right>39400</TD>
> </TR>
> <TR>
> <TD>PJM SOUTHERN REGION</TD>
> <TD align=right>9857</TD>
> </TR>
> <TR>
> <TD>PJM RTO</TD>
> <TD align=right>83186</TD>
> </TR>
> </TABLE>
> </CENTER>
> <P><CENTER>Loads are calculated from raw telemetry data and are
> approximate.</CENTER>
> <CENTER>The displayed values are NOT official PJM Loads.</CENTER>
> <BR><BR><BR>
> <P><CENTER><H2>Current PJM Transmission Limits</H2></CENTER>
> <P align=center>None
>
> </BODY>
> </HTML>
>
> In the following URL I matched the string "Current PJM Transmission
> Limits" and I want to obtain any and all lines after this match...or
> even the next 3 lines, etc.,
>
> Any help would be appreciated!
> Thanks,