>
> Dhaval Udani wrote:
> > Considering that HTMLParser is a SAX-based parser, it should be
> > possible to have all the nodes at the first level itself as a flat
> > structure. Additionally the embedded nodes should also be referenced
> > as children of other nodes. Am I correct in the understanding or is
> > there something that I have missed out.
>
> This is done with the HtmlScanner - registered when you called
> registerDomScanners().
>
Somik, I am slightly confused here. Even if I register HtmlScanner, how
will I get a flat structure of all the nodes? Won't I still get a tree-like
representation, so that I have to walk through the children of each node?
Dhaval