Re: [Htmlparser-user] iterate through node list
From: Mattia T. <mat...@gm...> - 2007-09-17 11:44:59
Hi,

Try this. In a new class, add these imports:

    import java.net.MalformedURLException;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.List;

    import org.htmlparser.Node;
    import org.htmlparser.Parser;
    import org.htmlparser.tags.*;
    import org.htmlparser.util.*;
    import org.htmlparser.visitors.ObjectFindingVisitor;

then insert the following method:

    protected URL[] extractLinks(String url) throws ParserException {
        Parser parser = new Parser(url);

        // Collect every LinkTag (<A HREF=...>) found on the page.
        ObjectFindingVisitor visitor = new ObjectFindingVisitor(LinkTag.class);
        parser.visitAllNodesWith(visitor);
        Node[] nodes = visitor.getTags();

        List<URL> links = new ArrayList<URL>();
        for (int i = 0; i < nodes.length; i++) {
            LinkTag link = (LinkTag) nodes[i];
            System.out.println(link.getLink() + " " + link.getLinkText());
            try {
                links.add(new URL(link.getLink()));
            } catch (MalformedURLException murle) {
                // Report the unparsable href and continue with the next link.
                murle.printStackTrace();
            }
        }
        return links.toArray(new URL[links.size()]);
    }

Hope this helps.

Cheers,
Mattia

2007/9/17, Nic Soltani <oo...@gm...>:
> Hi,
> I created a NodeList which contains hyperlinks extracted from an HTML
> webpage. I need to be able to iterate through every single node and
> extract its href.
> Wondering if anyone can help me with:
>
> 1. how to iterate over the nodes one by one
> 2. how to extract the href
>
> NodeList URLs = ExtractHyperLinks(HTML);
> /*
>  * at this stage we have all:
>  * <A HREF="link1">something1</A>
>  * <A HREF="link2">something2</A>
>  * <A HREF="link3">something3</A>
>  * <A HREF="link4">something4</A>
>  * <A HREF="link5">something5</A>
>  */
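
The conversion step in the reply (turning each extracted href into a URL and skipping the ones that fail to parse) can be tried on its own, without HTMLParser on the classpath. The following is only a minimal sketch using the standard library: the class name LinkCollector, the method toUrls, and the sample hrefs are made up for illustration and are not part of HTMLParser. In the real method the strings would come from link.getLink().

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of HTMLParser.
class LinkCollector {

    // Converts href strings to URL objects, skipping any that cannot be parsed.
    static URL[] toUrls(List<String> hrefs) {
        List<URL> urls = new ArrayList<>();
        for (String href : hrefs) {
            try {
                urls.add(new URL(href));
            } catch (MalformedURLException e) {
                // A bad href should not abort the whole extraction.
                System.err.println("Skipping malformed link: " + href);
            }
        }
        return urls.toArray(new URL[0]);
    }

    public static void main(String[] args) {
        // Sample input (assumed); the middle entry has no protocol and is skipped.
        URL[] urls = toUrls(Arrays.asList(
                "http://example.com/a",
                "not a url",
                "https://example.org/b"));
        for (URL u : urls) {
            System.out.println(u);
        }
    }
}
```

Keeping the try/catch inside the loop, as in the reply, means one malformed href is reported and skipped while the remaining links are still returned.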