Re: [Htmlparser-user] How to extract more than one tag by only once parsering?
From: Ian M. <ian...@gm...> - 2006-08-04 10:42:24
As long as you keep the original reference to the NodeList created by Parser.parse, and you haven't modified that NodeList, you should be able to reuse it instead of parsing the page again, I think.

Ian

On 8/3/06, Jesse Hou <hp...@gm...> wrote:
> Hi All,
>
> While using the htmlparser library I ran into a difficulty. An HTML page
> contains many tags such as title, div, input, span and so on. For example:
>
> <title>this is a test </title>
>
> //...... any other tags
>
> <div class="A">
> <span class="B"><a href="www.google.com">google</a></span>
> </div>
>
> //...... any other tags
>
> <div class="C">
> <div class="D"><input type="text" id="E" value="msn" /></div>
> </div>
>
> //...... any other tags
>
> <div class="C">
> <div class="E"><span class="B"><input type="text" id="E" value="aol"
> /><a href="www.live.com">live</a></span></div>
> </div>
>
> In this example the whole HTML may include many tags. If I want to get the
> content 'this is a test', I can use a TagNameFilter, but I have to parse
> the whole HTML. If I want to get the content 'google' or 'www.google.com',
> I have to parse the whole HTML a second time, and if I want to get
> 'msn', 'aol' and 'live', I have to parse the whole HTML several more times.
> This way I can get the content I need, but it will probably hurt
> performance. Is there any other way to do that? Maybe I can also use an
> OrFilter to get the nodes, but then how can I tell which tag a given piece
> of text came from? I want to store the values in a DB, and I have no idea
> how to do that while parsing the HTML only once (for the best performance).
> I beg your help. :-)
>
> Thanks and Best Regards
>
> Jesse
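
To make the "parse once, query many times" idea concrete, here is a rough sketch. It is not htmlparser code: it uses the JDK's built-in DOM and XPath classes as a stand-in, which only works here because the sample fragment (wrapped in an assumed root element, with the stray spaces in the href values trimmed) happens to be well-formed XML. The class name ParseOnceDemo, the extract method and the query labels are all made up for illustration. With htmlparser itself, the analogous move would presumably be to keep the NodeList from Parser.parse and call extractAllNodesThatMatch on it with a different filter for each kind of value.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParseOnceDemo {
    // Fragment adapted from the question; the <html> root is an assumption
    // added so that the JDK XML parser accepts it.
    static final String SAMPLE =
        "<html><title>this is a test </title>"
        + "<div class=\"A\"><span class=\"B\"><a href=\"www.google.com\">google</a></span></div>"
        + "<div class=\"C\"><div class=\"D\"><input type=\"text\" id=\"E\" value=\"msn\" /></div></div>"
        + "<div class=\"C\"><div class=\"E\"><span class=\"B\">"
        + "<input type=\"text\" id=\"E\" value=\"aol\" /><a href=\"www.live.com\">live</a>"
        + "</span></div></div></html>";

    // Parses the markup exactly once, then runs every query against the
    // in-memory tree. Results are keyed by a label, so each value records
    // which kind of tag or attribute it came from.
    static Map<String, List<String>> extract(String markup) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(markup)));

        // Hypothetical labels mapped to the XPath expression that finds them.
        Map<String, String> queries = new LinkedHashMap<>();
        queries.put("title", "//title/text()");
        queries.put("linkText", "//a/text()");
        queries.put("linkHref", "//a/@href");
        queries.put("inputValue", "//input/@value");

        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, List<String>> results = new LinkedHashMap<>();
        for (Map.Entry<String, String> q : queries.entrySet()) {
            NodeList hits = (NodeList) xpath.evaluate(q.getValue(), doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                values.add(hits.item(i).getNodeValue().trim());
            }
            results.put(q.getKey(), values);
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        extract(SAMPLE).forEach((label, values) -> System.out.println(label + " = " + values));
    }
}
```

Because every value comes back under its own label, writing the results to a DB is just a loop over the map. The cost of the extra queries should be small next to the cost of parsing, since they only walk a tree that is already in memory.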