Extract contents of some tags from an HTML file

  • fatima

    fatima - 2007-04-12

    I am trying to use HTML Parser to convert HTML files to XML files, so I need to obtain the contents of some tags (e.g. title, href, ...). If anyone has documents related to this, could you please send them to me? I have no idea how to use HTML Parser to handle this. Thanks.

    • varma10

      varma10 - 2007-07-18

          I am also working on something similar to yours. I need to extract some tags (td, anchor, ...). Did you find an effective way to do that using HTML Parser? I did the following to extract them; see if it works for you:

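                 // Needs org.htmlparser.Parser, org.htmlparser.util.NodeList,
                 // org.htmlparser.filters.TagNameFilter and org.htmlparser.nodes.TagNode.
                 Parser parser = new Parser("http://");
                 NodeList nl = parser.parse(null); // null filter: keep every node
                 // Extract all "td" tags, searching nested nodes recursively
                 NodeList a1 = nl.extractAllNodesThatMatch(new TagNameFilter("td"), true);
                 for (int i = 0; i < a1.size(); i++) {
                     TagNode tag = (TagNode) a1.elementAt(i);
                     String tit = tag.toPlainTextString(); // text inside the tag
                     System.out.println("td is " + tit);
                 }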
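
          For the original question (title and href values turned into XML), here is a rough sketch of the same idea. Note that it does not use the HTML Parser library from this thread: it swaps in the JDK's built-in javax.swing.text.html parser so it runs without extra jars. The names TagExtractor and Collector and the output elements page and link are made up for illustration only, so adapt them to whatever XML layout you actually need.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of the HTML Parser library.
public class TagExtractor {

    /** Collects the <title> text and every <a href="..."> value. */
    public static class Collector extends HTMLEditorKit.ParserCallback {
        public final StringBuilder title = new StringBuilder();
        public final List<String> hrefs = new ArrayList<>();
        private boolean inTitle = false;

        @Override
        public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
            if (tag == HTML.Tag.TITLE) {
                inTitle = true;
            } else if (tag == HTML.Tag.A) {
                Object href = attrs.getAttribute(HTML.Attribute.HREF);
                if (href != null) {
                    hrefs.add(href.toString());
                }
            }
        }

        @Override
        public void handleEndTag(HTML.Tag tag, int pos) {
            if (tag == HTML.Tag.TITLE) {
                inTitle = false;
            }
        }

        @Override
        public void handleText(char[] data, int pos) {
            if (inTitle) {
                title.append(data);
            }
        }
    }

    /** Parses an HTML string and returns what the callback collected. */
    public static Collector extract(String html) {
        Collector collector = new Collector();
        try {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector;
    }

    /** Escapes the characters that are special in XML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Writes the collected values as a small XML fragment (made-up layout). */
    public static String toXml(Collector c) {
        StringBuilder xml = new StringBuilder("<page>\n");
        xml.append("  <title>").append(escape(c.title.toString())).append("</title>\n");
        for (String href : c.hrefs) {
            xml.append("  <link href=\"").append(escape(href)).append("\"/>\n");
        }
        return xml.append("</page>\n").toString();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Demo</title></head><body>"
                + "<a href=\"http://example.com/a\">A</a> <a href=\"b.html\">B</a>"
                + "</body></html>";
        System.out.print(toXml(extract(html)));
    }
}
```

          With HTML Parser itself the shape would probably be much the same: filter for the tags you want, read their text or attributes, then write them out as XML.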

  • Anonymous - 2012-03-07

    never mind!

  • Anonymous - 2012-03-07

    never mind

