Extract contents of some tags from an HTML file

fatima
2007-04-12
2013-04-27
  • fatima

    fatima - 2007-04-12

    Hi,
    I am trying to use HTML Parser to convert HTML files to XML files, so I need to obtain the contents of some tags (e.g. title, href, ...). If anyone has documents related to this, could you please send them to me? I have no idea how to use HTML Parser to handle this. Thanks.

     
    • varma10

      varma10 - 2007-07-18

      Hi,
      I am also working on something similar to yours. I need to extract some tags (td, anchor, ...). Did you find an effective way to do that using HTML Parser? I did the following for extraction; see if it works for you:

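                 import org.htmlparser.Parser;
                 import org.htmlparser.filters.TagNameFilter;
                 import org.htmlparser.nodes.TagNode;
                 import org.htmlparser.util.NodeList;

                 // Parser(...) and parse(...) can throw ParserException
                 Parser parser = new Parser("http://");
                 NodeList nl = parser.parse(null);   // null filter: keep every node
                 // extract tags named "td", searching nested nodes as well
                 NodeList a1 = nl.extractAllNodesThatMatch(new TagNameFilter("td"), true);

                 int len = a1.size();
                 for (int i = 0; i < len; i++)
                 {
                     TagNode tag = (TagNode) a1.elementAt(i);

                     String tit = tag.toPlainTextString(); // extracts the text into a string
                     System.out.println("td is " + tit);
                 }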
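
Since the original question was about ending up with XML, here is a rough sketch of what could come after the loop above: instead of printing each string, collect them and wrap them in XML elements. It uses only the JDK, not HTML Parser. The class name `TdToXml` and the element names `cells` and `cell` are made up for illustration, and the sample strings stand in for whatever `toPlainTextString()` returns on a real page.

```java
import java.util.Arrays;
import java.util.List;

public class TdToXml {

    // Escape the five characters that are special in XML text and attribute values.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap each extracted string in a <cell> element under a <cells> root.
    // Both element names are hypothetical; pick whatever the target XML needs.
    public static String toXml(List<String> texts) {
        StringBuilder sb = new StringBuilder("<cells>\n");
        for (String t : texts) {
            sb.append("  <cell>").append(escape(t.trim())).append("</cell>\n");
        }
        return sb.append("</cells>\n").toString();
    }

    public static void main(String[] args) {
        // Stand-in values for the text pulled out of the td tags.
        List<String> texts = Arrays.asList("Tom & Jerry", " a < b ");
        System.out.print(toXml(texts));
    }
}
```

Escaping matters here because text taken from HTML often contains `&` or `<`, which would otherwise make the XML invalid. For anything bigger than a few elements, a proper XML writer from the JDK is probably a better choice than building strings by hand.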

       
  • Anonymous

    Anonymous - 2012-03-07

    never mind!

     
  • Anonymous

    Anonymous - 2012-03-07

    never mind

     
  • Anonymous

    Anonymous - 2012-03-07

    wtf

     
  • Anonymous

    Anonymous - 2012-03-07

    sorry….

     
