Hello everybody,
I am new to HTMLParser. I am trying to parse a page that contains a <tr> tag like this:
<tr> Translation
<td>First : <a href="http://www.freetranslation.com" > </a></td>
</tr>
So far I have been able to pick up all the <tr> tags from a table. My code is as follows:
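import org.htmlparser.Node;
import org.htmlparser.Parser;
import org.htmlparser.filters.TagNameFilter;
import org.htmlparser.lexer.Lexer;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;
import org.htmlparser.visitors.TagFindingVisitor;

public class TestParse
{
    public static void main(String[] args)
    {
        try
        {
            // Collect every <table> tag in the page.
            Parser parser = new Parser("file:///E:/HTMLParser/test/a2.htm");
            String[] tagsToBeFound = {"table"};
            TagFindingVisitor visitor = new TagFindingVisitor(tagsToBeFound);
            parser.visitAllNodesWith(visitor);
            Node[] alltableTags = visitor.getTags(0);
            // Re-parse the first table's HTML and extract its <tr> tags.
            Lexer lex = new Lexer(alltableTags[0].toHtml());
            Parser newparse = new Parser(lex);
            // AndFilter needs two filters to combine; a lone TagNameFilter is enough here.
            NodeList list = newparse.extractAllNodesThatMatch(new TagNameFilter("TR"));
        }
        catch (ParserException e)
        {
            System.out.println("Parsing error " + e);
        }
    }
}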
I have been able to pick up all the <tr> tags from the first table and store them in the NodeList list.
But how can I print only those <tr> tags that have "Translation" in their text?
Could anybody help me out?
Shuvadeep
It should be something like:
NodeList list = newparse.extractAllNodesThatMatch (
    new AndFilter (
        new TagNameFilter ("TR"),
        new HasChildFilter (
            new StringFilter ("Translation", false),
            true)));
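To make it clearer what that filter combination is meant to select, here is a rough model of the matching logic in plain Java. It does not use HTMLParser at all: SimpleNode, textContains, hasMatchingDescendant and isMatchingRow are made-up names for illustration, and the behaviour they imitate (case-insensitive substring match on text nodes, recursive search through children) is my understanding of the filters rather than a copy of the library code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class RowFilterSketch {
    /** Hypothetical stand-in for a parsed node: either a tag with children or a text node. */
    static final class SimpleNode {
        final String tagName; // null for text nodes
        final String text;    // null for tag nodes
        final List<SimpleNode> children = new ArrayList<>();

        private SimpleNode(String tagName, String text) {
            this.tagName = tagName;
            this.text = text;
        }

        static SimpleNode tag(String name, SimpleNode... kids) {
            SimpleNode node = new SimpleNode(name, null);
            node.children.addAll(Arrays.asList(kids));
            return node;
        }

        static SimpleNode text(String text) {
            return new SimpleNode(null, text);
        }
    }

    /** Case-insensitive substring test on a text node (stands in for StringFilter(pattern, false)). */
    static boolean textContains(SimpleNode node, String pattern) {
        return node.text != null
            && node.text.toUpperCase(Locale.ROOT).contains(pattern.toUpperCase(Locale.ROOT));
    }

    /** True if any descendant is a matching text node (stands in for HasChildFilter(..., true)). */
    static boolean hasMatchingDescendant(SimpleNode node, String pattern) {
        for (SimpleNode child : node.children) {
            if (textContains(child, pattern) || hasMatchingDescendant(child, pattern)) {
                return true;
            }
        }
        return false;
    }

    /** Both conditions must hold (stands in for AndFilter(TagNameFilter("TR"), HasChildFilter(...))). */
    static boolean isMatchingRow(SimpleNode node, String pattern) {
        return "TR".equalsIgnoreCase(node.tagName) && hasMatchingDescendant(node, pattern);
    }

    public static void main(String[] args) {
        // Mirrors the <tr> from the question: text directly in the row, then a <td> with a link.
        SimpleNode wanted = SimpleNode.tag("tr",
            SimpleNode.text(" Translation"),
            SimpleNode.tag("td", SimpleNode.text("First : "), SimpleNode.tag("a")));
        SimpleNode other = SimpleNode.tag("tr",
            SimpleNode.tag("td", SimpleNode.text("Second")));

        System.out.println(isMatchingRow(wanted, "translation")); // prints "true"
        System.out.println(isMatchingRow(other, "translation"));  // prints "false"
    }
}
```

With the real library, the NodeList returned by extractAllNodesThatMatch would already contain only the matching rows, so printing them is just a loop over the list.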
You can use the FilterBuilder application to build and test filters graphically:
bin/filterbuilder
See http://sourceforge.net/project/screenshots.php?group_id=24399 for a screenshot. It has help and a tutorial that cover most of what you can do with it. When you 'Save As', it creates a Java file for you.
Thanks. It works. I am studying FilterBuilder and will post further queries as and when needed. Thank you very much once again.
Shuvadeep