How can I remove tags together with their content, and then save the result to a file?
I am using the NodeVisitor pattern; my code is below.
The area marked with ^^^^ is where I need help.
Could someone give me a hand?
Thank you.
May God bless you all.
--------------------------------------------
import org.htmlparser.*;
import org.htmlparser.tags.*;
import org.htmlparser.util.ParserException;
import org.htmlparser.visitors.NodeVisitor;

class MyVisitor extends NodeVisitor
{
    public void visitTag (Tag tag)
    {
        if (tag instanceof ScriptTag)
        {
            // ^^^^ I want to remove the ScriptTag and its content here.
        }
    }
}

public class ToHtmlDemo
{
    public static void main (String[] args) throws ParserException
    {
        Parser parser = new Parser ("http://www.yzu.edu.tw");
        parser.visitAllNodesWith (new MyVisitor ());
    }
}
You should be able to collect all the nodes you want to remove in a NodeList. First gather the top-level nodes:
    NodeList list = new NodeList (); // top level nodes gathered
    NodeIterator iterator = parser.elements ();
    while (iterator.hasMoreNodes ())
        list.add (iterator.nextNode ());
Then visit them, letting your visitor record every node it wants removed:
    for (int i = 0; i < list.size (); i++)
        list.elementAt (i).accept (myvisitor);
After completion of the visiting, run through the matching nodes and do something like:
    // getMatchingNodes () is an accessor you would add to your own visitor
    NodeList matching_ones = myvisitor.getMatchingNodes ();
    for (int i = 0; i < matching_ones.size (); i++)
    {
        Node item = matching_ones.elementAt (i);
        ...
        // a top-level node has no parent, so it is removed from list itself
        NodeList siblings = (null == item.getParent ()) ? list : item.getParent ().getChildren ();
        for (int j = 0; j < siblings.size (); j++)
            if (item == siblings.elementAt (j))
            {
                siblings.remove (j);
                break;
            }
    }
Then print the list of all nodes in a similar manner:
    for (int i = 0; i < list.size (); i++)
        System.out.print (list.elementAt (i).toHtml ());
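To make the two-pass idea above concrete (record the matches first, detach them afterwards, so no child list is changed while it is being walked), here is a minimal library-independent sketch. SketchNode and CollectThenRemove are hypothetical names invented for illustration; they are not part of the HTML Parser API and only mimic the parent/children relationship the steps above rely on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed node; this is NOT the HTML Parser API.
final class SketchNode {
    final String name;
    final String text; // text content of the node, may be empty
    SketchNode parent;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, String text) {
        this.name = name;
        this.text = text;
    }

    // Attach a child and return this node so calls can be chained.
    SketchNode add(SketchNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    // Serialize this node and its whole subtree.
    String toHtml() {
        StringBuilder sb = new StringBuilder("<" + name + ">" + text);
        for (SketchNode child : children) sb.append(child.toHtml());
        return sb.append("</").append(name).append(">").toString();
    }
}

final class CollectThenRemove {
    // Pass 1: walk the tree and only record matches; nothing is modified yet.
    static void collect(SketchNode node, String tagName, List<SketchNode> matches) {
        if (node.name.equalsIgnoreCase(tagName)) {
            matches.add(node);
            return; // the subtree disappears together with the node
        }
        for (SketchNode child : node.children) collect(child, tagName, matches);
    }

    // Pass 2: detach every recorded node from its parent's child list.
    static void removeAll(List<SketchNode> matches) {
        for (SketchNode item : matches) {
            if (item.parent != null) {
                item.parent.children.removeIf(c -> c == item); // identity comparison
                item.parent = null;
            }
        }
    }
}
```

Calling collect on the root with "script", then removeAll on the gathered list, leaves a tree whose toHtml () output no longer contains the script nodes or their text. A matching root node itself has no parent and is left in place by this sketch.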
Many thanks for your reply.
My code is below.
It successfully removes the ScriptTag and its content,
but it does not remove the "JavaScript:" reference inside an 'a' tag.
I mean the following line:
<a href="JavaScript:loadwindow(106,90);" style="font-family:Verdana;">元智Intranet</a>
Could someone give me a hand?
Thank you.
May God be with you.
---------------------------------------------------------------
import org.htmlparser.*;
import org.htmlparser.filters.*;
import org.htmlparser.util.*;

public class ToHtmlDemoTest
{
    public static void main (String[] args) throws ParserException
    {
        NodeList list = new NodeList ();
        // keep every node that is NOT a tag named "Script"
        NodeFilter filter = new NotFilter (new TagNameFilter ("Script"));
        Parser parser = new Parser ("http://www.yzu.edu.tw");
        NodeIterator iterator = parser.elements ();
        while (iterator.hasMoreNodes ())
        {
            list.add (iterator.nextNode ());
        }
        // true = also apply the filter to child nodes
        list.keepAllNodesThatMatch (filter, true);
        for (int i = 0; i < list.size (); i++)
            System.out.print (list.elementAt (i).toHtml ());
    }
}
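A possible direction for the remaining 'a' tag, as a sketch only: TagNameFilter compares tag names, so an 'a' tag passes it even when its href starts with "JavaScript:". The check therefore has to look at the attribute value instead. The helper below uses only the standard library; ScriptLinkCheck and isJavascriptHref are hypothetical names, not part of HTML Parser.

```java
import java.util.Locale;

// Hypothetical helper, not part of the HTML Parser library.
final class ScriptLinkCheck {
    // True when an href value uses the javascript: pseudo-protocol,
    // ignoring letter case and surrounding whitespace.
    static boolean isJavascriptHref(String href) {
        if (href == null) return false; // tag without an href attribute
        return href.trim().toLowerCase(Locale.ROOT).startsWith("javascript:");
    }
}
```

Inside a visitor it could presumably be fed the tag's href value (for example from tag.getAttribute ("href"), assuming your library version offers that method), and a tag for which it returns true could then be recorded for removal in the same way as the script tags in the reply.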