In this post I am going to show you how to effectively remove comments from an XML document using the combination of XMLModifier and XPath. The input XML document looks like the following.
<clients>
  <!-- some other code here -->
  <function> </function>
  <function> </function>
  <function>
    <name>data_values</name>
    <variables>
      <variable><!-- some other code here -->
        <name>temp</name>
        <!-- some other code here -->
        <type>double</type>
      </variable>
    </variables><!-- some other code here -->
    <block><!-- some other code here -->
      <opster>temp = 1</opster>
    </block>
  </function>
</clients>
The code that performs the task is listed below. The key is the XPath expression "//comment()", which selects all the comment nodes in the document. After binding the VTDNav object to the XMLModifier object, you simply call the remove() method for each match. It removes not only the content of the comment but also the surrounding delimiters (i.e. <!-- and -->).
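import com.ximpleware.*;
import java.io.*;

public class removeNodesDemo {
    public static void main(String[] args) throws VTDException, IOException {
        // Parse the input file without namespace awareness
        VTDGen vg = new VTDGen();
        if (!vg.parseFile("d:\\xml\\input2.xml", false))
            return;
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        // Bind the VTDNav object to the XMLModifier object
        XMLModifier xm = new XMLModifier(vn);
        // Select every comment node in the document
        ap.selectXPath("//comment()");
        int i = 0;
        while ((i = ap.evalXPath()) != -1) {
            // Remove the node at the current cursor position
            xm.remove();
        }
        // Write the modified document to a new file
        xm.output("d:\\xml\\output2.xml");
    }
}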
The output XML is
<clients>
  <function> </function>
  <function> </function>
  <function>
    <name>data_values</name>
    <variables>
      <variable>
        <name>temp</name>
        <type>double</type>
      </variable>
    </variables>
    <block>
      <opster>temp = 1</opster>
    </block>
  </function>
</clients>
You might ask: what if I want to remove an attribute node, a text node, a CDATA node, an element node, or a processing instruction node?
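XMLModifier's remove() method acts on the node at the current cursor position, and in each case it removes the whole node rather than just its content: an element is removed with its entire fragment, an attribute as a name/value pair, and a text, CDATA, comment, or processing instruction node along with any delimiting text.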
In other words, to remove all processing instruction nodes, just substitute the XPath expression above with "//processing-instruction()".
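Since only the XPath expression changes from one node type to the next, the loop can be wrapped in a small helper that takes the expression as a parameter. The following is a minimal sketch, not part of the original demo: the class name RemoveByXPath and the method removeNodes are made up for illustration, and it works on an in-memory string (via VTDGen's setDoc() and XMLModifier's output(OutputStream)) instead of the files used above. It also assumes the selected nodes do not nest inside one another.

```java
import com.ximpleware.AutoPilot;
import com.ximpleware.VTDException;
import com.ximpleware.VTDGen;
import com.ximpleware.VTDNav;
import com.ximpleware.XMLModifier;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class RemoveByXPath {

    // Hypothetical helper: removes every node selected by the given XPath
    // expression and returns the resulting document as a string.
    // Assumes the input is UTF-8 and the XPath needs no namespace bindings.
    static String removeNodes(String xml, String xpath)
            throws VTDException, IOException {
        VTDGen vg = new VTDGen();
        vg.setDoc(xml.getBytes(StandardCharsets.UTF_8));
        vg.parse(false); // false = no namespace awareness, as in the demo

        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        XMLModifier xm = new XMLModifier(vn);

        ap.selectXPath(xpath);
        while (ap.evalXPath() != -1) {
            // Removes the node at the cursor, delimiters included
            xm.remove();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        xm.output(out);
        return out.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><!-- note --><b/></a>";
        System.out.println(removeNodes(xml, "//comment()"));
        System.out.println(removeNodes(xml, "//b"));
    }
}
```

Calling removeNodes(xml, "//processing-instruction()") would follow the same pattern for processing instructions.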