From: SourceForge.net <no...@so...> - 2010-07-02 12:21:08
The following forum message was posted by asheara at http://sourceforge.net/projects/jtidy/forums/forum/41437/topic/3758644:

Hi all,

I'm trying to extract a div element, together with its content, from an HTML file. Then I want to write a new HTML file whose only content is the extracted div.

Everything is OK until the point where I call tidy.pprint(w3cDoc, bos). After this pprint call I inspect bos and it is empty.

I have never worked with JTidy, and it's hard to find examples or tutorials. Any ideas?

Thank you.

This is the complete code block:

[code]
// Parse the HTML file into a W3C DOM with JTidy
Document doc = tidy.parseDOM(new FileInputStream("myFile.html"), null);

// Convert the W3C DOM to dom4j so the div can be selected with XPath
DOMReader reader = new DOMReader();
org.dom4j.Document dom4jDoc = reader.read(doc);
String node = "//div[@id='contenedor']";
Node myNode = dom4jDoc.selectSingleNode(node);

// Detach the div from its original document
myNode.setDocument(null);
myNode.setParent(null);

// Create a new document containing only the div
org.dom4j.Document newHTML = DocumentHelper.createDocument();
newHTML.add(myNode);

// Convert back to a W3C DOM and pretty-print it with JTidy
DOMWriter writer = new DOMWriter();
try {
    Document w3cDoc = writer.write(newHTML);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    tidy.pprint(w3cDoc, bos);
} catch (DocumentException e) {
    // DOMWriter.write can throw DocumentException
    e.printStackTrace();
}
[/code]
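One way to narrow this down, sketched with JDK classes only: serialize a W3C document with a standard `Transformer` instead of `tidy.pprint` and check whether any bytes come out. If this gives output for `w3cDoc`, the DOM itself is probably fine and the problem lies in the pprint step. The class and method names here (`DomDump`, `buildSample`, `serialize`) are hypothetical, and the sample document only stands in for `w3cDoc`. The sketch does not use JTidy and does not explain why pprint produced nothing.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomDump {

    // Hypothetical stand-in for w3cDoc: a document whose root element is the div.
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element div = doc.createElement("div");
            div.setAttribute("id", "contenedor");
            div.setTextContent("hello");
            doc.appendChild(div);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes any W3C Document to bytes using the JDK's identity Transformer.
    static byte[] serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(bos));
            return bos.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] out = serialize(buildSample());
        // A length of 0 here would mean the document itself has no content.
        System.out.println(out.length + " bytes: "
                + new String(out, StandardCharsets.UTF_8));
    }
}
```

In the code from the post, calling `DomDump.serialize(w3cDoc)` right before `tidy.pprint(w3cDoc, bos)` would show whether the converted document already contains the div at that point.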