Re: [Htmlparser-user] HTML parser for HTML translation
From: Somik R. <so...@ya...> - 2003-01-24 18:03:37
> What I want to do is to parse HTML code and translate the content and
> then put the translated text/content back into the original HTML
> structure.
>
> Is this HTML parser suitable for this kind of task?

By translating content, I guess you mean translating the meaningful text data (not the tags). That is easily possible. You can look at the StringExtractor example (org.htmlparser.parserapplications) or the StringFindingVisitor (org.htmlparser.visitors).

The simplest approach is to write your own visitor, say StringTranslatingVisitor, that runs through the entire HTML and translates each string it finds as you wish. Here is a sample program:

    import org.htmlparser.HTMLRemarkNode;
    import org.htmlparser.HTMLStringNode;
    import org.htmlparser.tags.HTMLEndTag;
    import org.htmlparser.tags.HTMLTag;
    import org.htmlparser.visitors.HTMLVisitor;

    public class StringTranslatingVisitor extends HTMLVisitor {
        // Accumulates the rebuilt document: tags and remarks verbatim, text translated.
        private final StringBuffer htmlData = new StringBuffer();

        public void visitStringNode(HTMLStringNode stringNode) {
            // Start from the node's original text.
            String yourStuff = stringNode.toHTML();
            // Perform modifications (the translation) on yourStuff here.
            // Finally, add to htmlData.
            htmlData.append(yourStuff);
        }

        public void visitTag(HTMLTag tag) {
            // Opening tags are copied through unchanged.
            htmlData.append(tag.toHTML());
        }

        public void visitEndTag(HTMLEndTag endTag) {
            // Closing tags are copied through unchanged.
            htmlData.append(endTag.toHTML());
        }

        public void visitRemarkNode(HTMLRemarkNode remarkNode) {
            // Comments are copied through unchanged.
            htmlData.append(remarkNode.toHTML());
        }

        public String getHtml() {
            return htmlData.toString();
        }
    }

To use this, create your parser and run the visitor over all nodes:

    HTMLParser parser = new HTMLParser("http://someurl.com");
    parser.registerScanners();
    StringTranslatingVisitor visitor = new StringTranslatingVisitor();
    parser.visitAllNodesWith(visitor);
    System.out.println(visitor.getHtml());

Regards,
Somik
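
For readers who want to try the "translate the text, copy the markup through" idea without the library on the classpath, here is a minimal standalone sketch. It does not use the htmlparser API at all: the class name TextOnlyTranslator, the translate method, and the regular expression are assumptions made for illustration, and a regex split like this is only a rough stand-in for a real parser (it will mishandle, for example, a ">" inside an attribute value or the contents of script blocks).

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of htmlparser: illustrates the same idea as the
// visitor (markup copied verbatim, text runs passed to a translator).
class TextOnlyTranslator {
    // Matches an HTML comment or a single tag; everything between matches is text.
    private static final Pattern MARKUP =
            Pattern.compile("<!--.*?-->|<[^>]*>", Pattern.DOTALL);

    static String translate(String html, UnaryOperator<String> translator) {
        StringBuilder out = new StringBuilder();
        Matcher m = MARKUP.matcher(html);
        int last = 0;
        while (m.find()) {
            // Text between the previous markup and this one gets translated.
            appendText(out, html.substring(last, m.start()), translator);
            // The tag or comment itself is copied through unchanged.
            out.append(m.group());
            last = m.end();
        }
        // Trailing text after the final tag.
        appendText(out, html.substring(last), translator);
        return out.toString();
    }

    private static void appendText(StringBuilder out, String text,
                                   UnaryOperator<String> translator) {
        // Whitespace-only runs are layout, not content, so leave them alone.
        if (text.trim().isEmpty()) {
            out.append(text);
        } else {
            out.append(translator.apply(text));
        }
    }
}
```

Calling TextOnlyTranslator.translate(html, String::toUpperCase) uses upper-casing as a placeholder for a real translation step; any function from String to String can be passed instead.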