From: Rajorshi B. <raj...@in...> - 2010-03-31 16:03:40
Hello there,

I'm trying to use jtidy to format/clean up some HTML contained in a Java String. What I see is that spaces are often lost. For instance, suppose the markup is

    hello world

The space (not an nbsp, but browsers and mail clients render it nevertheless) is lost, and the markup is transformed into:

    helloworld

Hence it shows up in a browser as "helloworld" instead of "hello world".

The following is my code. Am I doing something obviously wrong here?

Code:

    InputStream is = new ByteArrayInputStream(rawHtml.getBytes("utf8"));
    Tidy tidy = new Tidy();
    tidy.setInputEncoding("utf8");
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    tidy.parseDOM(is, baos);
    String pure = baos.toString("utf8");

Thanks in advance!
Raj
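
P.S. One thing that can be ruled out separately is the String-to-bytes conversion around the jtidy call. The sketch below uses only the JDK (no jtidy) and pushes a string through the same UTF-8 encode/decode steps. The class name RoundTripCheck and the sample markup are made up for illustration; the sample is not the exact markup from the case above.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripCheck {
    // Encodes the String as UTF-8 bytes and decodes it again,
    // mirroring the getBytes(...) / toString(...) steps without jtidy.
    static String roundTrip(String rawHtml) {
        byte[] bytes = rawHtml.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: two inline elements separated by a plain space.
        String sample = "<b>hello</b> <i>world</i>";
        // If this prints true, the encoding round trip keeps the space intact.
        System.out.println(sample.equals(roundTrip(sample)));
    }
}
```

If the round trip returns the string unchanged, the space is presumably being dropped inside the jtidy call rather than by the byte conversion.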