[Htmlparser-user] format problem of text file after conversion of html to text file
From: ChennaDulla <che...@go...> - 2003-02-07 14:27:44
Hi,

I downloaded the htmlparser1.2 zip, put the htmlparser.jar file under lib on my server, and put the org folder under web_inf. It works fine for converting HTML to a text file, but the problem is the format of the text file. When I look at the text file after conversion, the formatting is very poor: the text is written to the file with no particular layout at all. Why is this happening?

Here is the code I am using to convert HTML to a text file:

    import org.htmlparser.util.HTMLEnumeration;
    import org.htmlparser.util.HTMLParserException;
    import org.htmlparser.HTMLNode;
    import org.htmlparser.HTMLParser;
    import java.io.*;
    import java.util.Properties;

    public class StringExtractor {
        // String htmlFile = "/export/a.html";

        public StringExtractor() {
        }

        public void extractStrings(String htmlFile) throws HTMLParserException {
            try {
                HTMLParser parser = new HTMLParser(htmlFile);
                BufferedWriter thewriter = new BufferedWriter(new FileWriter("/export/d.txt"));
                HTMLNode node;
                StringBuffer results = new StringBuffer();
                // Write the plain text of every node, one after another.
                for (HTMLEnumeration e = parser.elements(); e.hasMoreNodes();) {
                    node = e.nextHTMLNode();
                    thewriter.write(node.toPlainTextString());
                }
                thewriter.close();
            } catch (IOException e) {
                System.out.println("error in ConvertJspToHtml.java===" + e);
            }
        }
    }

What changes do I have to make to get the HTML file in a readable format? If I run the code above, the text file is generated, but the format does not look good. Any help on this, please.

I am sending one file as an attachment; the output I am getting in the text file looks like that.

Thanks.

> -----Original Message-----
> From: htm...@li...
> [mailto:htm...@li...] On Behalf Of
> dha...@or...
> Sent: Thursday, February 06, 2003 11:47 PM
> To: htm...@li...
> Subject: RE: [Htmlparser-user] strip comments HTML source
>
> << File: BDY.RTF >> << File: BDY.RTF >>
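One possible cause of the run-together output is that the loop writes each node's text back to back, with no separator and with whatever whitespace the HTML source happened to contain. The following is only a minimal sketch of one way to tidy that up, assuming each fragment is a plain string such as the value returned by node.toPlainTextString(). TextFormatter, normalize and joinFragments are hypothetical names made up for this illustration and are not part of htmlparser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of htmlparser): puts each text fragment
// on its own line and collapses stray whitespace inside it.
public class TextFormatter {

    // Collapse runs of spaces, tabs and newlines into single spaces
    // and strip leading/trailing whitespace.
    public static String normalize(String fragment) {
        return fragment.replaceAll("\\s+", " ").trim();
    }

    // Join fragments one per line, skipping fragments that are empty
    // after normalization (for example whitespace-only text between tags).
    public static String joinFragments(List<String> fragments) {
        String newline = System.lineSeparator();
        StringBuilder out = new StringBuilder();
        for (String fragment : fragments) {
            String line = normalize(fragment);
            if (!line.isEmpty()) {
                out.append(line).append(newline);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-ins for the strings a parser might hand back node by node.
        List<String> fragments = new ArrayList<>();
        fragments.add("  Welcome\n   to my page ");
        fragments.add("\n\t ");
        fragments.add("Second   paragraph");
        System.out.print(joinFragments(fragments));
    }
}
```

In the loop from the post, the equivalent change would be to write the normalized text and then call thewriter.newLine() after each non-empty fragment, rather than writing the raw string on its own. Whether one line per node gives a pleasant layout depends on the page, so this is a starting point, not a complete formatter.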