RE: [Htmlparser-user] Testing/feedback, question
From: Claude D. <CD...@ar...> - 2002-06-25 16:19:24
Looks like the output is on System.err:

    com\kizna\html\HTMLParser.java, Line 311:
        System.err.println("Error! File "+resourceLocn+" not found!");
    com\kizna\html\HTMLParser.java, Line 315:
        System.err.println("Error! URL "+resourceLocn+" Malformed!");
    com\kizna\html\HTMLParser.java, Line 319:
        System.err.println("I/O Exception occured while reading "+resourceLocn);

This is all in Version 1.1 (I need to use a production release, so that's what I've been testing so far). I'll check out the latest integration build to see if the same problems exist.

I'd like to hear your view(s) on writing to the console. In production releases this should never happen. If necessary, I'd encourage the use of a reporting callback, something like the SAX ErrorHandler interface, say:

    public interface HTMPParserFeedback {
        public void info(String message);
        public void warning(String message);
        public void error(String message, HTMLParserException e);
    }

With a DefaultHTMPParserFeedback implementation that goes to the console:

    public class DefaultHTMPParserFeedback implements HTMPParserFeedback {
        public void info(String message) {
            System.out.println("INFO: " + message);
        }
        public void warning(String message) {
            System.out.println("WARNING: " + message);
        }
        public void error(String message, HTMLParserException e) {
            System.out.println("ERROR: " + message);
            e.printStackTrace();
        }
    }

This approach is especially conducive to user configuration and helps eliminate errant output to the console. You can also add a debug(String message) method here. The callback enables the user of the library to send relevant output to any or all of: the console, log files, streams, etc. Of course, fatal errors should still be thrown back up the calling tree through exceptions, but non-fatal errors can easily be caught this way.
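As a rough sketch of the "log files, streams" case, a feedback implementation could write to any PrintStream rather than the console. The StreamFeedback class and the stub HTMLParserException below are hypothetical and exist only so the example compiles on its own; they are not part of the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class StreamFeedbackDemo {

    // Stand-in for the library's exception type (assumption, defined only for this sketch).
    static class HTMLParserException extends Exception {
        HTMLParserException(String message) { super(message); }
    }

    // The callback interface as proposed in this message.
    interface HTMPParserFeedback {
        void info(String message);
        void warning(String message);
        void error(String message, HTMLParserException e);
    }

    // Hypothetical implementation: routes all feedback to a caller-supplied stream.
    static class StreamFeedback implements HTMPParserFeedback {
        private final PrintStream out;

        StreamFeedback(PrintStream out) { this.out = out; }

        public void info(String message) { out.println("INFO: " + message); }

        public void warning(String message) { out.println("WARNING: " + message); }

        public void error(String message, HTMLParserException e) {
            out.println("ERROR: " + message);
            e.printStackTrace(out); // stack trace goes to the same stream, not the console
        }
    }

    public static void main(String[] args) {
        // Capture feedback in memory; a FileOutputStream would work the same way.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        HTMPParserFeedback feedback = new StreamFeedback(new PrintStream(buffer, true));

        feedback.info("Ready to parse");
        feedback.error("Could not read resource", new HTMLParserException("not found"));

        // The application decides what to do with the captured output.
        System.out.print(buffer.toString());
    }
}
```

Because the stream is passed in by the caller, nothing reaches the console unless the application chooses to send it there.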
To simplify usage, you can use a Factory/Manager class with static methods, like:

    public class FeedbackManager {
        // Default to console output so calls made before setParserFeedback() don't fail.
        protected static HTMPParserFeedback callback = new DefaultHTMPParserFeedback();

        public static void setParserFeedback(HTMPParserFeedback feedback) {
            callback = feedback;
        }
        public static void info(String message) {
            callback.info(message);
        }
        public static void warning(String message) {
            callback.warning(message);
        }
        public static void error(String message, HTMLParserException e) {
            callback.error(message, e);
        }
    }

In practice, the inline code/usage looks like this:

    ...
    // General feedback
    methodCall();
    FeedbackManager.info("Ready to perform some action");
    anotherMethodCall();
    FeedbackManager.info("Completed some action");
    ...
    // Non-fatal exception
    try {
        possibleNonFatalCall();
    } catch (NonFatalException e) {
        FeedbackManager.error("more specific description of problem, in context", e);
    }
    ...
    // Fatal exception
    try {
        possibleFatalCall();
    } catch (FatalException e) {
        throw new HTMLParserException("Fatal call description", e);
    }
    ...

Not sure if this is helpful, but it's a strategy that's worked incredibly well for me, ultimately usable by both end users and developers who are working on the library itself.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Monday, June 24, 2002 6:27 PM
To: htm...@li...
Subject: Re: [Htmlparser-user] Testing/feedback, question

Dear Claude,

>We have (my company) processed about 11 million HTML documents
>successfully (with the Swing parser), some of which we'll see tested
>again with the HTMLParser code in the next few weeks.

Great - this will be a great service to this project and its community. Thank you very much.

>To date, we have only run a few simple tests with the HTMLParser code
>but it appears now that the library is writing to standard err. I would
>expect all errors to result in parser-specific exceptions that the
>calling application would be free to handle as it may see fit.

Hmm..
although I agree with this, I have a question - what do you see being written to standard err? My understanding is that, when the parser crashes, it usually throws an exception all the way up - so if you wrap your parsing block (the for loop) in a try-catch and look for a simple exception, you would be able to catch it.

>Some of the data we are processing is not publicly available. The errors
>we have seen are issues with very large HTML files that were generated
>from log files. These are surprisingly common but offer a special
>challenge to HTML parsers in that they tend to contain large strings of
>log file information between <pre></pre> tags.

Sounds interesting. Even if we can't get the data that you tested with, we could simulate an equivalent testcase...

>We'll probably be running about 1 or 2 million files through the parser
>this week. I will try to report problems and get set up to build the
>library so that I can offer more specific class/line-based
>feedback/fixes.

Cool. Looking forward to it.

Cheers,
Somik