[Htmlparser-developer] Fw: [Htmlparser-user] Testing/feedback, question
From: Somik R. <so...@ya...> - 2002-06-26 01:38:54
----- Original Message -----
From: Claude Duguay
To: htm...@li...
Sent: Tuesday, June 25, 2002 6:38 AM
Subject: [Htmlparser-user] Testing/feedback, question

I've just started using the HTMLParser and hope it will provide better throughput and reliability than the Swing HTML parser. I also hope to offer bug fixes and enhancements back to the community.

My company has successfully processed about 11 million HTML documents with the Swing parser, some of which we'll test again with the HTMLParser code in the next few weeks.

To date, we have only run a few simple tests with the HTMLParser code, but it appears that the library is writing to standard error. I would expect all errors to result in parser-specific exceptions that the calling application is free to handle as it sees fit.

Some of the data we are processing is not publicly available. The errors we have seen involve very large HTML files that were generated from log files. These are surprisingly common but pose a special challenge to HTML parsers because they tend to contain large strings of log file information between <pre></pre> tags.

We'll probably be running about 1 or 2 million files through the parser this week. I will try to report problems and get set up to build the library so that I can offer more specific class/line-based feedback and fixes.

Thanks.