Thread: RE: [Htmlparser-user] Testing/feedback, question
From: Claude D. <CD...@ar...> - 2002-06-25 16:19:24
Looks like the output is on System.err:

com\kizna\html\HTMLParser.java, line 311:
    System.err.println("Error! File "+resourceLocn+" not found!");
com\kizna\html\HTMLParser.java, line 315:
    System.err.println("Error! URL "+resourceLocn+" Malformed!");
com\kizna\html\HTMLParser.java, line 319:
    System.err.println("I/O Exception occured while reading "+resourceLocn);

This is all in version 1.1 (I need to use a production release, so that's what I've been testing so far). I'll check out the latest integration build to see if the same problems exist.

I'd like to hear your views on writing to the console. In production releases this should never happen. If necessary, I'd encourage the use of a reporting callback, something like the SAX ErrorHandler interface, say:

    public interface HTMPParserFeedback {
        public void info(String message);
        public void warning(String message);
        public void error(String message, HTMLParserException e);
    }

With a DefaultHTMPParserFeedback implementation that goes to the console:

    public class DefaultHTMPParserFeedback implements HTMPParserFeedback {
        public void info(String message) {
            System.out.println("INFO: " + message);
        }
        public void warning(String message) {
            System.out.println("WARNING: " + message);
        }
        public void error(String message, HTMLParserException e) {
            System.out.println("ERROR: " + message);
            e.printStackTrace();
        }
    }

This approach is especially conducive to user configuration and helps eliminate errant output to the console. You can also add a debug(String message) method here. The callback enables the user of the library to send relevant output to any or all of: the console, log files, streams, etc. Of course, fatal errors should still be thrown back up the calling tree as exceptions, but non-fatal errors can easily be reported this way.

To simplify usage, you can use a Factory/Manager class with static methods, like:

    public class FeedbackManager {
        // Default to the console so that callers never hit a null callback.
        protected static HTMPParserFeedback callback = new DefaultHTMPParserFeedback();

        public static void setParserFeedback(HTMPParserFeedback feedback) {
            callback = feedback;
        }
        public static void info(String message) {
            callback.info(message);
        }
        public static void warning(String message) {
            callback.warning(message);
        }
        public static void error(String message, HTMLParserException e) {
            callback.error(message, e);
        }
    }

In practice, the inline usage looks like this:

    ...
    // General feedback
    methodCall();
    FeedbackManager.info("Ready to perform some action");
    anotherMethodCall();
    FeedbackManager.info("Completed some action");
    ...
    // Non-fatal exception
    try {
        possibleNonFatalCall();
    } catch (NonFatalException e) {
        FeedbackManager.error("more specific description of problem, in context", e);
    }
    ...
    // Fatal exception
    try {
        possibleFatalCall();
    } catch (FatalException e) {
        throw new HTMLParserException("Fatal call description", e);
    }
    ...

Not sure if this is helpful, but it's a strategy that has worked incredibly well for me, and it is usable both by end users and by developers who are working on the library itself.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Monday, June 24, 2002 6:27 PM
To: htm...@li...
Subject: Re: [Htmlparser-user] Testing/feedback, question

Dear Claude,

> We have (my company) processed about 11 million HTML documents successfully (with the Swing parser), some of which we'll see tested again with the HTMLParser code in the next few weeks.

Great - this will be a great service to this project and its community. Thank you very much.

> To date, we have only run a few simple tests with the HTMLParser code but it appears now that the library is writing to standard err. I would expect all errors to result in parser-specific exceptions that the calling application would be free to handle as it may see fit.

Hmm.. although I agree with this, I have a question - what do you see being written to standard err? My understanding is that, when the parser crashes, it usually throws an exception all the way up - so if you wrap your parsing block (the for loop) in a try-catch and look for a simple exception, you would be able to catch it.

> Some of the data we are processing is not publicly available. The errors we have seen are issues with very large HTML files that were generated from log files. These are surprisingly common but offer a special challenge to HTML parsers in that they tend to contain large strings of log file information between <pre></pre> tags.

Sounds interesting. Even if we can't get the data that you tested with, we could simulate an equivalent testcase...

> We'll probably be running about 1 or 2 million files through the parser this week. I will try to report problems and get set up to build the library so that I can offer more specific class/line-based feedback/fixes.

Cool. Looking forward to it.

Cheers,
Somik

_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user
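A minimal, self-contained sketch of the callback idea from the message above, showing how an application might reroute library feedback away from the console. The names ParserFeedback, ConsoleFeedback, CollectingFeedback and Feedback are hypothetical stand-ins chosen for this illustration, and plain Exception is used in place of a parser-specific exception type so the example compiles on its own; none of this is taken from the HTMLParser sources.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface, modelled on the proposal in the message above.
interface ParserFeedback {
    void info(String message);
    void warning(String message);
    void error(String message, Exception e);
}

// Default behaviour: write everything to the console.
class ConsoleFeedback implements ParserFeedback {
    public void info(String message) { System.out.println("INFO: " + message); }
    public void warning(String message) { System.out.println("WARNING: " + message); }
    public void error(String message, Exception e) {
        System.out.println("ERROR: " + message);
        e.printStackTrace();
    }
}

// Alternative: keep messages in memory so the application decides what to do with them.
class CollectingFeedback implements ParserFeedback {
    final List<String> messages = new ArrayList<String>();
    public void info(String message) { messages.add("INFO: " + message); }
    public void warning(String message) { messages.add("WARNING: " + message); }
    public void error(String message, Exception e) {
        messages.add("ERROR: " + message + " (" + e.getMessage() + ")");
    }
}

// Static entry point used by library code; the callback is never null.
class Feedback {
    private static ParserFeedback callback = new ConsoleFeedback();

    static void setCallback(ParserFeedback feedback) {
        if (feedback == null) {
            throw new IllegalArgumentException("feedback must not be null");
        }
        callback = feedback;
    }
    static void info(String message) { callback.info(message); }
    static void warning(String message) { callback.warning(message); }
    static void error(String message, Exception e) { callback.error(message, e); }
}

class FeedbackDemo {
    public static void main(String[] args) {
        CollectingFeedback collected = new CollectingFeedback();
        Feedback.setCallback(collected); // reroute away from the console
        Feedback.info("Ready to perform some action");
        Feedback.error("Could not read resource", new java.io.IOException("disk error"));
        System.out.println(collected.messages);
    }
}
```

Library code only ever calls the static Feedback methods, so swapping the destination (console, log file, in-memory list) is a single setCallback call in the application.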
From: Claude D. <CD...@ar...> - 2002-06-25 17:57:46
Here are some test results I thought you might be interested in:

Yesterday we ran about 58k files through our conversion process using both the old, Swing-based HTML parser and the new HTMLParser solution. Some of these files are not HTML and are routed to other parsers, but this particular set of files was especially problematic with the Swing parser.

The exact nature of the Swing parser problem is a reallocation of buffer space with too small an increment deep down inside the parser code. In effect, some ungodly low number (4-8) of bytes is allocated each time the string grows, so the whole array is copied again and again as the string gets longer. This is problematic when handling files with large text content between a specific set of tags, such as large log listings between <PRE> tags.

Using the old (Swing) parser, we processed 57952 documents, encountered 67 errors, and ran in 10305 minutes (several days), with an original aggregate file size of 6,252,739,014 bytes and a converted document collection size around 761,653,928 bytes.

Using the new (HTMLParser) parser, we processed 58113 documents, encountered 69 errors, and ran in 294 minutes, with an original aggregate file size of 6,256,488,243 bytes and a converted document collection size around 431,198,296 bytes.

While this is not a conclusive test - there are clearly discrepancies between the two conversion runs that need to be resolved, such as different output size counts, which are attributable to changes we have made - the timing difference is impressive:

Going from 10305 minutes to 294 minutes is just over 35 times faster.

This is mostly attributable to the problematic files in this test set, which took on the order of hours each to process. Yet clearly the HTMLParser solution overcomes a serious bug in the Swing parser (which cannot be patched by anyone but Sun or its Java license holders, given the way the Java license agreement is written).

Note that the same low-level reallocation of string resources in the Swing parser is less problematic in cases where less text is found between each tag, but the performance differences should still be significant taken over a large set of files. I will share what I can as we learn more.
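The cost of growing a buffer by a small fixed increment, as described above, can be illustrated with a short simulation. This is only a sketch under stated assumptions: it models a character buffer that is fully copied whenever it fills up, and compares a fixed increment of 8 with a doubling policy. The class and method names are made up for the illustration and the model is not taken from the Swing parser sources.

```java
// Counts how many characters get copied while a buffer grows to hold n characters,
// appended one at a time, under two growth policies.
class BufferGrowth {

    // Capacity grows by a fixed number of characters each time the buffer fills up.
    static long copiesWithFixedIncrement(long n, int increment) {
        long capacity = increment;
        long copied = 0;
        for (long size = 0; size < n; size++) {
            if (size == capacity) {
                copied += size;        // existing content is copied into the new array
                capacity += increment;
            }
        }
        return copied;
    }

    // Capacity doubles each time the buffer fills up.
    static long copiesWithDoubling(long n, int initialCapacity) {
        long capacity = initialCapacity;
        long copied = 0;
        for (long size = 0; size < n; size++) {
            if (size == capacity) {
                copied += size;
                capacity *= 2;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        long n = 1000000; // one million characters between two tags
        System.out.println("fixed +8: " + copiesWithFixedIncrement(n, 8) + " characters copied");
        System.out.println("doubling: " + copiesWithDoubling(n, 8) + " characters copied");
        // The overall speedup quoted in the message: 10305 minutes versus 294 minutes.
        System.out.println("speedup:  " + (10305.0 / 294.0));
    }
}
```

With a fixed increment the total copying grows with the square of the text length, while doubling keeps it proportional to the length, which is consistent with the observation that only files with very long runs of text between tags were badly affected.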
From: Somik R. <so...@ya...> - 2002-06-26 01:18:16
Dear Claude,

Great mail to read. By the way, as I understand it, you've used v1.1 for these tests. However, I have made some special optimizations in v1.2, particularly to improve scalability. The String node parser now creates only one HTMLStringNode object for continuous text. So if you had 10,000 lines, v1.1 would create 10,000 objects, while v1.2 would create only one. The other scanners have also been optimized.

I think this would result in a substantial improvement in your test results.

By the way, do you think you could write an article about your tests? We could put it up on the HTMLParser page.

Also, send me your SourceForge ID. I'd like to add you as a developer to this project, so that you can check in improvements directly to CVS.

Regards,
Somik
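The difference between one node per line and one node per run of continuous text, as described for v1.1 and v1.2 above, can be sketched as follows. TextNode and TextCoalescing are hypothetical names used only for this illustration; the real HTMLStringNode class and its parser are not reproduced here, and joining lines with a newline is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a text node; not the real HTMLStringNode class.
class TextNode {
    final String text;
    TextNode(String text) { this.text = text; }
}

class TextCoalescing {

    // One node per input line (the behaviour described for v1.1).
    static List<TextNode> nodePerLine(String[] lines) {
        List<TextNode> nodes = new ArrayList<TextNode>();
        for (int i = 0; i < lines.length; i++) {
            nodes.add(new TextNode(lines[i]));
        }
        return nodes;
    }

    // One node for the whole run of continuous text (the behaviour described for v1.2).
    static List<TextNode> singleNode(String[] lines) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                buffer.append('\n'); // assumption: lines are joined with a newline
            }
            buffer.append(lines[i]);
        }
        List<TextNode> nodes = new ArrayList<TextNode>();
        if (lines.length > 0) {
            nodes.add(new TextNode(buffer.toString()));
        }
        return nodes;
    }

    public static void main(String[] args) {
        String[] lines = new String[10000];
        for (int i = 0; i < lines.length; i++) {
            lines[i] = "log entry " + i;
        }
        System.out.println("per line: " + nodePerLine(lines).size() + " nodes");
        System.out.println("coalesced: " + singleNode(lines).size() + " node(s)");
    }
}
```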
From: Claude D. <CD...@ar...> - 2002-06-26 01:40:54
1) I have changed my schedule to accommodate yours to some degree. That is to say, I've observed that your posts are always between 6pm and 2am my time (Seattle) and that your company is in Tokyo. As such, I intend to work between 6 and 9 local time to move this forward for a few days, if I may.

2) My SourceForge ID is 'claudeduguay' but I have never used it (though it's active - I just checked).

3) I'll try to collect what I can as we do these tests by writing to this mailing list. I hope that will at least capture the information. After we've collected enough useful data, I'll be happy to write something up that coalesces as much as possible into a single document for you.

4) I will work from home in the evenings and from the office in the mornings this week, so I'll upgrade the product at work in the morning to use the 1.2 integration release for subsequent testing.

Note that the tests are largely in-product tests for a document converter which is part of a larger system. As such, the timing numbers I provide will be for complete processing, including directory traversals, file conversions and transmission times, though the bulk of this processing is parsing. Transmissions are being done on a LAN so they are very fast, and file traversal time is negligible.
From: Somik R. <so...@ya...> - 2002-06-26 01:49:25
> 1) I have changed my schedule to accommodate yours to some degree. That is to say, I've observed that your posts are always between 6pm and 2am my time (Seattle) and that your company is in Tokyo. As such, I intend to work between 6 and 9 local time to move this forward for a few days, if I may.

Cool! This collaboration will be fun.

> 2) My SourceForge ID is 'claudeduguay' but I have never used it (though it's active - I just checked).

I've added you as a developer. You can check out the code and run the build scripts. Instructions are at http://sourceforge.net/cvs/?group_id=24399

> 3) I'll try to collect what I can as we do these tests by writing to this mailing list. I hope that will at least capture the information. After we've collected enough useful data, I'll be happy to write something up that coalesces as much as possible into a single document for you.

Great.

> 4) I will work from home in the evenings and from the office in the mornings this week, so I'll upgrade the product at work in the morning to use the 1.2 integration release for subsequent testing.

Sounds good.

> Note that the tests are largely in-product tests for a document converter which is part of a larger system. As such, the timing numbers I provide will be for complete processing, including directory traversals, file conversions and transmission times, though the bulk of this processing is parsing. Transmissions are being done on a LAN so they are very fast, and file traversal time is negligible.

Cool.

By the way, could you subscribe to the htmlparser-developer mailing list? We could continue our collaboration on that list, and leave this forum for user questions. http://lists.sourceforge.net/lists/listinfo/htmlparser-developer

Cheers,
Somik
From: Claude D. <CD...@ar...> - 2002-06-26 01:44:19
WRT exception handling vs. feedback: only fatal exceptions should be thrown, and feedback, where you are currently using System.out or System.err, should go through an interface that users can reroute as they prefer (to logs, to the console, or ignored). I have written up the classes and packaged them under the com.kizna.html.util package. I can send these to you in any form you like.

The files are:

HTMLFeedback
DefaultHTMLFeedback
FeedbackManager
HTMLParserException (a chained exception class)

I am debating whether to keep the ChainedException class as a base class for more general use and use an HTMLParserException subclass. Any thoughts?

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tue 6/25/2002 6:20 PM
To: htm...@li...
Subject: Re: [Htmlparser-user] Testing/feedback, question

> Looks like the output is on System.err:
>
> com\kizna\html\HTMLParser.java, line 311: System.err.println("Error! File "+resourceLocn+" not found!");
> com\kizna\html\HTMLParser.java, line 315: System.err.println("Error! URL "+resourceLocn+" Malformed!");
> com\kizna\html\HTMLParser.java, line 319: System.err.println("I/O Exception occured while reading "+resourceLocn);

Oh yes - but these are pretty bad errors, because the parsing can't happen as the resource couldn't be found. Maybe instead of a callback, we should throw exceptions here.

The idea of a callback is also very good - we might be able to use it to report warnings when the parser needs to auto-adjust erroneous tags. But that would add to the weight of the parser.

The issue is: callback vs. exception. In the latter case, the parser does not have to worry about recovery, thus keeping it really compact. However, in the former case, the parser has to have recovery code in case of fatal errors.

What are your thoughts?

Regards,
Somik
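A minimal sketch of the "throw instead of print" option discussed in this exchange: when a resource cannot be opened, parsing cannot proceed, so the failure is reported to the caller as an exception rather than written to System.err. ResourceException and ResourceOpener are hypothetical names invented for this illustration, and opening a local file with FileReader is an assumption; the actual HTMLParser code paths at lines 311-319 are not reproduced here.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.Reader;

// Hypothetical exception type; the real project may name or structure this differently.
class ResourceException extends Exception {
    private final Throwable rootCause;

    ResourceException(String message, Throwable rootCause) {
        super(message);
        this.rootCause = rootCause;
    }

    Throwable getRootCause() { return rootCause; }
}

class ResourceOpener {

    // Fatal problem: without the resource there is nothing to parse, so the caller
    // is told through an exception and decides how to react.
    static Reader open(String resourceLocn) throws ResourceException {
        try {
            return new FileReader(resourceLocn);
        } catch (FileNotFoundException e) {
            throw new ResourceException("File " + resourceLocn + " not found", e);
        }
    }

    public static void main(String[] args) {
        try {
            open("/no/such/dir/missing-file.html").close();
        } catch (ResourceException e) {
            // The application, not the library, chooses where this message goes.
            System.out.println("Could not start parsing: " + e.getMessage());
        } catch (java.io.IOException e) {
            System.out.println("Could not close resource: " + e.getMessage());
        }
    }
}
```

The design trade-off matches the one raised above: the library stays compact because it does not attempt recovery, but every caller now has to handle the checked exception.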
From: Somik R. <so...@ya...> - 2002-06-26 02:01:17
> WRT exception handling vs. feedback: only fatal exceptions should be thrown, and feedback, where you are currently using System.out or System.err, should go through an interface that users can reroute as they prefer (to logs, to the console, or ignored). I have written up the classes and packaged them under the com.kizna.html.util package. I can send these to you in any form you like.

I agree. The existing System.err.println() statements - I think they all indicate fatal errors - should hence be converted to an exception-throwing system.

The callback mechanism should also come in so we can start using it in the rest of the library.

Also - another issue I have been thinking of is SAX compliance. I don't think it will be hard to make callbacks from the parse() method... What do you think?

> The files are:
>
> HTMLFeedback
> DefaultHTMLFeedback
> FeedbackManager
> HTMLParserException (a chained exception class)

You put them in CVS. Do you think it'd be better to have a com.kizna.html.exceptions package instead of util, for better naming conventions?

> I am debating whether to keep the ChainedException class as a base class for more general use and use an HTMLParserException subclass. Any thoughts?

Hmm.. I'd need to see the code before I can comment.

Since you are now going to be a developer, here are two important guidelines (which you might already be following):

[1] All the code that is checked in must come with testcases and should not break existing tests. As of now the parser is almost 100% covered by tests.

[2] The bug fixing strategy is: write a testcase to simulate the bug, make the testcase fail, then fix the bug.

Cheers,
Somik
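One way the ChainedException base class with an HTMLParserException subclass, as debated above, might look. This is a sketch under assumptions, not the code that was checked in: the class bodies, the getRootCause name and the "Caused by (chained):" separator are invented here, and the wrapped exception is kept in an ordinary field so the approach does not depend on the cause support added in Java 1.4.

```java
import java.io.PrintStream;
import java.io.PrintWriter;

// Hypothetical chained exception that keeps the wrapped exception in its own field.
class ChainedException extends Exception {
    private final Throwable rootCause;

    ChainedException(String message) {
        this(message, null);
    }

    ChainedException(String message, Throwable rootCause) {
        super(message);
        this.rootCause = rootCause;
    }

    Throwable getRootCause() { return rootCause; }

    // Print this exception's trace followed by the wrapped one, so no context is lost.
    public void printStackTrace(PrintWriter out) {
        super.printStackTrace(out);
        if (rootCause != null) {
            out.println("Caused by (chained):");
            rootCause.printStackTrace(out);
        }
    }

    // Same behaviour for stream output; the no-argument printStackTrace() ends up here.
    public void printStackTrace(PrintStream out) {
        super.printStackTrace(out);
        if (rootCause != null) {
            out.println("Caused by (chained):");
            rootCause.printStackTrace(out);
        }
    }
}

// Parser-specific subclass; general-purpose code can reuse ChainedException directly.
class HTMLParserException extends ChainedException {
    HTMLParserException(String message, Throwable rootCause) {
        super(message, rootCause);
    }
}

class ChainedExceptionDemo {
    public static void main(String[] args) {
        Exception low = new java.io.IOException("connection reset");
        new HTMLParserException("Could not read page", low).printStackTrace(System.out);
    }
}
```

Keeping the base class separate lets other utilities use chaining without depending on anything parser-specific, which is the argument for the split raised in the thread.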
From: Claude D. <CD...@ar...> - 2002-06-26 01:50:37
1) Point me to something that will tell me how to set up CVS to get an update and I'll try to get set up to check things in.

2) I will add the CommandLine class to the com.kizna.html.util package as well.

3) WRT my question about separating ChainedException from HTMLParserException: I think I will also do this before checking anything in.

To summarize: I will check in the classes I mentioned to the com.kizna.html.util package and then consider (with your feedback) what should be done about routing System.out and System.err (I would propose routing to FeedbackManager.info and throwing HTMLParserExceptions where you are currently using System.err).

Exceptions deserve some attention and some issues are philosophical. Such a change might make existing code incompatible in that an exception would have to be caught by a calling application. This is, however, the correct way to handle fatal exceptions in your code. It is also possible to derive ChainedException from RuntimeException, though this is not typically recommended.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tue 6/25/2002 6:32 PM
To: htm...@li...
Cc: htm...@li...
Subject: Re: [Htmlparser-user] Testing/feedback, question

> 1) There is command line handling and connection-oriented code in HTMLParser. This code should be decoupled. Perhaps an HTMLParserMain class to handle the command line wrapper, keeping the HTMLParser code dedicated to parsing?

Good suggestion. This refactoring should be done.

> 2) Thanks for filling in the toString methods in 1.2. I had noticed most missing in 1.1 and was concerned. While there's room for minor improvement (the use of StringBuffer to build strings and consistent naming conventions), these are minor quibbles. I've found it useful to have a void toString(StringBuffer buffer); method variant in container classes, for building up strings from contained classes more efficiently.

We need to go through a phase of optimization looking at the strings used. The toString(StringBuffer) method also sounds useful.

> 3) I love the existence of the toHTML(); methods.

This was the suggestion of Sam Joseph (it used to be toRawString() in older integration releases). Thanks Sam!

> 4) I see it's now possible to get something by calling getTag. This was missing in 1.1. Thanks.

Hmm.. This method should actually read getTagName().

> 5) I noticed a lot of code in the HTMLTag class which is 'private static'. This suggests the need for an external class to handle this type of work. At a cursory glance, I'm presuming you're functioning as a Finite State Machine (thus the 'automata' prefix)?

Ah yes, I have been thinking of doing this refactoring for a while, and also of refactoring the other finite state machines for strings and remarks.

> Thanks for the big investment. I'd be happy to spend a little time helping with some of the grunt work. If you think the use of the Callback mechanism is good, for example, I could replace all the System.out and System.err for you and send you the code.

You are most welcome to join us - as I mentioned, I'd be happy to add you as a developer.

> 6) I noticed that you don't have a custom exception class. I have code kicking around that implements chained exceptions (as in Java 1.4) but is compatible with earlier Java versions. Chained exceptions are incredibly useful for wrapping underlying exceptions into higher-level exceptions while retaining the stack trace. This results in highly usable libraries because it provides suitable high-level explanations of a problem, while retaining lower-level context.

Sounds like a great idea. Please go ahead and add it to the CVS version.

> 7) I also have a very simple but versatile command line handler class that you can use if you like. It lets you retrieve arguments as either flags or parameter-followed options, single or multiple letter commands, order-dependent, etc. While simple, this is one of those classes that nobody should live without ;-).

It would be good to have this in the parser. Great to have you on board!

Cheers,
Somik
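The toString(StringBuffer) variant for container classes mentioned in this exchange could be sketched like this. Attribute and Tag are hypothetical classes made up for the illustration and are not the HTMLParser tag classes; the point is only that a container passes one shared buffer down to its children instead of concatenating the strings each child returns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical leaf class.
class Attribute {
    private final String name;
    private final String value;

    Attribute(String name, String value) {
        this.name = name;
        this.value = value;
    }

    // Append this attribute to a buffer owned by the caller.
    void toString(StringBuffer buffer) {
        buffer.append(name).append("=\"").append(value).append('"');
    }

    public String toString() {
        StringBuffer buffer = new StringBuffer();
        toString(buffer);
        return buffer.toString();
    }
}

// Hypothetical container class that reuses one buffer for all of its children.
class Tag {
    private final String name;
    private final List<Attribute> attributes = new ArrayList<Attribute>();

    Tag(String name) { this.name = name; }

    Tag add(Attribute attribute) {
        attributes.add(attribute);
        return this;
    }

    void toString(StringBuffer buffer) {
        buffer.append('<').append(name);
        for (Attribute attribute : attributes) {
            buffer.append(' ');
            attribute.toString(buffer); // no intermediate String per child
        }
        buffer.append('>');
    }

    public String toString() {
        StringBuffer buffer = new StringBuffer();
        toString(buffer);
        return buffer.toString();
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        Tag tag = new Tag("a").add(new Attribute("href", "index.html"));
        System.out.println(tag);
    }
}
```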
From: Somik R. <so...@ya...> - 2002-06-26 02:05:56
> 1) Point me to something that will tell me how to set up CVS to get an update and I'll try to get set up to check things in.

From your other mail, it seems you got CVS to work. You definitely need SSH to check in code - http://cdx.sourceforge.net/win-HOWTO.htm

I was using Tortoise CVS earlier - it's important that you make a checkout once using SSH from your DOS shell. Then you can continue to update and commit using Tortoise CVS.

The better and more elegant option is to use Eclipse - the great free Open Source IDE supported by IBM. It interfaces very cleanly with CVS and SSH (extssh), and you don't need to set up anything.

Let's continue our tech discussions on the developer list.

Cheers,
Somik
From: Claude D. <CD...@ar...> - 2002-06-26 02:02:56
|
I have successfully checked out the code from CVS using my SourceForge ID but was unable to check anything in. I presume it takes some time for the change you made to propagate, or should I be looking into an SSH problem? I am operating under Windows XP with Tortoise as a CVS tool. I can probably set up a tunnel using SSH but I haven't tried that yet. Let me know if you think I should just wait and see if your permission change just needs to propagate. Thanks.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tue 6/25/2002 6:44 PM
To: htm...@li...
Cc:
Subject: Re: [Htmlparser-user] Testing/feedback, question

> 1) I have changed my schedule to accommodate yours to some degree. That is to say, I've observed that your posts are always between 6pm and 2am my time (Seattle) and that your company is in Tokyo. As such, I intend to work between 6 and 9 local time to move this forward for a few days, if I may.

Cool! This collaboration will be fun.

> 2) My SourceForge ID is 'claudeduguay' but I have never used it (though it's active - I just checked).

I've added you as a developer. You can check out the code, and run the build scripts. Instructions are at http://sourceforge.net/cvs/?group_id=24399

> 3) I'll try to collect up what I can as we do these tests by writing to this mailing list. I hope that will at least capture the information. After we've collected enough useful data, I'll be happy to write something up that coalesces as much as possible into a single document for you.

Great.

> 4) I will work from home in the evenings and from the office in the mornings this week, so I'll upgrade the product at work in the morning to use the 1.2 integration release for subsequent testing.

Sounds good.

> Note that the tests are largely in-product tests for a document converter which is part of a larger system. As such, the timing numbers I provide will be for complete processing, including directory traversals, file conversions and transmission times, though the bulk of this processing is parsing. Transmissions are being done on a LAN so they are very fast and file traversal time is negligible.

Cool.

By the way, could you subscribe to the htmlparser-developer mailing list? We could continue our collaboration on that list, and leave this forum for user questions. http://lists.sourceforge.net/lists/listinfo/htmlparser-developer

Cheers,
Somik |
From: Claude D. <CD...@ar...> - 2002-06-26 02:13:34
|
1) SAX-compliance is a big feature question. What do you want to support, the ContentHandler interface? You might also consider the notion of using InputSource objects to fetch the stream (actually a Reader) to process. I started doing some tinkering with this for an 'XML Magazine' article (I write a column for both JavaPro and XML Magazine). I think I could offer a useful class that you could build on for that...

2) Re: Test cases. You need to know that I am offering time this week for business reasons. Part of my objective is to ensure that the library does not have any shortcomings that would impact my company's project. I believe this to be mutually beneficial in that any such enhancements should be given freely to the HTMLParser community. I cannot, unfortunately, commit much time, so you'll have to treat me as a limited resource.

You will find that I work quickly and produce good quality code. My documentation is also good, but my test cases are few and far between. I hope you can help me compensate for this weakness, though I fully understand the need for test cases.

3) Re: package to put these classes in. There's only one HTMPParserException and a parent ChainedException class. It doesn't seem worth creating a new package for that. You could always migrate it later if you start subclassing too many exceptions but I doubt that becomes a problem with this project.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tue 6/25/2002 6:56 PM
To: htm...@li...
Cc: htm...@li...
Subject: Re: [Htmlparser-user] Testing/feedback, question

> WRT exception handling vs. feedback, only fatal exceptions should be thrown and feedback, where you are currently using System.out or System.err, should go through an interface that users can reroute as they might prefer (to logs, console or ignore them). I have written up the classes and packaged them under the com.kizna.html.util package. I can send these to you in any form you like.

I agree. The existing System.err.println() statements - I think they all indicate fatal errors - hence should be converted to an exception throwing system.

The Callback mechanism should also come in so we can start using it in the rest of the library.

Also - another issue I have been thinking of is SAX compliance. I don't think it will be hard to make callbacks from the parse() method... What do you think?

> The files are:
>
> HTMLFeedback
> DefaultHTMLFeedback
> FeedbackManager
> HTMLParserException (a chained exception class).

You put them in CVS. Do you think it'd be better to have a com.kizna.html.exceptions package instead of util, for better naming conventions?

> I am debating whether to keep the ChainedException class as a base class for more general use and use an HTMLParserException subclass. Any thoughts?

Hmm.. I'd need to see the code before I can comment.

Since you are now going to be a developer - here are two important guidelines (which you might be already following):

[1] All the code that is checked in must come with testcases and should not break existing tests. As of now the parser is almost 100% covered by tests.

[2] The bug fixing strategy is - write a testcase to simulate the bug, make the testcase fail, then fix the bug.

Cheers,
Somik |
From: Somik R. <so...@ya...> - 2002-06-26 10:16:43
|
> 1) SAX-compliance is a big feature question. What do you want to support, the ContentHandler interface? You might also consider the notion of using InputSource objects to fetch the stream (actually a Reader) to process.

Actually I was referring to the latter, not the ContentHandler. The idea is to swap an XMLParser with the HTMLParser without changing anything else - so your ContentHandler can remain the same.

> I started doing some tinkering with this for an 'XML Magazine' article (I write a column for both JavaPro and XML Magazine). I think I could offer a useful class that you could build on for that...

Sounds good. Do go ahead.

> 2) Re: Test cases. You need to know that I am offering time this week for business reasons. Part of my objective is to ensure that the library does not have any shortcomings that would impact my company's project. I believe this to be mutually beneficial in that any such enhancements should be given freely to the HTMLParser community. I cannot, unfortunately, commit much time, so you'll have to treat me as a limited resource.

Fair enough. It's really nice of you to offer improvements back to the project. We all work on this in a limited fashion - that's the OS movement :)

> You will find that I work quickly and produce good quality code. My documentation is also good, but my test cases are few and far between. I hope you can help me compensate for this weakness, though I fully understand the need for test cases.

I have no doubt that you produce good quality code; the problem is that others (including myself) may not - and if your code doesn't have its own testcase, then someone could break it and never find out. To begin with, I can put whatever free time I have into writing test cases for your code - but that would be kind of slow. But don't let that stop you - it would be great to have your contributions.

Cheers,
Somik |
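A minimal sketch of the swap described in this message, assuming the HTML parser were exposed through the standard org.xml.sax.XMLReader interface (the thread does not say that it is). The client code depends only on XMLReader, InputSource and a ContentHandler, so the JDK's own XML parser stands in as the reader here; SaxSwapSketch, TagCounter and countElements are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SaxSwapSketch {

    /** A ContentHandler that stays the same whichever reader feeds it. */
    static class TagCounter extends DefaultHandler {
        int elements;

        public void startElement(String uri, String localName,
                                 String qName, Attributes attributes) {
            elements++;
        }
    }

    /** Client code: knows only the SAX interfaces, not the concrete parser. */
    static int countElements(XMLReader reader, String markup) throws Exception {
        TagCounter counter = new TagCounter();
        reader.setContentHandler(counter);
        // InputSource wraps a Reader, as suggested in the quoted point 1.
        reader.parse(new InputSource(new StringReader(markup)));
        return counter.elements;
    }

    public static void main(String[] args) throws Exception {
        // The XML side of the swap; an HTML-aware XMLReader would be passed
        // to countElements in exactly the same way.
        XMLReader xmlReader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(
                countElements(xmlReader, "<html><body><p>hi</p></body></html>"));
    }
}
```

Because countElements takes the reader as a parameter, replacing the XML parser with an HTML one would not touch the handler or the calling code.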
From: Claude D. <CD...@ar...> - 2002-06-26 02:15:47
|
Oh sure, now I have to subscribe to another list ;-). Sorry for all the noise on the user list. I'd forgotten this was intended for users. Sorry folks!

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Tue 6/25/2002 7:00 PM
To: htm...@li...
Cc: htm...@li...
Subject: Re: [Htmlparser-user] Testing/feedback, question

> 1) Point me to something that will tell me how to set up CVS to get an update and I'll try to get set up to check things in.

From your other mail, it seems you got CVS to work. You definitely need SSH to check in code - http://cdx.sourceforge.net/win-HOWTO.htm

I was using Tortoise CVS earlier - it's important that you make a checkout once using SSH from your DOS shell. Then you can continue to update and commit using Tortoise CVS.

The better and more elegant option is to use Eclipse - the great free Open Source IDE supported by IBM - it interfaces very cleanly with CVS and SSH (extssh), and you don't need to set up anything.

Let's continue our tech discussions on the developer list.

Cheers,
Somik |
From: Somik R. <so...@ya...> - 2002-06-26 01:25:54
|
> Looks like the output is on System.out:
> com\kizna\html\HTMLParser.java" Line 311: System.err.println("Error! File "+resourceLocn+" not found!");
> com\kizna\html\HTMLParser.java" Line 315: System.err.println("Error! URL "+resourceLocn+" Malformed!");
> com\kizna\html\HTMLParser.java" Line 319: System.err.println("I/O Exception occured while reading "+resourceLocn);

Oh yes - but these are pretty bad errors, because the parsing can't happen if the resource couldn't be found. Maybe instead of a callback, we should throw exceptions here.

The idea of a callback is also very good - we might be able to use it to report warnings when the parser needs to auto-adjust erroneous tags. But that would add to the weight of the parser.

The issue is: callback vs. exception. In the latter case, the parser does not have to worry about recovery, thus keeping it really compact. However, in the former case, the parser has to have recovery code in case of fatal errors.

What are your thoughts?

Regards,
Somik |
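One way to read the callback-versus-exception question is as a split by severity: problems the parser can repair go to a feedback callback, and problems that stop parsing are thrown. The sketch below is only an illustration of that split under assumed names. HTMLFeedback and HTMLParserException are class names mentioned in this thread, but the signatures, CollectingFeedback and the toy readTag step are hypothetical and are not the library's code.

```java
import java.util.ArrayList;
import java.util.List;

class FeedbackVersusException {

    /** Hypothetical callback for non-fatal problems. */
    interface HTMLFeedback {
        void warning(String message);
    }

    /** Hypothetical exception for fatal problems. */
    static class HTMLParserException extends Exception {
        HTMLParserException(String message) {
            super(message);
        }
    }

    /** Feedback implementation that collects warnings instead of printing. */
    static class CollectingFeedback implements HTMLFeedback {
        final List<String> warnings = new ArrayList<String>();

        public void warning(String message) {
            warnings.add(message);
        }
    }

    /** Toy parser step: repairs an unterminated tag, rejects empty input. */
    static String readTag(String tag, HTMLFeedback feedback)
            throws HTMLParserException {
        if (tag == null || tag.length() == 0) {
            // Fatal: nothing to recover from, so no recovery code is needed.
            throw new HTMLParserException("Nothing to parse");
        }
        if (!tag.endsWith(">")) {
            // Recoverable: adjust the tag and report it through the callback.
            feedback.warning("Unterminated tag '" + tag + "', appending '>'");
            return tag + ">";
        }
        return tag;
    }

    public static void main(String[] args) {
        CollectingFeedback feedback = new CollectingFeedback();
        try {
            System.out.println(readTag("<a href=x", feedback));
            System.out.println(feedback.warnings);
            readTag("", feedback);
        } catch (HTMLParserException e) {
            System.out.println("Fatal: " + e.getMessage());
        }
    }
}
```

With this split the recovery logic lives only on the warning path, which keeps the fatal path as compact as the exception-only design.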