RE: [Htmlparser-user] Final Statistics from Trek Run
From: Claude D. <CD...@ar...> - 2002-07-11 16:29:59
The SWT is not a contender for replacing Swing. It may be an alternative, applicable in many circumstances, but a quick look at Sun's Swing Connection should dissuade you from assuming that few people are using Swing.

I would not endorse trying to make HTMLParser Swing-compatible. These are different animals and should stay that way.

The notion of providing a SAX-like interface is interesting, but you should look instead toward XML pull-parsers, which are the high-performance alternatives now surfacing more widely. There is a JSR (http://www.jcp.org/jsr/detail/173.jsp) that is trying to unify a good interface for pull-parsing (they're calling it a Streaming API). You'll find this link especially interesting (http://www.xmlpull.org/).

HTMLParser has two fundamental strengths. 1) It's easy to use and extend. 2) It's lightning fast.

Don't lose sight of these distinctions. The whole XML community is struggling to achieve these goals and hasn't quite gotten there yet. There's much to learn from XML, but they are largely moving in this direction.

BTW: JTidy is a serious performance bottleneck in a high-performance application.

-----Original Message-----
From: Somik Raha [mailto:so...@ya...]
Sent: Thursday, July 11, 2002 2:25 AM
To: htm...@li...
Subject: Re: [Htmlparser-user] Final Statistics from Trek Run

Hi Craig,

> For example, the renderer built into Swing's JEditorPane expects callbacks resulting from well-formed HTML with certain (sometimes arbitrary) characteristics. (For example, a <head><title>X</title></head> section must exist, and X cannot be null). It is possible that the formatting of the input HTML into a structure with these characteristics reduces the parser's performance in order to produce a better render.

Indeed - perhaps a good idea would be to rewrite JEditorPane :) - make an open source version, which is better designed. Swing compatibility is a real pain - we gave up on that not so far back :).
On the other hand, I was thinking that SAX compliance would be feasible and worth it - I doubt if many people are considering Swing for graphics these days, especially with the SWT being out there. But the SAX mechanism is quite popular and it's worth being able to just switch parsers.

> Of course, whether you need to take these considerations into account depends entirely on your application. The htmlparser seems to lean more toward the extraction of information rather than its representation, and the latter is so fraught with ambiguities as to make it a task of a different order altogether.

So true. Like you had mailed some time back, JTidy does a good job of that.

Regards,
Somik

----- Original Message -----
From: Craig Raw <mailto:cr...@qu...>
To: htm...@li...
Sent: Thursday, July 11, 2002 5:35 PM
Subject: [Htmlparser-user] RE: [Htmlparser-developer] Final Statistics from Trek Run

Just a point to notice on these tests. The htmlparser, for all its merits, is not a direct functional replacement for the Swing parser.

For example, the renderer built into Swing's JEditorPane expects callbacks resulting from well-formed HTML with certain (sometimes arbitrary) characteristics. (For example, a <head><title>X</title></head> section must exist, and X cannot be null). It is possible that the formatting of the input HTML into a structure with these characteristics reduces the parser's performance in order to produce a better render.

Of course, whether you need to take these considerations into account depends entirely on your application. The htmlparser seems to lean more toward the extraction of information rather than its representation, and the latter is so fraught with ambiguities as to make it a task of a different order altogether.

-craig

-----Original Message-----
From: htm...@li... [mailto:htm...@li...] On Behalf Of Somik Raha
Sent: 11 July 2002 02:19 AM
To: htm...@li...; htm...@li...
Subject: Re: [Htmlparser-user] RE: [Htmlparser-developer] Final Statistics from Trek Run

Hi Claude,

Thanks a ton for all these tests. Do you think you could write an article on this that we could put up?

Regards
Somik

_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user