Re: [Htmlparser-user] Final Statistics from Trek Run
From: Somik R. <so...@ya...> - 2002-07-11 09:31:42
Hi Craig,

> For example, the renderer built into Swing's JEditorPane expects callbacks resulting from well-formed HTML with certain (sometimes arbitrary) characteristics. (For example, a <head><title>X</title></head> section must exist, and X cannot be null). It is possible that the formatting of the input HTML into a structure with these characteristics reduces the parser's performance in order to produce a better render.

Indeed - perhaps a good idea would be to rewrite JEditorPane :) - make an open source version that is better designed. Swing compatibility is a real pain - we gave up on that not so long ago :). On the other hand, I was thinking that SAX compliance would be feasible and worth it - I doubt many people are considering Swing for graphics these days, especially with SWT out there. But the SAX mechanism is quite popular, and it's worth being able to just switch parsers.

> Of course, whether you need to take these considerations into account depends entirely on your application. The htmlparser seems to lean more toward the extraction of information rather than its representation, and the latter is so fraught with ambiguities as to make it a task of a different order altogether.

So true. As you mentioned in a mail some time back, JTidy does a good job of that.

Regards,
Somik

----- Original Message -----
From: Craig Raw
To: htm...@li...
Sent: Thursday, July 11, 2002 5:35 PM
Subject: [Htmlparser-user] RE: [Htmlparser-developer] Final Statistics from Trek Run

Just a point to notice on these tests. The htmlparser, for all its merits, is not a direct functional replacement for the Swing parser.

For example, the renderer built into Swing's JEditorPane expects callbacks resulting from well-formed HTML with certain (sometimes arbitrary) characteristics. (For example, a <head><title>X</title></head> section must exist, and X cannot be null).

It is possible that the formatting of the input HTML into a structure with these characteristics reduces the parser's performance in order to produce a better render.

Of course, whether you need to take these considerations into account depends entirely on your application. The htmlparser seems to lean more toward the extraction of information rather than its representation, and the latter is so fraught with ambiguities as to make it a task of a different order altogether.

-craig

-----Original Message-----
From: htm...@li... [mailto:htm...@li...] On Behalf Of Somik Raha
Sent: 11 July 2002 02:19 AM
To: htm...@li...; htm...@li...
Subject: Re: [Htmlparser-user] RE: [Htmlparser-developer] Final Statistics from Trek Run

Hi Claude,

Thanks a ton for all these tests. Do you think you could write an article on this that we could put up?

Regards
Somik
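
For readers wondering what "being able to just switch parsers" could look like in practice, here is a minimal sketch. It is not htmlparser code: it uses the SAX parser bundled with the JDK as a stand-in, and the names SaxSwapSketch, TitleHandler and parse are hypothetical. The assumption is that any parser exposing the standard org.xml.sax.XMLReader interface could be passed to parse() without touching the handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSwapSketch {

    /** Hypothetical handler: records element names and the text inside <title>. */
    public static class TitleHandler extends DefaultHandler {
        public final List<String> elements = new ArrayList<>();
        public final StringBuilder title = new StringBuilder();
        private boolean inTitle;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements.add(qName);
            if ("title".equalsIgnoreCase(qName)) {
                inTitle = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equalsIgnoreCase(qName)) {
                inTitle = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inTitle) {
                title.append(ch, start, length);
            }
        }
    }

    /**
     * Runs the handler against whatever XMLReader is supplied.
     * Only SAX interfaces are referenced here, so the reader is swappable.
     */
    public static TitleHandler parse(XMLReader reader, String markup) throws Exception {
        TitleHandler handler = new TitleHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in parser: the JDK's default (XML) SAX implementation,
        // so the sample input must be well-formed markup.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        TitleHandler h = parse(reader, "<html><head><title>X</title></head><body/></html>");
        System.out.println("title=" + h.title + " elements=" + h.elements);
    }
}
```

The handler never names a concrete parser class, which is the point Somik makes about SAX: swapping implementations would only change the line that obtains the XMLReader. Note that the JDK parser used here rejects markup that is not well-formed, so this says nothing about how a lenient HTML parser would behave.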