Re: [Htmlparser-user] Change in Layout
From: Somik R. <so...@ya...> - 2002-08-08 07:10:24
Hi Dhaval,

This is actually a feature. If we tried to reproduce exactly the same output as originally parsed, the performance of the parser could be compromised. Hence we chose to give a corresponding output with slightly different formatting, in order to keep the design of the parser simple.

However, related to this is an interesting issue for which community feedback would be valuable. Currently, the formatting of toHTML() is rather arbitrary (in my opinion). By this I am particularly referring to the usage of end-of-line characters. Considering that end-of-line characters differ for each operating system, would it be a good idea to replace the hard-coded end-of-line characters with the detected end-of-line character for the particular OS?

Regards,
Somik

----- Original Message -----
From: dha...@or...
To: htm...@li...
Sent: Thursday, August 08, 2002 3:52 PM
Subject: [Htmlparser-user] Change in Layout

Hi,

I have an HTML page which I am trying to modify. During this process, I have come across a quirk. I don't know whether the problem is browser related or parser related. The following HTML code:

<TD align="left" valign="top" width="18"><img src="images/right_h1.gif" width="18" height="22"></TD>

gets converted to

<TD align="left" valign="top" width="18">
<img src="images/right_h1.gif" width="18" height="22">
</TD>

This happens whenever I print back the parsed data using tag.toHTML(). These two seem to be the same, but presentation-wise I see different outputs. Is it right on the part of tag.toHTML() to print out the EOL character at the end of the tag?

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-8290019 Extn. 1457

-----Original Message-----
From: somik [mailto:so...@ya...]
Sent: Wednesday, August 07, 2002 10:26 AM
To: htmlparser-user
Cc: somik; htmlparser-developer
Subject: Re: [Htmlparser-user] Another Ill-Formed Example

Hi Claude,

This has been handled, related to the earlier fix. All potential infinite loops have been removed, and there will be no more hangs, only HTMLParserExceptions from now on. There will be a release with all these fixes this weekend.

Regards,
Somik

----- Original Message -----
From: Claude Duguay
To: htm...@li...
Sent: Wednesday, August 07, 2002 3:35 AM
Subject: [Htmlparser-user] Another Ill-Formed Example

Here's some markup we found in another document that causes the HTMLParser to hang.

"<TITLE>KRP VALIDATION<PROCESS/TITLE>"

So far, we've had 4 documents cause our process to come to a grinding halt. I would much prefer a policy of exception throwing to hangs asap, followed by consideration of whether unusual markup can be handled more elegantly in a subsequent phase.

Thanks to everyone, as always.
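On the end-of-line question raised in the reply at the top of this thread, here is a minimal sketch of what "use the detected end-of-line character" could look like in Java. It does not touch the parser: the class `EolSketch` and the method `withPlatformEol` are hypothetical names invented for this illustration, not part of HTMLParser, and the sketch simply post-processes a string such as one returned by toHTML(). The only library call it relies on is the standard `System.getProperty("line.separator")`.

```java
// Hypothetical helper for illustration only; not part of the HTMLParser API.
final class EolSketch {

    // Rewrites every line break in the input (\r\n, \r or \n)
    // to the line separator reported by the running JVM.
    static String withPlatformEol(String html) {
        String eol = System.getProperty("line.separator");
        // Normalise to "\n" first so that "\r\n" is not counted twice.
        String normalised = html.replace("\r\n", "\n").replace("\r", "\n");
        return normalised.replace("\n", eol);
    }

    public static void main(String[] args) {
        // Stand-in for the output of tag.toHTML() with hard-coded line breaks.
        String rendered = "<TD align=\"left\">\n<img src=\"images/right_h1.gif\">\n</TD>";
        System.out.print(withPlatformEol(rendered));
    }
}
```

Note that this only changes which end-of-line sequence is emitted. It would not address the layout difference Dhaval reports, which presumably comes from the line breaks being added between the tags at all, since a browser may render such whitespace as a gap.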