RE: [Htmlparser-user] Change in Layout
From: <dha...@or...> - 2002-08-08 07:37:30
Hi,

I would definitely appreciate replacing the hard-coded end-of-line
character with an end-of-line character detected from the system
property. Currently I read the entire file and replace the hard-coded
EOL with the system property EOL.
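
The workaround described above (reading the whole output and swapping
the hard-coded EOL for the system one) could be sketched roughly as
follows. This is only an illustration, not part of the HTMLParser API:
the class EolNormalizer and its methods are hypothetical names, and it
assumes the text produced by toHTML() has already been collected into a
String. The only platform facility it relies on is the standard
"line.separator" system property.

```java
import java.util.regex.Matcher;

// Hypothetical helper, not part of HTMLParser: rewrites the line breaks in
// already-generated HTML text to a chosen end-of-line sequence.
final class EolNormalizer {

    private EolNormalizer() {}

    // Replaces every line break (\r\n, \r or \n) in text with eol.
    static String normalize(String text, String eol) {
        // "\r\n" is listed first so a Windows break counts as one break, not two.
        return text.replaceAll("\r\n|\r|\n", Matcher.quoteReplacement(eol));
    }

    // Uses the standard "line.separator" system property as the target EOL.
    static String toSystemEol(String text) {
        return normalize(text, System.getProperty("line.separator"));
    }

    public static void main(String[] args) {
        // Stand-in for text produced by tag.toHTML() with a hard-coded "\n".
        String html = "<TD>\n<img src=\"images/right_h1.gif\">\n</TD>\n";
        System.out.print(toSystemEol(html));
    }
}
```

Note that this only changes which EOL sequence is written. It does not
remove line breaks that were added during output, so it does not by
itself restore the original layout.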

I think the last EOL emitted by toHTML() should be removed, and instead
every "\n" in the source should be parsed and reproduced exactly as it
appears. Preserving layout should be as important as performance. My
feeling is also that this tool will be used mostly by developers during
development rather than at runtime (though that is always possible), so
performance may not be an issue here.

Please feel free to criticize my opinion.

My predicament is typically as follows:

My team is building a framework which is used by many projects in my
organization. All the other projects create HTML with their own
look-and-feel. To use the framework, they need to convert these files
into JSPs (using a tool developed by my team). Apart from just changing
the extension ;) the tool also adds lots of JSP code and makes certain
modifications to the HTML tags (not the presentation tags though). If
the layout changes after the JSP is created, they will have to spend
time correcting this anomaly again, and will need to keep doing it every
time they change their HTML page or the tool is updated. Now I guess you
can understand why I feel so strongly about maintaining layout.

At the same time I am aware that the parser is here for everyone's needs
and will be driven accordingly. Hence I am just presenting my point of
view.

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-8290019 Extn. 1457

-----Original Message-----
From: somik [mailto:so...@ya...]
Sent: Thursday, August 08, 2002 12:33 PM
To: htmlparser-user
Cc: somik
Subject: Re: [Htmlparser-user] Change in Layout

Hi Dhaval,
    This is actually a feature. If we try to give the exact same
output as originally parsed, the performance of the parser could be
compromised. Hence, giving a corresponding output with slightly
different formatting was chosen - in order to keep the design of the
parser simple.
    However, related to this is an interesting issue - for which
community feedback would be valuable. Currently, the formatting of
toHTML() is rather arbitrary (in my opinion). By this I am
particularly referring to the usage of end-of-line characters.
Considering that end-of-line characters differ for each operating
system - would it be a good idea to replace the hard-coded end-of-line
characters with the detected end-of-line character for a particular
OS?

Regards,
Somik
----- Original Message -----
From: dha...@or...
To: htm...@li...
Sent: Thursday, August 08, 2002 3:52 PM
Subject: [Htmlparser-user] Change in Layout

Hi,

I have an HTML page which I am trying to modify. During this process, I
have come across a quirk. I don't know whether the problem is browser
related or parser related.

The following HTML code:
<TD align="left" valign="top" width="18"><img
src="images/right_h1.gif"
width="18" height="22"></TD>

gets converted to
<TD align="left" valign="top" width="18">
<img src="images/right_h1.gif" width="18" height="22">
</TD>

This happens whenever I print back the parsed data using
tag.toHTML().

These two seem to be the same, but presentation-wise I see different
outputs. Is it right on the part of tag.toHTML() to print out the EOL
character at the end of the tag?

Regards,

Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-8290019 Extn. 1457

   -----Original Message-----
   From: somik [mailto:so...@ya...]
   Sent: Wednesday, August 07, 2002 10:26 AM
   To: htmlparser-user
   Cc: somik; htmlparser-developer
   Subject: Re: [Htmlparser-user] Another Ill-Formed Example

   Hi Claude,
   This has been handled, related to the earlier fix. All potential
   infinite loops have been removed, and there will be no more hangs
   - only HTMLParserExceptions from now on.
   There will be a release having all these fixes this weekend.

   Regards,
   Somik

      ----- Original Message -----
      From: Claude Duguay
      To: htm...@li...
      Sent: Wednesday, August 07, 2002 3:35 AM
      Subject: [Htmlparser-user] Another Ill-Formed Example

      Here's some markup we found in another document that causes the
      HTMLParser to hang.

      "<TITLE>KRP VALIDATION<PROCESS/TITLE>"

      So far, we've had 4 documents cause our process to come to a
      grinding halt. I would much prefer a policy of exception throwing
      to hangs asap, followed by consideration of whether unusual markup
      can be handled more elegantly in a subsequent phase. Thanks to
      everyone, as always.