[Htmlparser-developer] RE: [Htmlparser-user] version 1.5
From: Marc N. <ma...@ke...> - 2004-02-17 18:27:06
Just to clarify -- the library already does most of the things I list below (i.e. I've already implemented them using a semi-current version of HTMLParser). However, I'm listing them here so they may be considered as one of the many use cases for the library.

I also want to commend Derrick for all the work he's put into the project!

Marc

-----Original Message-----
From: Marc Novakowski
Sent: Tuesday, February 17, 2004 10:12 AM
To: htm...@li...; htm...@li...
Subject: RE: [Htmlparser-user] version 1.5

I'm a big fan of server-side transforms. That is, scanning an HTML document and transforming parts of it into custom markup and/or DHTML. I do this using a servlet filter in Tomcat.

I'm currently using an older version of the library (from 08/24/2003) -- before the major code changes were made, mostly because I've been too busy working on other things to port my code to the new APIs. I hope to get to it eventually! :)

However, if you're looking for feedback, then here's what I would find useful in the library. It may or may not already do the following to certain degrees.
But if anything in this list can be made easy (or easier), then I'm all for it:

- scan an HTML page for "custom" XML/HTML tags embedded within the HTML
- maintain both the original HTML and the location of the XML "islands" within it
- provide mechanisms to parse different kinds of custom tags, including the following:
  - very simple tags (like <br>)
  - value-only tags (like <a>value</a>)
  - composite tags (like <ul>)
  - tags that contain "anything", which the parser simply skips over (similar to <script>, but even dumber so that all it looks for is the closing tag)
- APIs that allow the definition of the custom tags (above) without having to create a custom scanner and tag class for each one

For illustrative purposes, here's an example of what some of my custom tags look like:

<html>
<body>
<h2>Here is the chart</h2>
<Component name="myChart" incorporates="Chart">
  <String name="backgroundColor" value="white"/>
  <String name="foregroundColor" value="black"/>
  <Number name="width" value="200"/>
  <Number name="height" value="400"/>
  <Reference name="data" value="dataModel"/>
  <Method name="changeSize">
    <Param name="width"/>
    <Param name="height"/>
    <Impl>
      // This is javascript code
      this.width.set(width);
      this.height.set(height);
      this.render();
    </Impl>
  </Method>
</Component>
<hr>
blah blah .... (more HTML) ....
</body>
</html>

Hope this helps!

Marc

-----Original Message-----
From: Derrick Oswald [mailto:Der...@Ro...]
Sent: Tuesday, February 17, 2004 4:40 AM
To: htm...@li...; htm...@li...
Subject: [Htmlparser-user] version 1.5

Now that version 1.4 is nearly put to bed, it's time to look forward into the future to visualize or 'blue sky' the features that could be incorporated in the next version of the parser.
There are a small number of feature requests that have accumulated over the last few months that can serve as a starting point:
http://sourceforge.net/tracker/?group_id=24399&atid=381402

But what is really required are some real use-cases that aren't addressed by the current parser, which will lead to real requirements, which lead to real features that can be added to the parser for the next version. What does everyone do with the htmlparser that could be built into it? Or more to the point, what capabilities are lacking that cause a developer to *not* use htmlparser and do it themselves some other way?

Does anybody have any ideas? Does anybody have some applications they would like to add to the htmlparser codebase so that 'out-of-the-box' it does what they want? In general, what directions should development take, i.e. HTML correction or editing, XML, robots, server side transforms etc.? Has anybody got some pet peeves they want cleared up?

Come on, give it up. Now's the time.

Derrick

-------------------------------------------------------
SF.Net is sponsored by: Speed Start Your Linux Apps Now.
Build and deploy apps & Web services for Linux with
a free DVD software kit from IBM. Click Now!
http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
_______________________________________________
Htmlparser-user mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-user
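
For readers who want something concrete for the XML "islands" use case in Marc's list (keep the original HTML untouched and record where each custom-tag island sits), here is a minimal sketch. It is written in Python with a plain regular expression, not against HTMLParser's Java API, and it is not how Marc or the library implements this. The names `Island` and `find_islands` are hypothetical. It assumes the custom tag names are known up front, are case-sensitive, and that attribute values never contain `>`.

```python
import re
from typing import List, NamedTuple


class Island(NamedTuple):
    name: str   # name of the outermost custom tag
    start: int  # offset of the '<' that opens the island
    end: int    # offset just past the '>' that closes it


def find_islands(html: str, tag_names: List[str]) -> List[Island]:
    """Return the top-level custom-tag islands; `html` itself is not modified."""
    names = "|".join(re.escape(n) for n in tag_names)
    # Matches <Name ...>, <Name .../> and </Name> for the given names only.
    tag_re = re.compile(r"<(/?)(%s)\b[^>]*?(/?)>" % names)

    islands = []
    depth = 0   # nesting depth of the tracked tags
    start = 0   # where the current island began
    top = ""    # name of the tag that opened the current island

    for m in tag_re.finditer(html):
        closing, name, self_closing = m.group(1), m.group(2), m.group(3)
        if closing:
            if depth == 0:
                continue  # stray closing tag, ignore it
            depth -= 1
            if depth == 0:
                islands.append(Island(top, start, m.end()))
        elif self_closing:
            if depth == 0:
                islands.append(Island(name, m.start(), m.end()))
        else:
            if depth == 0:
                top, start = name, m.start()
            depth += 1
    return islands


if __name__ == "__main__":
    page = (
        "<html><body><h2>Here is the chart</h2>"
        '<Component name="myChart"><Number name="width" value="200"/></Component>'
        "<hr></body></html>"
    )
    for island in find_islands(page, ["Component"]):
        # The original page is intact; the island is just a slice of it.
        print(island.name, island.start, island.end)
        print(page[island.start:island.end])
```

Because only the listed tag names are tracked, everything inside an island (including the JavaScript in `<Impl>`) is skipped over without being parsed, which is close to the "tags that contain anything" case. A real implementation would need a proper lexer to cope with `>` inside attribute values and with tag names appearing inside script text.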