Re: [Htmlparser-user] IFRAME and LINK tags
Brought to you by:
derrickoswald
From: Ian M. <ian...@gm...> - 2006-02-24 21:46:21
Actually, if you read the W3C specs, it really does look like the two would sit quite happily in a single class. In reality, they are semantically the same thing, except one of them is visible and one of them is not.

Please have a look over the specs at http://www.w3.org/TR/REC-html40/struct/links.html again and let me know what you think.

data: and view-source: are just protocols like http:, ftp:, javascript: etc. data: is used to store the file data in the HTML source (so an image can be encoded in a web page and only one file gets served). view-source: just means to open up the source code viewer for the URL rather than the HTML renderer.

Ian

On 24/02/06, Derrick Oswald <Der...@ro...> wrote:
>
> If you want to reuse the LinkTag name it should wait for 1.7 (or 2.0, whatever).
> That would mean an ATag class?
> The boolean seems like overkill.... simplify, simplify, simplify... <A for links, <LINK for anchors.
>
> Sorry, I don't know what you mean by data: and view-source: protocols.
>
> setLink sounds right. The others are legacy stuff that should probably be cleaned out.
>
> rel and rev, yes, adding tag specific methods is exactly what a class for each tag is all about.
>
> Ian Macfarlane wrote:
>
> >This project is still alive, if under slow development. There are still a number of checkins being made fairly often, and we are possibly going to branch for a 1.6 release.
> >
> >The name LinkTag has indeed been taken for anchor tag, but we can't change it now due to backwards compatibility reasons.
> >
> >I think we might want to make LinkTag support <link> tags, and have a boolean method that says if it's an anchor or not. In fact, reading the W3C spec on this (http://www.w3.org/TR/REC-html40/struct/links.html) this seems like it might be the right thing to do.
> >
> >Can I get some feedback from some of the other devs on this? If it seems like a good idea to do it this way?
> >It looks to me like it probably is the best way to do it semantically and practically.
> >
> >Other things that look like they should be done (devs: please shout if you don't want any of this done):
> >
> >- add support for the data: and view-source: protocols
> >- deprecate setMailLink and setJavascriptLink in favour of setLink
> >- add get/set for rel and rev attributes
> >
> >Ian
> >
> >On 23/02/06, Luís Manuel dos Santos Gomes <lui...@gm...> wrote:
> >
> >>Hello,
> >>
> >>I cannot migrate all my work to the C#/.NET platform, although HTML parsing is a core functionality of my project.
> >>I'm coding a crawler to feed our natural language research group with corpus from the web. Currently I'm still evaluating options for the HTML parsing module. I have developed my own HTML scanner based on Java regexps, but it is too difficult to maintain and extend (after all, it can be a project by itself).
> >>
> >>My needs are far beyond the simple link extraction/modification. I must handle every single tag that may reference an external resource (and that includes IFrame). This includes parsing embedded CSS imports. Embedded Javascript is still a problem...
> >>
> >>Anyway, the BIG question is: is this project alive?
> >>I know it is an open source project that is supported by people's free will, and I find that _very_ _meritorious_.
> >>I'm putting this question because I will make a decision now.
> >>
> >>I still would appreciate some feedback on the subject of this thread (the original post follows)
> >>
> >>Luís
> >>
> >>On Feb 15, 2006, at 4:30 PM, Third Eye wrote:
> >>
> >>>Hi!
> >>>We did implement IFrameTag and named the class as IFrameTag. Our implementation is a .Net port of this library and we have added some of our own enhancements.
> >>>If you are interested, you can download it from
> >>>
> >>>http://www.netomatix.com
> >>>
> >>>Naveen
> >>>
> >>>On 2/15/06, Luís Manuel dos Santos Gomes <lui...@gm...> wrote:
> >>>
> >>>>Hi everybody.
> >>>>
> >>>>This is my first post to this list.
> >>>>I'm replacing my own html processing code (regex based) with HTMLParser.
> >>>>The examples have been a great help!
> >>>>
> >>>>I need to handle IFRAME and LINK tags. The link tag is often used to include external CSS.
> >>>>The name "LinkTag" has already been taken for the anchor tags! How should I name the class to handle the LINK tags?
> >>>>Has anybody implemented the IframeTag and the "TrueLinkTag" classes?
> >>>>I could do this and would be glad to contribute it to the project.
> >>>>I'm using the version 20051112. I've not checked out from CVS because I need a stable package.
> >>>>
> >>>>Cheers!
> >>>>
> >>>>Luís Gomes
> >>>>(from Portugal)
> >>>>
> >>>>_______________________________________________
> >>>>Htmlparser-user mailing list
> >>>>Htm...@li...
> >>>>https://lists.sourceforge.net/lists/listinfo/htmlparser-user
> >>>>
> >>>--
> >>>Naveen K Kohli
> >>>http://www.netomatix.com
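The single-class idea debated in this thread (one class for both <a> and <link>, a boolean to tell them apart, get/set for rel and rev, and awareness of schemes such as data: and view-source:) can be sketched roughly as below. This is only an illustration under stated assumptions: LinkSketch, LinkElement, isAnchor() and getScheme() are hypothetical names invented for this sketch, not part of HTMLParser's API, and the code does not use the library at all.

```java
import java.util.Locale;

public class LinkSketch {

    /** Hypothetical stand-in for one class that covers both <a> and <link>. */
    public static final class LinkElement {
        private final String tagName;
        private final String href;
        private String rel = "";
        private String rev = "";

        public LinkElement(String tagName, String href) {
            this.tagName = tagName;
            this.href = href;
        }

        /** True for the visible <a> form, false for the invisible <link> form. */
        public boolean isAnchor() {
            return "a".equalsIgnoreCase(tagName);
        }

        /** Lower-cased URL scheme such as "http", "data" or "view-source"; "" if none. */
        public String getScheme() {
            int colon = href.indexOf(':');
            if (colon <= 0 || !Character.isLetter(href.charAt(0))) {
                return "";
            }
            for (int i = 1; i < colon; i++) {
                char c = href.charAt(i);
                if (!Character.isLetterOrDigit(c) && c != '+' && c != '-' && c != '.') {
                    return ""; // e.g. "/a/b:c" is a relative path, not a scheme
                }
            }
            return href.substring(0, colon).toLowerCase(Locale.ROOT);
        }

        // Tag-specific accessors for the rel and rev attributes.
        public String getRel() { return rel; }
        public void setRel(String rel) { this.rel = rel; }
        public String getRev() { return rev; }
        public void setRev(String rev) { this.rev = rev; }
    }

    public static void main(String[] args) {
        LinkElement css = new LinkElement("LINK", "http://example.com/style.css");
        css.setRel("stylesheet");
        LinkElement src = new LinkElement("a", "view-source:http://example.com/");

        System.out.println(css.isAnchor() + " " + css.getScheme() + " " + css.getRel());
        System.out.println(src.isAnchor() + " " + src.getScheme());
    }
}
```

Whether a boolean on one class or a separate class per tag is the better design is exactly the open question in the thread; the sketch takes no side and only shows what the boolean variant would look like.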