htmlparser-developer Mailing List for HTML Parser (Page 12)
Brought to you by:
derrickoswald
From: <dha...@or...> - 2003-05-13 08:16:20
|
Hi,

I started using HTMLParser version 1.2 sometime in August of last year. At that time the parser had more of a flat structure, unlike today's tree structure with parents, children, etc. Back then, I could find all the nodes irrespective of their depth in the following manner:

    NodeIterator e = lHTMLParser.elements();
    while (e.hasMoreNodes()) {
        Node lNode = (Node) e.nextNode();
        <do something>
    }

With the advent of 1.3 the tree structure came in, in which some nodes are inside other nodes. I have registered the scanners whose tags I want. However, if these tags are nested within other tags that I have registered, the above loop does not find them. I need to go deeper, and that is not always feasible.

Is there any mechanism in 1.3, like the one above, with which I can get all the nodes irrespective of their nesting level?

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-28290019 Extn. 1457 |
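The flat view asked for here can, in general, be recovered from any nested tree with a depth-first walk. The following is only a minimal sketch of that idea. It makes no claim about the HTMLParser 1.3 API: `SketchNode`, `FlattenSketch`, `collectAll` and `flatten` are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed node; this is NOT an HTMLParser class.
class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) {
        this.name = name;
    }

    // Adds a child and returns this node so that calls can be chained.
    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }
}

class FlattenSketch {
    // Depth-first, document-order walk: the node itself first, then its children.
    static void collectAll(SketchNode node, List<String> out) {
        out.add(node.name);
        for (SketchNode child : node.children) {
            collectAll(child, out);
        }
    }

    // Returns the names of every node in the tree, whatever its depth.
    static List<String> flatten(SketchNode root) {
        List<String> out = new ArrayList<>();
        collectAll(root, out);
        return out;
    }

    public static void main(String[] args) {
        SketchNode root = new SketchNode("FORM")
            .add(new SketchNode("SELECT")
                .add(new SketchNode("OPTION"))
                .add(new SketchNode("OPTION")))
            .add(new SketchNode("LABEL"));
        System.out.println(flatten(root));
    }
}
```

The same recursion would apply to any node type that can report its children. Whether 1.3 exposes such an accessor is exactly the open question in the message above.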
From: <dha...@or...> - 2003-05-13 03:41:54
|
Dhaval Udani wrote:
>> A STARTERS array would be useful to tell the scanner that when a particular
>> start tag(say another OPTION tag in this case), as opposed to a end tag denoted
>> by ENDERS, is encountered also perform end tag correction. I hope I've been
>> able to explain the need more clearly.

> I still don't follow why you would need a STARTERS array - if you encounter
> another OPTION tag, the behavior is determined by the boolean variable - to
> add or not to add children, and correction is done automatically. Or am I
> missing something ?

I gave the OPTION tag as an example, and it was probably the wrong one. Say the page contains something like this:

    <P> blah blah blah <TABLE>

What I am saying is that if TABLE is provided as a member of the STARTERS array in the P scanner, then a </P> will be inserted before the beginning of the <TABLE> tag. In essence, just as the ENDERS array looks for a tag of type EndTag, the STARTERS array would look for a start tag of the type defined.

I hope I've been clearer. Do let me know.

Dhaval |
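The proposal can be illustrated with a toy token stream. This is a sketch under stated assumptions, not the CompositeTagScanner implementation: `StartersSketch`, `correct`, `isStartTag` and `nameOf` are hypothetical, and tokens are assumed to be either plain text or attribute-free tags such as `<P>`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the proposed STARTERS behaviour.
class StartersSketch {
    // Inserts a synthetic end tag for 'tag' when one of 'starters' opens
    // while 'tag' is still open.
    static List<String> correct(List<String> tokens, String tag, Set<String> starters) {
        List<String> out = new ArrayList<>();
        boolean open = false;
        for (String token : tokens) {
            if (token.equalsIgnoreCase("<" + tag + ">")) {
                open = true;
            } else if (token.equalsIgnoreCase("</" + tag + ">")) {
                open = false;
            } else if (open && isStartTag(token) && starters.contains(nameOf(token))) {
                out.add("</" + tag + ">"); // the correction step
                open = false;
            }
            out.add(token);
        }
        return out;
    }

    // True for tokens of the form <NAME>, false for </NAME> and for text.
    static boolean isStartTag(String token) {
        return token.startsWith("<") && !token.startsWith("</") && token.endsWith(">");
    }

    // Upper-cased tag name of an attribute-free tag token.
    static String nameOf(String token) {
        return token.substring(1, token.length() - 1).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("<P>", "blah blah blah", "<TABLE>");
        Set<String> starters = new HashSet<>(Arrays.asList("TABLE"));
        System.out.println(correct(tokens, "P", starters));
    }
}
```

With the `<P> blah blah blah <TABLE>` input from the message, the sketch emits a `</P>` immediately before `<TABLE>`, which is the behaviour being requested.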
From: Somik R. <so...@ya...> - 2003-05-12 22:26:21
|
Dhaval Udani wrote:
> A STARTERS array would be useful to tell the scanner that when a particular
> start tag(say another OPTION tag in this case), as opposed to a end tag denoted
> by ENDERS, is encountered also perform end tag correction. I hope I've been
> able to explain the need more clearly.

I still don't follow why you would need a STARTERS array - if you encounter another OPTION tag, the behavior is determined by the boolean variable - to add or not to add children, and correction is done automatically. Or am I missing something ?

Regards,
Somik |
From: Marc N. <ma...@ke...> - 2003-05-12 16:38:35
|
My $0.02: I don't mind if you make it disallowed by default, as long as you don't break the ability for it to have nested tags of the same type. I extend CompositeTagScanner quite a bit in my own code to parse "custom" XML tags inside of an HTML page, and that code relies heavily on the current capability of CompositeTagScanner.

Marc

-----Original Message-----
From: dha...@or... [mailto:dha...@or...]
Sent: Thursday, May 08, 2003 9:33 PM
To: htm...@li...
Subject: [Htmlparser-developer] CompositeTagScanner - Some comments

Hi,

A lot of thought has definitely gone into the design of the CompositeTagScanner. Some absolutely wonderful work has been done here. Somik, had asked me to have a look at the code and review it. I just have one point for discussion.

The CompositeTagScanner has a provision to allow for nested children. However I feel there are very few HTML tags which have children of the same type. By default the scanner allows nesting. I believe this behaviour should be disallowed by default.

my $0.02 ;)

dhaval

-------------------------------------------------------
Enterprise Linux Forum Conference & Expo, June 4-6, 2003, Santa Clara
The only event dedicated to issues related to Linux enterprise solutions
www.enterpriselinuxforum.com

_______________________________________________
Htmlparser-developer mailing list
Htm...@li...
https://lists.sourceforge.net/lists/listinfo/htmlparser-developer |
From: <dha...@or...> - 2003-05-12 11:05:06
|
I've tried port 6000 from home as well and it fails just as fast. Also, if the problem were my firewall it should affect IE and Netscape uniformly, and nothing of that sort is happening: IE works without a hitch. Anyway, it seems to be my problem only. IE hurray!!! (For once Microsoft seems on top :( )

Dhaval

P.S. If anyone else is facing or has faced a similar problem, do let me know what can be done about it.

-----Original Message-----
From: DerrickOswald [mailto:Der...@ro...]
Sent: Monday, May 12, 2003 4:08 PM
To: htmlparser-developer
Cc: DerrickOswald
Subject: Re: [Htmlparser-developer] failing unit tests

No problem using Netscape 7 (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0).

Your port 6000 problem might be caused by your firewall. That would return pretty fast because it's local. A lot of firewalls can be configured to prohibit outbound connections on unknown ports too as a measure to limit damage from compromised machines.

Derrick |
From: Derrick O. <Der...@ro...> - 2003-05-12 10:49:48
|
No problem using Netscape 7 (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0).

Your port 6000 problem might be caused by your firewall. That would return pretty fast because it's local. A lot of firewalls can be configured to prohibit outbound connections on unknown ports too as a measure to limit damage from compromised machines.

Derrick

dha...@or... wrote:

>Hey guys,
>
>Has anyone tried accessing the "Login with SSL" link from a Netscape
>browser(used 6.2.1). I tried it and got an error : "www.sourceforge.net cannot
>be found."
>
>Another thing I have noticed with Netscape browsers is that they do not allow
>any access to a server running on port 6000. Basically this port itself does
>not work. U get an error so fast that it feels as if it is disallowed in the
>browser itself.
>
>Slightly off topic....but still.....
>
>Regards,
>
>Dhaval Udani
>Senior Analyst
>M-Line, QPEG
>OrbiTech Solutions Ltd.
>+91-22-28290019 Extn. 1457
>
>-----Original Message-----
>From: DerrickOswald [mailto:Der...@ro...]
>Sent: Monday, May 12, 2003 2:05 AM
>To: htmlparser-developer
>Cc: DerrickOswald
>Subject: [Htmlparser-developer] failing unit tests
>
>To avoid confusion about what failing unit tests are valid, I've made
>all the pending bugs into feature requests (yeah, I can justify that),
>and made the failing unit tests execution conditional on the Parser
>version being >= 1.4.
>
>This next integration release should have a clean JUnit run. |
From: <dha...@or...> - 2003-05-12 08:49:46
|
Hi,

What I am saying is my understanding of the scanner. Do forgive me if I don't understand it correctly.

The MATCH_IDS array is used to "match" the tags that should be parsed by this scanner as tags of a particular type, i.e. for an OPTION scanner the MATCH_IDS array would have OPTION as its member. This tells the scanning engine that whenever OPTION is encountered, it should create an instance of an OptionTag. At the same time it has SELECT as a member of the ENDERS array. This means that whenever the end tag of SELECT, i.e. </SELECT>, is encountered, correction to close the OPTION tag should take place.

A STARTERS array would be useful to tell the scanner that when a particular start tag (say another OPTION tag in this case), as opposed to an end tag denoted by ENDERS, is encountered, it should also perform end tag correction. I hope I've been able to explain the need more clearly.

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-28290019 Extn. 1457

-----Original Message-----
From: somik [mailto:so...@ya...]
Sent: Saturday, May 10, 2003 8:13 PM
To: htmlparser-developer
Cc: somik
Subject: Re: [Htmlparser-developer] CompositeTagScanner - Some comments

Dhaval Udani wrote:
> The concept of MATCH_IDS and ENDERS array is great. A STARTERS array could also
> be useful in the correction procedure. If any tag from this array is
> encountered automatic correction could be done to end the previous tag.

STARTERS is actually what MATCH_IDS is for. Why do you want a separate STARTERS array ?

Regards,
Somik |
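The two roles described above (which start tags a scanner claims, and which end tags force it to close) can be pictured with a small descriptor. This is a hedged sketch of the reading given in the message, not the real CompositeTagScanner: the class `ScannerSketch` and its methods `matches` and `isEnder` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical descriptor loosely modelled on the arrays discussed in this
// thread; it is not the real CompositeTagScanner.
class ScannerSketch {
    private final List<String> matchIds;
    private final List<String> enders;

    ScannerSketch(String[] matchIds, String[] enders) {
        this.matchIds = Arrays.asList(matchIds);
        this.enders = Arrays.asList(enders);
    }

    // True when this scanner should build a tag for the given start-tag name.
    boolean matches(String startTagName) {
        return matchIds.contains(startTagName.toUpperCase(Locale.ROOT));
    }

    // True when the end tag with this name should force-close the tag
    // currently being scanned (the "correction" step).
    boolean isEnder(String endTagName) {
        return enders.contains(endTagName.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ScannerSketch option =
            new ScannerSketch(new String[] {"OPTION"}, new String[] {"SELECT"});
        System.out.println(option.matches("option")); // OPTION is claimed by this scanner
        System.out.println(option.isEnder("select")); // </SELECT> closes an open OPTION
        System.out.println(option.isEnder("table"));  // </TABLE> does not
    }
}
```

A STARTERS list, as proposed, would be a third collection of the same shape, consulted when a start tag rather than an end tag arrives.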
From: <dha...@or...> - 2003-05-12 07:05:16
|
Hey guys,

Has anyone tried accessing the "Login with SSL" link from a Netscape browser (I used 6.2.1)? I tried it and got an error: "www.sourceforge.net cannot be found."

Another thing I have noticed with Netscape browsers is that they do not allow any access to a server running on port 6000. Basically this port itself does not work. You get an error so fast that it feels as if it is disallowed in the browser itself.

Slightly off topic....but still.....

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-28290019 Extn. 1457

-----Original Message-----
From: DerrickOswald [mailto:Der...@ro...]
Sent: Monday, May 12, 2003 2:05 AM
To: htmlparser-developer
Cc: DerrickOswald
Subject: [Htmlparser-developer] failing unit tests

To avoid confusion about what failing unit tests are valid, I've made all the pending bugs into feature requests (yeah, I can justify that), and made the failing unit tests execution conditional on the Parser version being >= 1.4.

This next integration release should have a clean JUnit run. |
From: Derrick O. <Der...@ro...> - 2003-05-11 20:47:46
|
To avoid confusion about what failing unit tests are valid, I've made all the pending bugs into feature requests (yeah, I can justify that), and made the failing unit tests execution conditional on the Parser version being >= 1.4. This next integration release should have a clean JUnit run. |
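One way to picture a version-gated test is a guard that returns early while the parser version is below the threshold. The sketch below is an assumption about the general technique only: `VersionGateSketch`, `shouldRun` and `runPendingTest` are hypothetical, and the version is passed in as a plain number rather than read from the real Parser class.

```java
// Hypothetical illustration of running a test only from a given version onward.
class VersionGateSketch {
    // Returns true when a test tied to a pending feature should actually run.
    static boolean shouldRun(double parserVersion, double requiredVersion) {
        return parserVersion >= requiredVersion;
    }

    // Stand-in for a test method: it skips itself while the feature is pending.
    static String runPendingTest(double parserVersion) {
        if (!shouldRun(parserVersion, 1.4)) {
            return "skipped";
        }
        // ... assertions for the not-yet-implemented behaviour would go here ...
        return "executed";
    }

    public static void main(String[] args) {
        System.out.println(runPendingTest(1.3));
        System.out.println(runPendingTest(1.4));
    }
}
```

Gating in this way keeps the known-failing tests in the source tree, so they start running automatically once the version number is bumped.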
From: Derrick O. <Der...@ro...> - 2003-05-11 17:29:00
|
I also sent that message to their users.sourceforge.net accounts, so they should get it. Perhaps what's needed is a "Guide to Development" that lays out the environment you need (remember the trouble I had with ssh?), the test-case development idea, lists to listen to, and other stuff in general. Derrick Somik Raha wrote: >Hi Derrick > > >>Philippe - Since Somik used to add developers, if there's something that >>isn't working for you, let me know. >> >> > >Sorry for not explaining more - I am not sure if either Philippe or Raju are >on this list. Apart from adding them and making them developers, you will >need to ask them to join the htmlparser-developer list. > >Regards, >Somik > > > |
From: Somik R. <so...@ya...> - 2003-05-11 13:42:53
|
Hi Derrick > Philippe - Since Somik used to add developers, if there's something that > isn't working for you, let me know. Sorry for not explaining more - I am not sure if either Philippe or Raju are on this list. Apart from adding them and making them developers, you will need to ask them to join the htmlparser-developer list. Regards, Somik |
From: Derrick O. <Der...@ro...> - 2003-05-11 04:55:47
|
Dhaval, When I fixed bug #735183 Problem in Label Scanning, I had to modify the SelectTagTest and LabelScannerTest. I know, that's cheating. Can you check that they still represent the spirit of the tests as you coded them. Derrick |
From: Derrick O. <Der...@ro...> - 2003-05-10 17:48:17
|
Please welcome Philippe Blanc. He joins htmlparser as well as administering his own project, oyoaha look and feel, http://sourceforge.net/projects/oalnf/, a java swing look and feel with theme, animation, sound, and alpha channel support (some cool pics on the home page). He is currently making heavy use of htmlparser in one of his projects and would like to be able to participate at its development. You can find out more about Philippe at his web page http://www.oyoaha.com . Philippe - Since Somik used to add developers, if there's something that isn't working for you, let me know. Derrick |
From: Derrick O. <Der...@ro...> - 2003-05-10 17:30:48
|
Please welcome K.Vamsidhar Raju. He joins htmlparser as well as participating in the JTools project. He is a java programmer and can also work with javascript, HTML and XML. Raju - Since this is the first time I've added a developer (Somik used to do these things) if there's something that isn't working for you, let me know. Derrick |
From: Somik R. <so...@ya...> - 2003-05-10 16:04:37
|
> Wow!!! I think this is a record number of mails I must have sent to HTMLParser > in a day. This level of activity is also a record for the project. Thanks to you, Derrick, Marc... Derrick, you're a terrific project lead, and I wish I had stepped down earlier.. Thanks to you for the idea about the auto generation of the CVS log - I am going to use it in a project at work. Dhaval --> keep your critiques (and thoughts) coming - they will go a long way in improving the parser. Regards, Somik |
From: Somik R. <so...@ya...> - 2003-05-10 15:56:45
|
Derrick,

This is good stuff! The changes.html had to be done as some of the logs had HTML tags in them, and they don't show up on the change log page unless they are formatted to use &lt; and &gt;. This could probably be done with the ant script, or you might review if you need to do it at all.

Cheers,
Somik

----- Original Message -----
From: "Derrick Oswald" <Der...@ro...>
To: <htm...@li...>
Sent: Friday, May 09, 2003 12:18 AM
Subject: [Htmlparser-developer] changes.txt

> OK, I'm going to try using the cvs2cl script to automatically create the
> change log for a release.
>
> That means you don't have to update changes.txt when you drop code.
> yeaahhh!
>
> But, you do have to make the commit messages as good or better than the
> changes.txt message was. boooo!
>
> For guidelines on what to put in commit messages see:
> http://www.red-bean.com/cvs2cl/changelogs.html
> Remember, the commit messages will now be visible to end users, so try
> to use whole sentences and valid grammar.
>
> Also, the script uses a time window and identical message text to unify
> separate file log messages into a chronological sequence of activity.
> So, the rule is, drop everything as close together in time as you can
> (one drop is best of course, but sometimes it doesn't work that way),
> and use the same log message for all files related to a particular change.
>
> Derrick |
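The escaping step mentioned here, turning markup characters in a log message into entities so that they display literally on an HTML page, can be sketched as follows. `LogEscapeSketch` and `escapeHtml` are hypothetical names. How the project's ant script would actually do this is not stated in the thread.

```java
// Hypothetical helper showing the escaping a change log needs before it is
// embedded in an HTML page.
class LogEscapeSketch {
    // Replaces the characters that a browser would otherwise treat as markup.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("Fixed nested <TITLE> handling"));
    }
}
```

The ampersand is escaped as well, so that a log message which already contains an entity-like sequence is shown exactly as the committer typed it.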
From: Somik R. <so...@ya...> - 2003-05-10 15:52:55
|
Derrick Oswald wrote:
> Changing the two 'true' default constructor values to 'false' only
> breaks one test case, testDoubleTitleTag.

Hmm, I can see a couple of failing tests, so I didn't have the confidence to play around with the code. In any case, your expectation is correct - it will be good if you can produce a testcase in the CompositeTagScannerTest class, using the CustomTag/AnotherTag classes.

I've been swamped all week (and out of town on an assignment). This will continue for 3 months, so my contributions will be sporadic.

Regards,
Somik |
From: Somik R. <so...@ya...> - 2003-05-10 14:45:10
|
Dhaval Udani wrote: > I wanted to know whether end tag correction takes place in the > CompositeTagScanner when end of stream is encountered. If not then I think that > too should happen. Yes it does - I believe there are some tests for this. Regards Somik |
From: Somik R. <so...@ya...> - 2003-05-10 14:44:09
|
Dhaval Udani wrote:
> The concept of MATCH_IDS and ENDERS array is great. A STARTERS array could also
> be useful in the correction procedure. If any tag from this array is
> encountered automatic correction could be done to end the previous tag.

STARTERS is actually what MATCH_IDS is for. Why do you want a separate STARTERS array ?

Regards,
Somik |
From: Derrick O. <Der...@ro...> - 2003-05-09 22:53:20
|
Changing the two 'true' default constructor values to 'false' only breaks one test case, testDoubleTitleTag. The node count changes from 7 to 10.

Correct me if I'm wrong, but with only the title scanner registered,

    <html><head><TITLE> <html><head><TITLE> Double tags can hang the code </TITLE></head><body> <body><html>

should yield 8 tags:

    <html>
    <head>
    <TITLE> containing <html><head> and a generated </TITLE>
    <TITLE> containing "Double tags can hang the code"</TITLE>
    </head>
    <body>
    <body>
    <html>

which isn't either of those answers, the original or the new one. In the test case, the first TITLE tag is correct, but the second contains the string and /TITLE but doesn't consume them; they are returned separately:

    <html>
    <head>
    <TITLE> containing <html><head> and a generated </TITLE>
    <TITLE> containing "Double tags can hang the code"</TITLE>
    "Double tags can hang the code"
    </TITLE>
    </head>
    <body>
    <body>
    <html>

Does it still need work?

Derrick

dha...@or... wrote:

>Hi,
>
>A lot of thought has definitely gone into the design of the
>CompositeTagScanner. Some absolutely wonderful work has been done here. Somik,
>had asked me to have a look at the code and review it. I just have one point
>for discussion.
>
>The CompositeTagScanner has a provision to allow for nested children. However I
>feel there are very few HTML tags which have children of the same type. By
>default the scanner allows nesting. I believe this behaviour should be
>disallowed by default.
>
>my $0.02 ;)
>
>dhaval |
From: Derrick O. <Der...@ro...> - 2003-05-09 22:08:49
|
Version 1.13 of this file dropped May 2nd fixed this: 'remove JDK 1.4 isms'

dha...@or... wrote:

>The assertStringValueMatches() method in ParserTestCase contains
>reference to String.replaceAll() method which I believe is introduced in
>JDK 1.4. Can you please remove the same?
>
>Regards,
>
>Dhaval Udani
>Senior Analyst
>M-Line, QPEG
>OrbiTech Solutions Ltd.
>+91-22-28290019 Extn. 1457
>
>-----Original Message-----
>From: DerrickOswald [mailto:Der...@ro...]
>Sent: Friday, May 09, 2003 9:49 AM
>To: htmlparser-developer
>Cc: DerrickOswald
>Subject: [Htmlparser-developer] changes.txt
>
>OK, I'm going to try using the cvs2cl script to automatically create the
>change log for a release.
>
>That means you don't have to update changes.txt when you drop code.
>yeaahhh!
>
>But, you do have to make the commit messages as good or better than the
>changes.txt message was. boooo!
>
>For guidelines on what to put in commit messages see:
> http://www.red-bean.com/cvs2cl/changelogs.html
>Remember, the commit messages will now be visible to end users, so try
>to use whole sentences and valid grammar.
>
>Also, the script uses a time window and identical message text to unify
>separate file log messages into a chronological sequence of activity.
>So, the rule is, drop everything as close together in time as you can
>(one drop is best of course, but sometimes it doesn't work that way),
>and use the same log message for all files related to a particular
>change.
>
>Derrick |
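For readers wondering what a replacement for `String.replaceAll()` might look like on a pre-1.4 JDK, here is a minimal sketch. It is not the code from version 1.13 of ParserTestCase, which is not shown in this thread. `ReplaceSketch` and `replaceLiteral` are hypothetical, and unlike `replaceAll()` the helper treats the search string literally rather than as a regular expression.

```java
// Hypothetical pre-JDK-1.4 substitute for a literal (non-regex) replace-all.
class ReplaceSketch {
    // Replaces every literal occurrence of 'target' in 'text' using only
    // indexOf, substring and StringBuffer, all available before JDK 1.4.
    static String replaceLiteral(String text, String target, String replacement) {
        if (target.length() == 0) {
            return text; // an empty search string would otherwise loop forever
        }
        StringBuffer out = new StringBuffer();
        int from = 0;
        int hit;
        while ((hit = text.indexOf(target, from)) != -1) {
            out.append(text.substring(from, hit)).append(replacement);
            from = hit + target.length(); // continue after the match
        }
        out.append(text.substring(from)); // copy the tail after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("a\r\nb\r\nc", "\r\n", "\n"));
    }
}
```

A typical use in a test helper would be normalising line endings before comparing expected and actual HTML.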
From: <dha...@or...> - 2003-05-09 11:58:45
|
Hi all,

I found a mistake in the LabelScanner while doing some testing. Attached are the changed code and a test case for it. Derrick, can you please include it in the next release for me?

Basically, a string like

    <label>John Doe<label>Jane Doe</label>

gets parsed as

    <LABEL>John Doe<LABEL>Jane Doe</LABEL></LABEL>

instead of

    <LABEL>John Doe</LABEL><LABEL>Jane Doe</LABEL>

after a call to toHtml() on the single LabelTag. Also, it is parsed as a single node instead of 2 distinct nodes.

I am also turning OptionTag into a CompositeTag but am getting a similar problem there. I am trying to work that out as well.

I am also facing a strange problem with a certain piece of code. Perhaps someone can help me out (I think it's a bug and have logged it already). Consider the string:

    String testHTML = new String(
        "<LABEL value=\"Google Search\">Google</LABEL>" +
        "<LABEL value=\"AltaVista Search\">AltaVista" +
        "<LABEL value=\"Lycos Search\"></LABEL>" +
        "<LABEL>Yahoo!</LABEL>" +
        "<LABEL>\nHotmail</LABEL>" +
        "<LABEL value=\"ICQ Messenger\">" +
        "<LABEL>Mailcity\n</LABEL>" +
        "<LABEL>\nIndiatimes\n</LABEL>" +
        "<LABEL>\nRediff\n</LABEL>\n" +
        "<LABEL>Cricinfo" +
        "<LABEL value=\"Microsoft Passport\">" +
        "<LABEL value=\"AOL\"><SPAN>AOL</SPAN></LABEL>" +
        "<LABEL value=\"Time Warner\">Time <B>Warner <SPAN>AOL </SPAN>Inc.</B>"
    );

I added the LabelScanner to the parser and parsed. Strangely, instead of returning a node count of 13 (the number of LABEL tags) I get 17. Also, when I look at the output of every node (using toHtml()), up to "Microsoft Passport" everything is correct and I am getting LABEL tags as well. But the next node that I get is a String node with the value #alue="AOL"># (without the hashes), and that entire tag is messed up. Any ideas? I have attached a test file for that purpose. You'll also have to use the new LabelScanner.java file. It's quite strange.

Regards,
Dhaval |
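The expected result for the first example, two sibling LABEL nodes rather than one nested node, can be expressed as a small token-level sketch. It only illustrates the intended output under the assumption that LABEL tags never nest. `LabelSplitSketch` and `split` are hypothetical, the tokens are attribute-free, and nothing here is taken from LabelScanner.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of non-nesting LABEL handling on a token list.
class LabelSplitSketch {
    // Groups tokens into complete LABEL elements, closing an open LABEL
    // whenever a new one starts or the input ends.
    static List<String> split(List<String> tokens) {
        final String start = "<LABEL>";
        final String end = "</LABEL>";
        List<String> nodes = new ArrayList<>();
        StringBuilder current = null; // non-null while a LABEL is open
        for (String token : tokens) {
            if (token.equalsIgnoreCase(start)) {
                if (current != null) {
                    // A new LABEL begins before the old one ended: close the old one.
                    nodes.add(current.append(end).toString());
                }
                current = new StringBuilder(start);
            } else if (token.equalsIgnoreCase(end)) {
                if (current != null) {
                    nodes.add(current.append(end).toString());
                    current = null;
                }
                // A stray end tag with no open LABEL is ignored in this sketch.
            } else if (current != null) {
                current.append(token); // content of the open LABEL
            } else {
                nodes.add(token); // text outside any LABEL is its own node
            }
        }
        if (current != null) {
            // End of input with a LABEL still open: close it as well.
            nodes.add(current.append(end).toString());
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(split(Arrays.asList(
            "<label>", "John Doe", "<label>", "Jane Doe", "</label>")));
    }
}
```

Under this rule the number of resulting LABEL nodes equals the number of LABEL start tags, which is the count of 13 the message expects for the longer test string.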
From: <dha...@or...> - 2003-05-09 11:24:58
|
Wow!!! I think this is a record number of mails I must have sent to HTMLParser in a day. But some thoughts are evolving and I was wondering about this.

I wanted to know whether end tag correction takes place in the CompositeTagScanner when end of stream is encountered. If not, then I think that too should happen.

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-28290019 Extn. 1457

-----Original Message-----
From: Udani, Dhaval H.
Sent: Friday, May 09, 2003 4:41 PM
To: htmlparser-developer
Cc: Udani, Dhaval H.
Subject: RE: [Htmlparser-developer] CompositeTagScanner - Some comments

The concept of MATCH_IDS and ENDERS array is great. A STARTERS array could also be useful in the correction procedure. If any tag from this array is encountered automatic correction could be done to end the previous tag.

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-28290019 Extn. 1457

-----Original Message-----
From: Udani, Dhaval H.
Sent: Friday, May 09, 2003 10:03 AM
To: htmlparser-developer
Cc: Udani, Dhaval H.
Subject: [Htmlparser-developer] CompositeTagScanner - Some comments

Hi,

A lot of thought has definitely gone into the design of the CompositeTagScanner. Some absolutely wonderful work has been done here. Somik, had asked me to have a look at the code and review it. I just have one point for discussion.

The CompositeTagScanner has a provision to allow for nested children. However I feel there are very few HTML tags which have children of the same type. By default the scanner allows nesting. I believe this behaviour should be disallowed by default.

my $0.02 ;)

dhaval |
From: <dha...@or...> - 2003-05-09 11:11:46
|
The concept of MATCH_IDS and ENDERS array is great. A STARTERS array could also be useful in the correction procedure. If any tag from this array is encountered automatic correction could be done to end the previous tag.

Regards,
Dhaval Udani
Senior Analyst
M-Line, QPEG
OrbiTech Solutions Ltd.
+91-22-28290019 Extn. 1457

-----Original Message-----
From: Udani, Dhaval H.
Sent: Friday, May 09, 2003 10:03 AM
To: htmlparser-developer
Cc: Udani, Dhaval H.
Subject: [Htmlparser-developer] CompositeTagScanner - Some comments

Hi,

A lot of thought has definitely gone into the design of the CompositeTagScanner. Some absolutely wonderful work has been done here. Somik, had asked me to have a look at the code and review it. I just have one point for discussion.

The CompositeTagScanner has a provision to allow for nested children. However I feel there are very few HTML tags which have children of the same type. By default the scanner allows nesting. I believe this behaviour should be disallowed by default.

my $0.02 ;)

dhaval |
From: <dha...@or...> - 2003-05-09 04:34:51
|
Hi, A lot of thought has definitely gone into the design of the CompositeTagScanner. Some absolutely wonderful work has been done here. Somik, had asked me to have a look at the code and review it. I just have one point for discussion. The CompositeTagScanner has a provision to allow for nested children. However I feel there are very few HTML tags which have children of the same type. By default the scanner allows nesting. I believe this behaviour should be disallowed by default. my $0.02 ;) dhaval |