Re: [Htmlparser-user] Parsing quoted strings seems to be broken in 1.6
From: sebb <se...@gm...> - 2007-01-10 18:13:01
FYI: I've now tested the parser using ScriptScanner.STRICT=false and
that solved the "problem".

Thanks again.

On 08/01/07, sebb <se...@gm...> wrote:
> Sorry, my bad - I've now read the document referenced in the scanner
> source, and I see that "</" acts as the terminator unless suitably
> hidden.
>
> S.
> On 08/01/07, sebb <se...@gm...> wrote:
> > Thanks for the quick reply. I'll give it a try.
> >
> > However, I'm not sure why the script example is bad.
> >
> > It is not enclosed in "<!--" and "// -->", but AIUI those are only
> > needed as a work-round for older browsers that did not understand the
> > <script> tag.
> >
> >
> > On 08/01/07, Derrick Oswald <Der...@ro...> wrote:
> > >
> > > For parsing bad script like this you probably want to set the static
> > > boolean value org.htmlparser.scanners.ScriptScanner.STRICT to false. See
> > > the explanation in the ScriptScanner.java file.
> > >
> > > sebb wrote:
> > >
> > > >The sample script:
> > > >
> > > ><HTML>
> > > > <body>
> > > > <script>
> > > > fred = "<img src='a.gif'></img>"
> > > > </script>
> > > > </body>
> > > ></HTML>
> > > >
> > > >generates the following output from parser.cmd:
> > > >
> > > >Tag (0[0,0],6[0,6]): HTML
> > > > Txt (6[0,6],10[1,2]): \n
> > > > Tag (10[1,2],16[1,8]): body
> > > > Txt (16[1,8],20[2,2]): \n
> > > > Tag (20[2,2],28[2,10]): script
> > > > Txt (28[2,10],57[3,27]): \n fred = "<img src='a.gif'>
> > > > End (57[3,27],57[3,27]): /script
> > > > End (57[3,27],63[3,33]): /img
> > > > Txt (63[3,33],68[4,2]): "\n
> > > > End (68[4,2],77[4,11]): /script
> > > > Txt (77[4,11],81[5,2]): \n
> > > > End (81[5,2],88[5,9]): /body
> > > > Txt (88[5,9],90[6,0]): \n
> > > > End (90[6,0],97[6,7]): /HTML
> > > >Txt (97[6,7],101[8,0]): \n\n
> > > >
> > > >It looks like the closing tag is being recognised - though the opening
> > > >tag is not.
> > > >
> > > >Is this a bug, or have I misunderstood something?
> > > >
> > > >_______________________________________________
> > > >Htmlparser-user mailing list
> > > >Htm...@li...
> > > >https://lists.sourceforge.net/lists/listinfo/htmlparser-user
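
For anyone reading this thread later, here is a rough, self-contained sketch of the two termination rules discussed above. It is not the ScriptScanner code. The class and method names (ScriptEndSketch, strictEnd, lenientEnd) are made up for illustration, and the lenient rule is a simplification that only tracks quoted strings and ignores comments. The strict rule treats the first "</" followed by a letter as the end of the script content. The lenient rule only stops at "</script" found outside a string literal.

```java
// Illustrative only: hypothetical helper, not part of org.htmlparser.
class ScriptEndSketch {

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** Strict rule: content ends at the first "</" followed by a letter. */
    static int strictEnd(String text, int from) {
        for (int i = from; i + 2 < text.length(); i++) {
            if (text.charAt(i) == '<' && text.charAt(i + 1) == '/'
                    && isAsciiLetter(text.charAt(i + 2))) {
                return i;
            }
        }
        return -1; // no terminator found
    }

    /** Lenient rule (simplified): content ends at "</script" outside a quoted string. */
    static int lenientEnd(String text, int from) {
        char quote = 0; // 0 means "not inside a string literal"
        for (int i = from; i < text.length(); i++) {
            char c = text.charAt(i);
            if (quote != 0) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == quote) {
                    quote = 0; // closing quote
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // opening quote
            } else if (text.regionMatches(true, i, "</script", 0, 8)) {
                return i;
            }
        }
        return -1; // no terminator found
    }

    public static void main(String[] args) {
        String html = "<script>\n fred = \"<img src='a.gif'></img>\"\n</script>";
        int start = html.indexOf('>') + 1; // first character after <script>
        System.out.println("strict:  " + html.substring(start, strictEnd(html, start)));
        System.out.println("lenient: " + html.substring(start, lenientEnd(html, start)));
    }
}
```

With the sample script from the original report, the strict rule stops just before "</img>", which matches the Txt node in the parser.cmd output, while the lenient rule runs on to the real closing tag. In the library itself the switch is the static field mentioned above, i.e. setting org.htmlparser.scanners.ScriptScanner.STRICT to false before parsing.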