Re: [Htmlparser-user] Parsing quoted strings seems to be broken in 1.6
From: Derrick O. <Der...@Ro...> - 2007-01-08 01:53:29
For parsing bad script like this you probably want to set the static
boolean value org.htmlparser.scanners.ScriptScanner.STRICT to false.
See the explanation in the ScriptScanner.java file.

sebb wrote:

>The sample script:
>
><HTML>
>  <body>
>  <script>
>  fred = "<img src='a.gif'></img>"
>  </script>
>  </body>
></HTML>
>
>generates the following output from parser.cmd:
>
>Tag (0[0,0],6[0,6]): HTML
>  Txt (6[0,6],10[1,2]): \n
>  Tag (10[1,2],16[1,8]): body
>  Txt (16[1,8],20[2,2]): \n
>  Tag (20[2,2],28[2,10]): script
>  Txt (28[2,10],57[3,27]): \n fred = "<img src='a.gif'>
>  End (57[3,27],57[3,27]): /script
>  End (57[3,27],63[3,33]): /img
>  Txt (63[3,33],68[4,2]): "\n
>  End (68[4,2],77[4,11]): /script
>  Txt (77[4,11],81[5,2]): \n
>  End (81[5,2],88[5,9]): /body
>  Txt (88[5,9],90[6,0]): \n
>  End (90[6,0],97[6,7]): /HTML
>Txt (97[6,7],101[8,0]): \n\n
>
>It looks like the closing tag is being recognised - though the opening
>tag is not.
>
>Is this a bug, or have I misunderstood something?
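
To make the difference concrete, here is a rough, self-contained sketch of the two scanning modes. It is not the HTML Parser source: the class ScriptEndDemo and the helper findScriptEnd are made up for illustration, and the non-strict rule shown (skip anything inside single or double quotes) is an assumption about what a quote-aware scan would do. In a real program the change is just the assignment ScriptScanner.STRICT = false, made before parsing.

```java
public class ScriptEndDemo {
    // Hypothetical helper, not part of HTML Parser. Returns the index at
    // which script content would end, or -1 if no end marker is found.
    // strict == true : stop at the first "</" followed by a letter,
    //                  whether or not it sits inside a string literal.
    // strict == false: assumed quote-aware scan that ignores "</" inside
    //                  single- or double-quoted strings.
    static int findScriptEnd(String text, boolean strict) {
        char quote = 0; // 0 means "not inside a string literal"
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!strict) {
                if (quote != 0) {
                    if (c == '\\') {
                        i++; // skip the escaped character
                    } else if (c == quote) {
                        quote = 0; // closing quote
                    }
                    continue;
                }
                if (c == '"' || c == '\'') {
                    quote = c; // opening quote
                    continue;
                }
            }
            if (c == '<' && i + 2 < text.length()
                    && text.charAt(i + 1) == '/'
                    && Character.isLetter(text.charAt(i + 2))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Content following the opening <script> tag in the sample.
        String body = "\n fred = \"<img src='a.gif'></img>\"\n </script>";
        int strictEnd = findScriptEnd(body, true);
        int lenientEnd = findScriptEnd(body, false);
        System.out.println(body.substring(strictEnd, strictEnd + 6));
        System.out.println(body.substring(lenientEnd, lenientEnd + 9));
    }
}
```

In this sketch the strict scan stops at the quoted </img>, which matches the point where the Txt node ends in the output above, while the quote-aware scan runs on to the real </script>. A script that escapes the sequence inside the string (for example "<\/img>") should not trip the strict scan either.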