[Htmlparser-user] HTML parser parsing script incorrectly
From: Niket A. <nik...@ex...> - 2010-07-07 20:07:05
I'm parsing the page http://www.healthline.com/search?q1=how+to+improve+prostate+blood+levels with the htmlparser API, and content from inside a script tag ends up inside some other tag. The reason is that HTML tags appear, unescaped, inside string literals in the JavaScript, so the htmlparser API treats them as real tags and closes on them.

================================================================================
<div id="myHealthlineHeader">
<script>
if(isLoggedIn()) {
    document.write("<a href=\"/action/LogOutServlet\">Sign Off</a> | <a rel=\"nofollow\" href=\"/myhealthline/account_overview.jsp\">My Healthline</a> | Welcome, <strong>" + getNickname() + "</strong>");
    document.getElementById("myHealthlineHeader").className = "hl_state_top_signed_in";
} else {
    document.write("<div style=\"float:right;text-align:right;padding:0 5px 0 0;\"> | <a class=\"underlineless\" rel=\"nofollow\" href=\"/yourfeedback.jsp?url=%2Fsearch%3Fq1%3Dhow%2Bto%2Bimprove%2Bprostate%2Bblood%2Blevels\">Feedback</a></div>");
    document.write("<div style=\"float:right\"><a class=\"underlineless\" rel=\"nofollow\" href=\"/signin.jsp\">Sign in</a> | <a class=\"underlineless\" rel=\"nofollow\" href=\"/registration.jsp\">Join Now</a> </div>")
    document.getElementById("myHealthlineHeader").className = "hl_state_top";
}
</script>
</div>
================================================================================

Is there any way to fix this issue?

Regards,
Niket
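
To make the expected behaviour concrete, here is a minimal sketch in plain Java (it does not use the htmlparser API) of treating everything between a script start tag and the next "</script" as opaque text, so tags inside JavaScript strings are never parsed as markup. The class name ScriptRawText and the method extractScriptBodies are hypothetical names chosen for this illustration, and the scan is deliberately simplified: it assumes no ">" inside the script tag's attributes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of the htmlparser library.
public class ScriptRawText {

    // Case-insensitive indexOf that keeps indexes aligned with the original string.
    private static int indexOfIgnoreCase(String s, String needle, int from) {
        for (int i = Math.max(from, 0); i + needle.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns the raw body of each script element. Markup inside the body is
    // left untouched: only "</script" ends the element.
    public static List<String> extractScriptBodies(String html) {
        List<String> bodies = new ArrayList<>();
        int pos = 0;
        while (true) {
            int open = indexOfIgnoreCase(html, "<script", pos);
            if (open < 0) {
                break;
            }
            // Simplification: assumes the start tag ends at the first '>'.
            int bodyStart = html.indexOf('>', open);
            if (bodyStart < 0) {
                break;
            }
            bodyStart++;
            int close = indexOfIgnoreCase(html, "</script", bodyStart);
            if (close < 0) {
                close = html.length(); // unterminated script: take the rest
            }
            bodies.add(html.substring(bodyStart, close));
            pos = close;
        }
        return bodies;
    }

    public static void main(String[] args) {
        String html = "<div id=\"myHealthlineHeader\"><script>"
                + "document.write(\"<a href=\\\"/signin.jsp\\\">Sign in</a>\");"
                + "</script></div>";
        for (String body : extractScriptBodies(html)) {
            System.out.println(body);
        }
    }
}
```

With this approach the anchor tags inside document.write(...) stay part of the script text instead of becoming child nodes of the surrounding div.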