Jericho HTML Parser / Discussion / Open Discussion: Script tag in Javascript body

Script tag in Javascript body

Forum: Open Discussion

Creator: Ejike Ofuonye

Created: 2007-09-04

Updated: 2013-01-03

Ejike Ofuonye - 2007-09-04

Hi all,

I am having problems parsing script tags because of embedded tags like the following:
<script>
document.write("<script src='http://localhost/js/prototype.js'> </script>");
//document.write("<script src='http://localhost/js/effects.js'> </script>");
</script>

I basically don't want the script tags inside the document.write JavaScript code to be returned when I make a call like this:
List scriptStartTags=source.findAllStartTags(Tag.SCRIPT);

Is there a better way to do this? Or are there flags I can set to ignore embedded tags of this nature?

Thank you

- Martin Jericho - 2007-09-04
  
  Hi Ejike,
  
  Normally, the parser will automatically ignore any tags inside SCRIPT elements, as long as a full sequential parse has been performed (see Source.fullSequentialParse() for details).
  
  In your case this doesn't work because the HTML in your example is illegal. The HTML specification states that the characters "</" should not appear inside a SCRIPT element.
  
  If you are the author of the HTML, you should enclose the content of the SCRIPT element in a CDATA section or comments (<!-- ... -->), or split the characters up by substituting "<"+"/script>" for "</script>".
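  
  If the page is generated from Java code, the same idea can be applied before the script text is written out. The following is only a minimal sketch: the class and method names are hypothetical, and it uses a backslash escape ("<\/") rather than the string concatenation mentioned above. It assumes every "</" in the script occurs inside a JavaScript string literal, where "<\/" evaluates to the same characters.
  
  ```java
  // Hypothetical helper, not part of the Jericho library.
  class ScriptBodyEscaper {
      // Rewrites every "</" as "<\/" so the emitted SCRIPT content
      // no longer contains the "</" character sequence.
      // Assumption: "</" only appears inside JavaScript string literals.
      static String escapeEndTagOpeners(String javascript) {
          return javascript.replace("</", "<\\/");
      }
  }
  ```
  
  Calling escapeEndTagOpeners on the document.write line from the question would leave the start tag untouched and turn the embedded "</script>" into "<\/script>".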
  
  If you are not the author and have to put up with the illegal HTML, you will have to devise your own way of detecting whether each SCRIPT start tag is actually inside another SCRIPT element, which unfortunately isn't a trivial task.
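  
  One possible starting point is a nesting counter over the raw text. This is a rough heuristic sketch, not a feature of the parser: the class and method names are hypothetical, it uses only java.util.regex, and it assumes that every embedded SCRIPT start tag is followed by its own embedded end tag (as in the example above) and that no attribute value contains a ">" character.
  
  ```java
  import java.util.ArrayList;
  import java.util.List;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;
  
  // Hypothetical helper, not part of the Jericho library.
  class TopLevelScriptFinder {
      // Matches a SCRIPT start tag or end tag, case-insensitively.
      // Group 1 is "/" for an end tag and empty for a start tag.
      private static final Pattern SCRIPT_TAG =
          Pattern.compile("<(/?)script\\b[^>]*>", Pattern.CASE_INSENSITIVE);
  
      // Returns the character offsets of SCRIPT start tags that are not
      // nested inside another (still open) SCRIPT element.
      static List<Integer> findTopLevelScriptStarts(String html) {
          List<Integer> offsets = new ArrayList<>();
          int depth = 0; // number of SCRIPT elements currently open
          Matcher m = SCRIPT_TAG.matcher(html);
          while (m.find()) {
              boolean isEndTag = !m.group(1).isEmpty();
              if (isEndTag) {
                  if (depth > 0) {
                      depth--; // close the innermost open SCRIPT
                  }
              } else {
                  if (depth == 0) {
                      offsets.add(m.start()); // a genuine top-level start tag
                  }
                  depth++;
              }
          }
          return offsets;
      }
  
      public static void main(String[] args) {
          String html = "<script>\n"
              + "document.write(\"<script src='a.js'> </script>\");\n"
              + "</script>";
          System.out.println(findTopLevelScriptStarts(html)); // prints [0]
      }
  }
  ```
  
  The returned offsets could then be compared with the begin position of each SCRIPT start tag the parser reports, keeping only the tags whose position appears in the list. The heuristic breaks down when a script writes an unmatched start or end tag, so it should be checked against the pages you actually need to handle.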
  
  Hope this helps
  Cheers
  Martin
  