I'm parsing a set of pages that contain JavaScript code.
I first extract the script tags with code like:
extractAllNodesThatMatch(new TagNameFilter("script"))
and then I iterate over all of them. But when I do:
ScriptTag script = (ScriptTag) sni.nextNode();
jscode = script.getScriptCode();
the jscode variable doesn't contain the full code, only the beginning of it. The code seems to get truncated at a '<' symbol, as in the following example:
<SCRIPT language=JavaScript1.3>
var news5='<a href=http://www.cue.org.uk/gallery/index.php>Photo Gallery Update</a>'
var news4='<a href=http://www.cue.org.uk/sponsors/index.php>New CUE Sponsors</a>'
var news3='<a href=http://www.admin.cam.ac.uk/news/dp/2005102802 target="_blank">Student Innovation</a>'
where the only code in the jscode variable is:
var news5='<a href=http://www.cue.org.uk/gallery/index.php>Photo Gallery Update
If I call script.toPlainTextString(), I get exactly the same result.
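As a baseline for comparison, the whole script body can be pulled out of the raw HTML without going through the parser. What follows is only a minimal sketch under stated assumptions: ScriptBodyExtractor and extractScriptBodies are hypothetical names (not part of the HTML Parser library), and the closing </SCRIPT> tag in the sample is assumed, since the excerpt above does not show it. Note that in the excerpt the returned text stops right before the first </a>, that is, at a "</" sequence rather than at the first '<'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparison only; not part of the HTML Parser library.
class ScriptBodyExtractor {

    // Matches <script ...> ... </script>, case-insensitively and across lines.
    // The reluctant (.*?) stops at the first closing </script>, so markup such
    // as </a> inside a JavaScript string stays part of the captured body.
    // Limitation: an attribute value containing '>' would confuse [^>]*.
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b[^>]*>(.*?)</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the raw text between each <script> start tag and its end tag.
    static List<String> extractScriptBodies(String html) {
        List<String> bodies = new ArrayList<>();
        Matcher m = SCRIPT.matcher(html);
        while (m.find()) {
            bodies.add(m.group(1));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Sample taken from the post; the closing </SCRIPT> is an assumption.
        String html =
                "<SCRIPT language=JavaScript1.3>\n"
              + "var news5='<a href=http://www.cue.org.uk/gallery/index.php>Photo Gallery Update</a>'\n"
              + "var news4='<a href=http://www.cue.org.uk/sponsors/index.php>New CUE Sponsors</a>'\n"
              + "</SCRIPT>";
        for (String body : extractScriptBodies(html)) {
            System.out.println(body);
        }
    }
}
```

If the output of this sketch contains both var lines while getScriptCode() returns only the start of the first one, the difference comes from how the script content is scanned and not from the page itself.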
Any hints on this? Am I doing something wrong, or is it a bug?
Cheers,
Luca