Hi,
I noticed a thread on the mailing list that discussed problems
relating to the content-type of pages and how they can be solved by
extending DefaultPageCreator.
However, the page that I am trying to access has its content-type set
to text/html, which means I can use DefaultPageCreator as is.
In addition to this, the page also declares its character set
encoding as UTF-16. The problem is that, with the UTF-16 charset
in place, HtmlUnit does not detect any JavaScript on the page.
Here is a dummy page that I created to test this out:
<HTML>
<HEAD>
<TITLE> Page Two </TITLE>
<META http-equiv="Content-Type" content="text/html; charset=UTF-16">
<Script language="JavaScript" type="text/javaScript" src="test.js"></Script>
</HEAD>
<BODY>
</BODY>
</HTML>
I found that if I remove the charset specification, the script is
detected and the file, test.js, is loaded. Unfortunately, I do not
have control over the source of the actual page that I am accessing.
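One possible explanation, which is only a guess on my part and not
something I have checked against the HtmlUnit sources: if the bytes of
the page are really single-byte text while the META tag declares UTF-16,
then decoding them as UTF-16 would turn the markup into unrelated
characters and the script tag would no longer be visible. The sketch
below uses only the JDK (no HtmlUnit calls); the class name CharsetCheck
and the method scriptVisible are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetCheck {

    // Decodes the raw bytes with the given charset and reports whether
    // a script tag can still be found in the resulting text.
    public static boolean scriptVisible(byte[] raw, Charset charset) {
        String decoded = new String(raw, charset);
        return decoded.toLowerCase(Locale.ROOT).contains("<script");
    }

    public static void main(String[] args) {
        String html = "<HTML><HEAD>"
                + "<META http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-16\">"
                + "<Script language=\"JavaScript\" type=\"text/javaScript\" src=\"test.js\"></Script>"
                + "</HEAD><BODY></BODY></HTML>";

        // Assumption: the file on disk is single-byte text.
        byte[] singleByte = html.getBytes(StandardCharsets.ISO_8859_1);
        // For comparison: the same markup genuinely encoded as UTF-16.
        byte[] utf16 = html.getBytes(StandardCharsets.UTF_16);

        System.out.println("single-byte read as ISO-8859-1: "
                + scriptVisible(singleByte, StandardCharsets.ISO_8859_1));
        System.out.println("single-byte read as UTF-16:     "
                + scriptVisible(singleByte, StandardCharsets.UTF_16));
        System.out.println("UTF-16 read as UTF-16:          "
                + scriptVisible(utf16, StandardCharsets.UTF_16));
    }
}
```

If the second line prints false while the other two print true, that
would at least be consistent with what I am seeing, though it would not
prove that this is what happens inside HtmlUnit.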
I am using the HtmlUnit sources from 10th March 2005. I would
appreciate your response on this.
Regards
Vinay