From: Brad C. <yo...@br...> - 2005-03-10 19:32:54
It looks like a bug, but I couldn't quickly locate anything that reads the content type from the meta tags. Does it do the same thing with different encodings?

--- Vinay Murthy <vin...@gm...> wrote:
> Hi,
>
> I noticed a thread on the mailing list that discussed problems with
> the content type of pages and how they could be solved by extending
> DefaultPageCreator.
>
> The page I am trying to access, however, has its content type set to
> text/html, which means I can use DefaultPageCreator as is. But in
> addition, the page declares the character encoding UTF-16, and with
> that charset in place HtmlUnit is not able to detect any JavaScript
> on the page. Here is a dummy page I created to test this:
>
> <HTML>
> <HEAD>
> <TITLE> Page Two </TITLE>
> <META http-equiv="Content-Type" content="text/html; charset=UTF-16">
> <Script language="JavaScript" type="text/javaScript" src="test.js"></Script>
> </HEAD>
> <BODY>
> </BODY>
> </HTML>
>
> I found that if I remove the charset declaration, the script is
> detected and test.js is loaded. I do not have control over the
> source of the actual page I am accessing.
>
> I have been using the HtmlUnit sources from 10 March 2005. I would
> appreciate your response.
>
> Regards,
> Vinay
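One plausible mechanism behind Vinay's symptom (an assumption on my part, not confirmed anywhere in this thread) is a charset mismatch: if the page's bytes are really single-byte ASCII/Latin-1 but the parser honors the META tag and decodes them as UTF-16, every pair of adjacent bytes fuses into one bogus character, so the parser never sees a "<script" token at all. A minimal JDK-only sketch of that effect, independent of HtmlUnit:

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    public static void main(String[] args) {
        // A script tag as it would appear in a page whose bytes are
        // single-byte encoded, even though its META tag claims UTF-16.
        byte[] singleByteMarkup =
            "<script src=\"test.js\"></script>"
                .getBytes(StandardCharsets.US_ASCII);

        // Decoding those bytes as UTF-16 (big-endian by default when
        // no byte-order mark is present) merges adjacent byte pairs
        // into meaningless two-byte characters.
        String decodedAsUtf16 =
            new String(singleByteMarkup, StandardCharsets.UTF_16);

        // The tag is no longer recognizable in the decoded text.
        System.out.println(decodedAsUtf16.contains("<script")); // false
    }
}
```

If this is what is happening, HtmlUnit would be behaving consistently with the declared encoding, and the real problem is that the page lies about its charset; the practical fix on the client side would be to override the response encoding rather than extend DefaultPageCreator.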