Hi team,
There is a small parser issue when a numeric character reference is invalid. Imagine a text like "Nimbus™ 3000" that the page author entered as "Nimbus�" (note the missing semicolon after the character reference). Without the semicolon, the numeric character reference is invalid. When such text is parsed, browsers usually handle it by inserting the � symbol (U+FFFD, the replacement character).
HtmlUnit fails in only some of these cases. The parser appears to handle the situation gracefully if the offending text is in the body of an element. If it is in an attribute value, however, the page load fails with an IllegalArgumentException thrown by Neko.
See the attached test case that demonstrates this behavior.
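For readers without the attachment, here is a minimal Java sketch of the difference between strict and lenient decoding of a numeric character reference. It does not use HtmlUnit or Neko and is not their actual code; the class CharRefDemo, its methods, and the input "84823000" (the digits 8482 for ™ run together with a trailing "3000") are hypothetical. That Character.toChars is involved in the Neko failure is only an assumption; it is simply one JDK call that throws IllegalArgumentException for an out-of-range code point.

```java
public class CharRefDemo {
    // U+FFFD, the replacement character browsers typically insert.
    static final String REPLACEMENT = "\uFFFD";

    // Strict decoding (hypothetical): lets the JDK throw.
    // Character.toChars throws IllegalArgumentException for values
    // above 0x10FFFF; Integer.parseInt throws NumberFormatException
    // (a subclass of IllegalArgumentException) for unparsable digits.
    public static String decodeStrict(String digits) {
        int codePoint = Integer.parseInt(digits);
        return new String(Character.toChars(codePoint));
    }

    // Lenient decoding (hypothetical): falls back to U+FFFD instead
    // of throwing, roughly what browsers do for invalid references.
    public static String decodeLenient(String digits) {
        try {
            int codePoint = Integer.parseInt(digits);
            boolean surrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
            if (codePoint == 0 || surrogate || !Character.isValidCodePoint(codePoint)) {
                return REPLACEMENT;
            }
            return new String(Character.toChars(codePoint));
        } catch (NumberFormatException e) {
            return REPLACEMENT;
        }
    }

    public static void main(String[] args) {
        // A valid reference: 8482 is the trade mark sign (U+2122).
        String ok = decodeLenient("8482");
        System.out.println(Integer.toHexString(ok.codePointAt(0)));

        // Hypothetical invalid reference: falls back to U+FFFD.
        String bad = decodeLenient("84823000");
        System.out.println(Integer.toHexString(bad.codePointAt(0)));

        // The strict variant throws for the same input.
        try {
            decodeStrict("84823000");
        } catch (IllegalArgumentException e) {
            System.out.println("strict decoding failed: " + e.getClass().getSimpleName());
        }
    }
}
```

A parser that took the lenient path in attribute values as well as in element bodies would load the page in both cases.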
Thanks,
J.
Fixed in SVN - you need an updated Neko.
Thanks for reporting...
PS: Only people with umlauts in their names can find bugs like this ;-)
That's what we're here for ... ;-)