From: Christian S. <chr...@ne...> - 2003-06-20 13:59:01
Hello,

Has anybody ever considered implementing a dedicated HTML DOM for use with HtmlUnit?

Right now HU uses Xerces/Neko to parse the input and create an internal DOM. However, HtmlUnit only uses that DOM to look up structural information. In addition, a parallel DOM is maintained that keeps the extra information HtmlUnit requires, and there is a constant mapping between the two.

Here's what I find:

1. There's no need to let Xerces create an HTML DOM, as it currently does. A simple XML DOM would suffice, because that is all HtmlUnit requires (and uses). This would improve efficiency during parsing, and could be achieved by configuring Neko accordingly.

2. There would be a really significant performance improvement, and a simplification of the code, if the two DOMs were unified into one. Basically this would require implementing the whole DOM interface in HtmlUnit, which isn't that hard.

I am considering using HU in a load test scenario, which is why I am concerned about performance. I realize this may not be a concern for many others.

Comments?

Christian
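
To make point 2 a bit more concrete, here is a rough sketch of the difference between the two designs. It is not HtmlUnit code: the class names (`ParsedNode`, `Wrapper`, `TwoTreePage`, `UnifiedNode`) and the `visited` flag are all hypothetical stand-ins, and the real thing would implement the `org.w3c.dom` interfaces rather than these simplified classes.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in HtmlUnit.
public class UnifiedDomSketch {

    /** Structure-only node, standing in for what the parser builds. */
    static class ParsedNode {
        final String tagName;
        final List<ParsedNode> children = new ArrayList<>();
        ParsedNode(String tagName) { this.tagName = tagName; }
    }

    /** Tool-side object carrying the extra state in the two-tree design. */
    static class Wrapper {
        final ParsedNode node;
        boolean visited; // assumed example of "extra information"
        Wrapper(ParsedNode node) { this.node = node; }
    }

    /** Two-tree design: extra state is reached through a node-to-wrapper map. */
    static class TwoTreePage {
        private final Map<ParsedNode, Wrapper> wrappers = new IdentityHashMap<>();

        Wrapper wrapperFor(ParsedNode node) {
            // One map lookup (and possibly one allocation) per access.
            return wrappers.computeIfAbsent(node, Wrapper::new);
        }

        int wrapperCount() { return wrappers.size(); }
    }

    /** Unified design: one node type holds both structure and extra state. */
    static class UnifiedNode {
        final String tagName;
        final List<UnifiedNode> children = new ArrayList<>();
        boolean visited; // lives directly on the node, no mapping needed
        UnifiedNode(String tagName) { this.tagName = tagName; }

        UnifiedNode append(UnifiedNode child) {
            children.add(child);
            return this;
        }
    }

    /** Marks the whole subtree as visited and returns the number of nodes touched. */
    static int markVisited(UnifiedNode root) {
        root.visited = true;
        int count = 1;
        for (UnifiedNode child : root.children) {
            count += markVisited(child);
        }
        return count;
    }

    public static void main(String[] args) {
        // Two-tree design: the same parsed node always maps to the same wrapper.
        ParsedNode body = new ParsedNode("body");
        TwoTreePage page = new TwoTreePage();
        page.wrapperFor(body).visited = true;
        System.out.println(page.wrapperFor(body).visited);

        // Unified design: the state is simply a field on the node.
        UnifiedNode html = new UnifiedNode("html").append(new UnifiedNode("body"));
        System.out.println(markVisited(html));
    }
}
```

In the unified variant there is no map to keep in sync and only one object per element, which is where I would expect the savings to come from. I have not measured this.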