From: Barnaby C. <Ba...@ac...> - 2003-05-28 20:04:41
I am (overall) very happy with the current structure. Almost all information is maintained in the XML DOM, and pretty much all the other classes are used as utilities that act on the DOM. This provides a very flexible structure for the future.

There are still some places where I see room for improvement. I propose that the input elements be simplified to 2 classes, the HtmlInputElement and the HTMLInputElement class. For backward compatibility, all the subclasses of these 2 classes could be kept as interfaces so that existing test cases will not be broken. If this move is supported, I will submit a patch to support these changes by next week. The principal effect that this has on our current bug list is to eliminate bug 730915, as it would eliminate the one-to-n mapping that currently exists for input elements.

Let me know what you think.

Thanks,
-Barnaby

---------- Original Message ----------------------------------
From: "David Taylor" <dt...@ac...>
Date: Wed, 28 May 2003 10:47:10 -0400

>I am working on adding support for some DOM Level 1 Core features like
>Document.createElement (), Node.parentNode, Node.childNodes, and
>Node.appendChild (the last with help from Barnaby Court). All of these
>return DOM Level 1 objects like Node or HTMLInputElement.
>
>Basically, HtmlUnit currently implements DOM Level 0. This causes
>problems because there is no non-HTML node class (besides Window) so all
>elements are assumed to be HTML. For the root document and even text
>nodes, this assumption breaks (and causes class cast exceptions). The
>problem with HTMLInputElement is captured in bug 730915: you can't
>create an input element because the DOM Level 0 classes need an
>attribute value to know what class to create and createElement just
>gives the tag name.
>
>Internally, HtmlUnit has 3 sets of objects representing a web page: the
>XML DOM returned from the CyberNeko HTML parser, the HTML elements, and
>the scriptable objects.
>The XML DOM provides the tree structure and the
>other 2 kinds of objects are lazily constructed as needed. The
>scriptable objects and HTML elements each have a reference to each
>other. The HTML elements have a reference to their XML node and their
>HTML page where there is a map from XML nodes to HTML elements. The key
>point here is that the HTML element is the glue that ties these sets of
>objects together.
>
>For non-HTML nodes in the XML DOM and non-HTML scriptable objects there
>is no corresponding HTML element. So, there is nothing to glue non-HTML
>nodes with non-HTML scriptable objects.
>
>Since fixing this opens design level questions, I thought we should
>discuss the alternatives first. Some alternatives include: 1) extending
>the HTML elements to include non-HTML nodes like Node, Document, and
>Text, 2) using the XML nodes to glue the objects together, perhaps using
>the user data to avoid map lookups on every dereference, 3)
>consolidating the HTML elements and scriptable objects into a single set
>of classes that support both sets of interfaces. An extreme alternative
>would be to consolidate the HTML elements, scriptable objects and XML
>DOM objects by extending the XML DOM classes with subclasses that
>implement Scriptable.
>
>Anyway, before I spend much time exploring alternatives in much detail,
>I'd appreciate your comments.
>Thanks,
>-David
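
For illustration, alternative 2 in David's message (gluing the objects together through the XML nodes, using the user data to avoid map lookups) could be sketched roughly as below. This is only a sketch under assumptions, not HtmlUnit code: it uses the DOM Level 3 Node.setUserData/getUserData methods from the JDK's org.w3c.dom package rather than any user-data hook on the parser's own node classes, and NodeGlueSketch, Wrapper, and WRAPPER_KEY are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class NodeGlueSketch {

    /** Hypothetical key under which the wrapper is stored on each XML node. */
    private static final String WRAPPER_KEY = "sketch.wrapper";

    /** Hypothetical stand-in for an HTML element or scriptable object. */
    public static final class Wrapper {
        private final Node node;

        Wrapper(Node node) {
            this.node = node;
        }

        public Node getNode() {
            return node;
        }
    }

    /** Returns the node's wrapper, creating it lazily and caching it in the node's user data. */
    public static Wrapper wrapperFor(Node node) {
        Object cached = node.getUserData(WRAPPER_KEY);
        if (cached instanceof Wrapper) {
            return (Wrapper) cached;
        }
        Wrapper created = new Wrapper(node);
        // No UserDataHandler is registered (third argument is null).
        node.setUserData(WRAPPER_KEY, created, null);
        return created;
    }

    /** Builds an empty document with the JDK's default parser (an assumption; not CyberNeko). */
    public static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Node input = doc.createElement("input");
        Node text = doc.createTextNode("hello");
        doc.appendChild(input);
        input.appendChild(text);

        // A non-HTML node (Text) is wrapped the same way as an element.
        System.out.println(wrapperFor(text) == wrapperFor(text));
        System.out.println(wrapperFor(input).getNode() == input);
    }
}
```

Because the lookup goes through the node itself, a Text node gets a wrapper in the same way as an element, with no separate map from XML nodes to HTML elements.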