Thread: [Htmlparser-developer] Added methods to NodeList
From: Matthew B. <mat...@ou...> - 2005-09-13 13:25:05
First, can I say thanks for htmlparser; it's really useful.

I'm trying to remove a Tag that I am visiting (using a NodeVisitor)
from its parent:

    public void visitTag( Tag tag )
    {
        ....
        NodeList children = tag.getParent().getChildren();
        for (int child = 0; child < children.size(); child++)
        {
            if (tag.equals(children.elementAt(child)))
            {
                children.remove(child);
                break;
            }
        }

and would rather be able to do:

    public void visitTag( Tag tag )
    {
        ....
        NodeList children = tag.getParent().getChildren();
        children.remove(tag);

Comments?

I was also wondering if there was a reason why NodeList doesn't
implement java.util.List, as most Java programmers are already familiar
with its semantics.

I would have attached a patch, but I don't seem to be able to do a CVS
diff against SourceForge anonymous CVS at the moment (timeouts) :-(

-- Added code --

    /**
     * Check to see if the NodeList contains the supplied Node.
     * @param node The node to look for.
     * @return True if the Node is in this NodeList.
     */
    public boolean contains(Node node) {
        return indexOf(node) != -1;
    }

    /**
     * Finds the index of the supplied Node.
     * @param node The node to look for.
     * @return The index of the node in the list or -1 if it isn't found.
     */
    public int indexOf(Node node) {
        for (int i=0;i<size;i++) {
            if (nodeData.equals(node))
                return i;
        }
        return -1;
    }

    /**
     * Remove the supplied Node from the list.
     * @param node The node to remove.
     * @return True if the node was found and removed from the list.
     */
    public boolean remove(Node node) {
        int index = indexOf(node);
        if (index != -1) {
            remove(index);
            return true;
        }
        return false;
    }

--
+--Matthew Buckett-----------------------------------------+
| VLE Developer, Learning Technologies Group               |
| Tel: +44 (0) 1865 283660       http://www.oucs.ox.ac.uk/ |
+------------Computing Services, University of Oxford------+
From: Matthew B. <mat...@co...> - 2005-09-13 13:30:34
Matthew Buckett wrote:

Sorry, missed this.

> /**
>  * Finds the index of the supplied Node.
>  * @param node The node to look for.
>  * @return The index of the node in the list or -1 if it isn't found.
>  */
> public int indexOf(Node node) {
>     for (int i=0;i<size;i++) {
>         if (nodeData.equals(node))

Should have been:

    if (nodeData[i].equals(node))

--
+--Matthew Buckett-----------------------------------------+
| VLE Developer, Learning Technologies Group               |
| Tel: +44 (0) 1865 283660       http://www.oucs.ox.ac.uk/ |
+------------Computing Services, University of Oxford------+
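For readers who want to try the corrected methods outside the library, here is a minimal sketch of the three proposed methods on a stand-in class. SketchNodeList is hypothetical and is not the real org.htmlparser NodeList; it only borrows the nodeData and size field names used in the patch, and it stores plain Objects instead of Nodes so that it compiles on its own.

```java
// Hypothetical stand-in for NodeList: an array-backed list with the
// nodeData/size fields referred to in the patch. Not the library class.
class SketchNodeList {
    private Object[] nodeData = new Object[4];
    private int size = 0;

    /** Appends a node, doubling the backing array when it is full. */
    void add(Object node) {
        if (size == nodeData.length) {
            Object[] bigger = new Object[nodeData.length * 2];
            System.arraycopy(nodeData, 0, bigger, 0, size);
            nodeData = bigger;
        }
        nodeData[size++] = node;
    }

    int size() {
        return size;
    }

    Object elementAt(int index) {
        return nodeData[index];
    }

    /** Removes the element at index and shifts later elements left. */
    Object remove(int index) {
        Object removed = nodeData[index];
        System.arraycopy(nodeData, index + 1, nodeData, index, size - index - 1);
        nodeData[--size] = null; // drop the stale reference
        return removed;
    }

    /** Corrected lookup: compares each element, not the array itself. */
    int indexOf(Object node) {
        for (int i = 0; i < size; i++) {
            if (nodeData[i].equals(node)) {
                return i;
            }
        }
        return -1;
    }

    boolean contains(Object node) {
        return indexOf(node) != -1;
    }

    /** Removes the first matching element; returns true if one was found. */
    boolean remove(Object node) {
        int index = indexOf(node);
        if (index != -1) {
            remove(index);
            return true;
        }
        return false;
    }
}
```

With the uncorrected comparison (nodeData.equals(node)), an array is compared with a node, which is never equal, so indexOf would always return -1 and remove(Node) would never remove anything.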
From: Derrick O. <Der...@Ro...> - 2005-09-13 22:54:34
To answer your second question first, it's a legacy thing: trying to
keep the base classes compatible with Java 1.x and avoiding the new Java
Collections Framework. This can probably be revisited, since the goal of
backward compatibility has less emphasis these days.

Removing nodes from an underlying collection while an iterator is active
on it is fraught with peril. It might work in some cases (and I'm a
little surprised it worked for you), but I think the better approach is
to throw all the nodes to be deleted in a 'garbage bin' and remove them
all later.

Yes, the NodeList could use a remove(Node) call. You could add the patch
to the Patches tracker
(http://sourceforge.net/tracker/?group_id=24399&atid=381401) or the
Request For Enhancement tracker
(http://sourceforge.net/tracker/?group_id=24399&atid=381402), but it's
probably good enough in the mail list here.

Matthew Buckett wrote:

> First can I thanks for htmlparser, it's really useful.
>
> I'm trying to remove a Tag that I am visiting (using a NodeVisitor)
> from its parent:
>
> public void visitTag( Tag tag )
> {
>     ....
>     NodeList children = tag.getParent().getChildren();
>     for (int child = 0; child < children.size(); child++)
>     {
>         if (tag.equals(children.elementAt(child)))
>         {
>             children.remove(child);
>             break;
>         }
>     }
>
> and would rather be able todo:
>
> public void visitTag( Tag tag )
> {
>     ....
>     NodeList children = tag.getParent().getChildren();
>     children.remove(tag);
>
> Comments?
>
> I was also wondering if there was a reason why NodeList doesn't
> implement java.util.List as most Java programmers are already familiar
> with the semantics of it.
>
> I would have attached a patch but I don't seem to be able todo a a CVS
> diff against sourceforge anonymous CVS at the moment (timeouts) :-(
>
> -- Added code--
> /**
>  * Check to see if the NodeList contains the supplied Node.
>  * @param node The node to look for.
>  * @return True is the Node is in this NodeList.
>  */
> public boolean contains(Node node) {
>     return indexOf(node) != -1;
> }
>
> /**
>  * Finds the index of the supplied Node.
>  * @param node The node to look for.
>  * @return The index of the node in the list or -1 if it isn't found.
>  */
> public int indexOf(Node node) {
>     for (int i=0;i<size;i++) {
>         if (nodeData.equals(node))
>             return i;
>     }
>     return -1;
> }
>
> /**
>  * Remove the supplied Node from the list.
>  * @param node The node to remove.
>  * @return True if the node was found and removed from the list.
>  */
> public boolean remove(Node node) {
>     int index = indexOf(node);
>     if (index != -1) {
>         remove(index);
>         return true;
>     }
>     return false;
> }
>
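The 'garbage bin' approach suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: BinNode and DeferredRemoval are hypothetical names, the tree is a plain java.util structure and not htmlparser's Node/NodeList, and matching by name stands in for whatever test a real visitor would apply. The walk only records the doomed nodes; the tree is changed after the traversal has finished, so no list is modified while it is being iterated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node used only for this sketch.
class BinNode {
    final String name;
    BinNode parent;
    final List<BinNode> children = new ArrayList<>();

    BinNode(String name) {
        this.name = name;
    }

    /** Adds a child and returns this node so calls can be chained. */
    BinNode add(BinNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

class DeferredRemoval {
    /** Depth-first walk that records matches without touching the tree. */
    static void collect(BinNode node, String unwanted, List<BinNode> bin) {
        if (node.name.equals(unwanted)) {
            bin.add(node);
            return; // the whole subtree goes, so there is no need to descend
        }
        for (BinNode child : node.children) {
            collect(child, unwanted, bin);
        }
    }

    /** Collects first, removes afterwards; returns how many nodes were detached. */
    static int removeAll(BinNode root, String unwanted) {
        List<BinNode> bin = new ArrayList<>();
        collect(root, unwanted, bin);

        int removed = 0;
        for (BinNode doomed : bin) {
            if (doomed.parent != null) { // the root has no parent to detach from
                doomed.parent.children.remove(doomed);
                doomed.parent = null;
                removed++;
            }
        }
        return removed;
    }
}
```

For example, calling DeferredRemoval.removeAll(root, "script") on a small tree detaches every node named "script" in one pass over the bin, after a single traversal of the tree.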
From: Matthew B. <mat...@co...> - 2005-09-14 09:06:06
Derrick Oswald wrote:
>
> To answer your second question first, it's a legacy thing, trying to
> keep the base classes compatible with Java 1.x and avoiding the new Java
> Collections Framework. This can probably be revisited, since the goal of
> backward compatibility has less emphasis these days.

Ok. Even if NodeList didn't implement java.util.List, having similar
methods would make the learning curve smaller for most Java programmers.

On a slightly related note, is there a reason for using an array for the
Nodes in NodeList but a Vector for the attributes in TagNode? Switching
to a Vector in NodeList would make it easy to expose more flexible
methods. Was it originally decided to use an array for performance
reasons?

> Removing nodes from an underlying collection while an iterator is active
> on it is fraught with peril.

Nothing like living dangerously ;-)

> It might work in some cases (and I'm a little surprised it worked for
> you), but I think the better approach is to throw all the nodes to be
> deleted in a 'garbage bin' and remove them all later.

Ok. I'll probably change to this method. I was just going to use a
filter, but then I would either end up running multiple filters over the
same tree, or repeating the same tests on the nodes the filter returns
to work out what I should be doing to them; neither seemed very
sensible. Using a visitor, I at least only traverse the tree once and
can perform the alterations.

> Yes, the NodeList
> could use a remove(Node) call.
> You could add the patch to the Patches
> tracker (http://sourceforge.net/tracker/?group_id=24399&atid=381401) or
> Request For Enhancement tracker
> (http://sourceforge.net/tracker/?group_id=24399&atid=381402), but it's
> probably good enough in the mail list here.

Ok.

--
+--Matthew Buckett-----------------------------------------+
| VLE Developer, Learning Technologies Group               |
| Tel: +44 (0) 1865 283660       http://www.oucs.ox.ac.uk/ |
+------------Computing Services, University of Oxford------+
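To illustrate the java.util.List point without committing to any change in the library, here is a rough sketch of an array-backed list that extends java.util.AbstractList. ArrayNodeList is a hypothetical name and the class is not part of htmlparser. It assumes a Java version with generics; the same idea works without them. Only get, size, add(int, T) and remove(int) are written by hand, while contains, indexOf and remove(Object) are inherited from the collections framework.

```java
import java.util.AbstractList;
import java.util.Arrays;

// Hypothetical array-backed list; keeps an array for storage but
// exposes the familiar java.util.List methods via AbstractList.
class ArrayNodeList<T> extends AbstractList<T> {
    private Object[] data = new Object[4];
    private int size = 0;

    @Override
    @SuppressWarnings("unchecked")
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (T) data[index];
    }

    @Override
    public int size() {
        return size;
    }

    /** Inserts at index; AbstractList.add(e) delegates here with index == size. */
    @Override
    public void add(int index, T element) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = element;
        size++;
        modCount++; // lets inherited iterators detect structural changes
    }

    /** Removes by position; the inherited remove(Object) ends up calling this. */
    @Override
    public T remove(int index) {
        T removed = get(index); // also performs the bounds check
        System.arraycopy(data, index + 1, data, index, size - index - 1);
        data[--size] = null;
        modCount++;
        return removed;
    }
}
```

Because modCount is maintained, an iterator obtained from this list throws ConcurrentModificationException if the list is structurally changed behind its back, which turns the "fraught with peril" case into an explicit failure.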