Scott,
I was looking at some htmlcleaner code related to a couple of other discussions/tickets and noticed that some getter methods return, e.g., a List field rather than creating a copy; and some setting methods directly assign, e.g., a List object, into a field value rather than creating a copy.
I know that in the past you had already changed some existing methods to copy mutable field values rather than directly returning them. (I believe one issue I reported was an unexpected side effect of one such change.) But there are definitely still some classes that have getter methods that return a potentially mutable field for the return value, and setter methods that assign potentially mutable objects into fields (e.g., TagNode).
As I'm sure you're aware, this can be hazardous:
I suggest that when you have time for a careful audit, you consider replacing instances of this pattern:
private List<Foo> something; public List<? extends Foo> getSomething() { return something; } public void setSomething(List<? extends Foo> newSomething) { this.something = newSomething; }
with this pattern, or a similar pattern of your choosing:
private final List<Foo> something = new ArrayList<Foo>(); public List<? extends Foo> getSomething() { return new ArrayList<Foo>(something); } public void setSomething(List<? extends Foo> newSomething) { this.something.clear(); if (newSomething != null) { this.something.addAll(newSomething); } }
Obviously, depending upon what the internal invariants of the class in question are, and what the contract of the getter method is, such a change may not be appropriate at all. But I imagine that in most cases, it would be appropriate.
And also obviously, I would not advise making this kind of change without having test cases for the code/methods that it affects. I wouldn't want this suggestion to cause other code, which may currently rely upon getting direct access to various other classes' internals, to unexpectedly stop working as expected.
Clearly this is not a high priority item, but I thought that it was worth mentioning.
Thanks for this Dave, I was also wondering about this when looking at ticket #89.
I think having this pattern is a good marker for us to keep returning to as we evolve HtmlCleaner.
As you note, there are parts of HtmlCleaner that do rely on access to internals, but I think its reasonable to adopt this pattern wherever possible when we provide a public interface.
Moving to 2.8 so this stays visible.
To modify my original suggestion, it's probably best to just return the object wrapped via an appropriate Collections method, e.g., Collections.unmodifiableList.
It's definitely something I keep coming back to, though usually there's something more pressing! If you wanted to submit a patch upgrading some of the more problematic classes that would be fantastic. I suspect most users will be directly interacting with TagNode the most, or possibly the serialisers, so those would probably be the first to apply updated patterns to.