Menu

#94 General suggestion: copy mutable field values / arguments instead of returning / using them directly

v2.30
open-accepted
nobody
None
5
2023-06-21
2013-10-26
Dave
No

Scott,
I was looking at some htmlcleaner code related to a couple of other discussions/tickets and noticed that some getter methods return, e.g., a List field rather than creating a copy; and some setting methods directly assign, e.g., a List object, into a field value rather than creating a copy.

I know that in the past you had already changed some existing methods to copy mutable field values rather than directly returning them. (I believe one issue I reported was an unexpected side effect of one such change.) But there are definitely still some classes that have getter methods that return a potentially mutable field for the return value, and setter methods that assign potentially mutable objects into fields (e.g., TagNode).

As I'm sure you're aware, this can be hazardous:

  • When returning a mutable object that is part of the class's internal state, there is a risk of the client code modifying that state in a way that does not preserve the class's internal invariants.
  • The same risk exists when directly using a potentially mutable incoming object as the value of a field -- that is, the client code may still be able to modify that value, thus modifying the internal state of that instance of the class whose field now refers to that mutable object.
  • Some Collection or Map objects that are passed as arguments to setter methods may not even be mutable. If the class's implementation assumes that the object in a particular field is mutable, then obviously immutable instances are not compatible with that requirement, and when the class attempts to modify those objects, RuntimeExceptions may be raised (or in some strange case, perhaps worse, the attempted modifications would fail silently, leading to all kinds of unexpected behaviors, including data loss).

I suggest that when you have time for a careful audit, you consider replacing instances of this pattern:

private List<Foo> something;

public List<? extends Foo> getSomething() {
   return something;
}

public void setSomething(List<? extends Foo> newSomething) {
   this.something = newSomething;
}

with this pattern, or a similar pattern of your choosing:

private final List<Foo> something = new ArrayList<Foo>();

public List<? extends Foo> getSomething() {
   return new ArrayList<Foo>(something);
}

public void setSomething(List<? extends Foo> newSomething) {
   this.something.clear();
   if (newSomething != null) {
       this.something.addAll(newSomething);
   }
}

Obviously, depending upon what the internal invariants of the class in question are, and what the contract of the getter method is, such a change may not be appropriate at all. But I imagine that in most cases, it would be appropriate.

And also obviously, I would not advise making this kind of change without having test cases for the code/methods that it affects. I wouldn't want this suggestion to cause other code, which may currently rely upon getting direct access to various other classes' internals, to unexpectedly stop working as expected.

Clearly this is not a high priority item, but I thought that it was worth mentioning.

Discussion

1 2 > >> (Page 1 of 2)
  • Scott Wilson

    Scott Wilson - 2013-10-30

    Thanks for this Dave, I was also wondering about this when looking at ticket #89.

    I think having this pattern is a good marker for us to keep returning to as we evolve HtmlCleaner.

    As you note, there are parts of HtmlCleaner that do rely on access to internals, but I think its reasonable to adopt this pattern wherever possible when we provide a public interface.

     
  • Scott Wilson

    Scott Wilson - 2013-12-06
    • status: open --> open-accepted
    • Group: v 2.7 --> v 2.8
     
  • Scott Wilson

    Scott Wilson - 2013-12-06

    Moving to 2.8 so this stays visible.

     
  • Scott Wilson

    Scott Wilson - 2014-03-18
    • Group: v 2.8 --> v 2.9
     
  • Scott Wilson

    Scott Wilson - 2014-08-13
    • Group: v 2.9 --> v 2.10
     
  • Scott Wilson

    Scott Wilson - 2014-10-31
    • Group: v 2.10 --> v 2.11
     
  • Scott Wilson

    Scott Wilson - 2015-05-12
    • Group: v 2.11 --> v2.12
     
  • Scott Wilson

    Scott Wilson - 2015-05-15
    • Group: v2.12 --> v2.13
     
  • Scott Wilson

    Scott Wilson - 2015-07-01
    • Group: v2.13 --> v2.14
     
  • Scott Wilson

    Scott Wilson - 2015-08-24
    • Group: v2.14 --> v2.15
     
  • Scott Wilson

    Scott Wilson - 2015-10-01
    • Group: v2.15 --> v2.16
     
  • Scott Wilson

    Scott Wilson - 2015-11-20
    • Group: v2.16 --> v2.17
     
  • Scott Wilson

    Scott Wilson - 2016-10-19
    • Group: v2.17 --> v2.18
     
  • Scott Wilson

    Scott Wilson - 2017-02-06
    • Group: v2.18 --> v2.19
     
  • Scott Wilson

    Scott Wilson - 2017-02-07
    • Group: v2.19 --> v2.20
     
  • Scott Wilson

    Scott Wilson - 2017-05-02
    • Group: v2.20 --> v2.21
     
  • Scott Wilson

    Scott Wilson - 2017-05-11
    • Group: v2.21 --> v2.22
     
  • Scott Wilson

    Scott Wilson - 2018-04-24
    • Group: v2.22 --> v2.23
     
  • Scott Wilson

    Scott Wilson - 2019-09-04
    • Group: v2.23 --> v2.24
     
  • Scott Wilson

    Scott Wilson - 2020-04-29
    • Group: v2.24 --> v2.25
     
  • Scott Wilson

    Scott Wilson - 2021-09-24
    • Group: v2.25 --> v2.26
     
  • Scott Wilson

    Scott Wilson - 2023-04-29
    • Group: v2.26 --> v2.29
     
  • Scott Wilson

    Scott Wilson - 2023-06-19
    • Group: v2.29 --> v2.30
     
  • Dave

    Dave - 2023-06-19

    To modify my original suggestion, it's probably best to just return the object wrapped via an appropriate Collections method, e.g., Collections.unmodifiableList.

     
  • Scott Wilson

    Scott Wilson - 2023-06-20

    It's definitely something I keep coming back to, though usually there's something more pressing! If you wanted to submit a patch upgrading some of the more problematic classes that would be fantastic. I suspect most users will be directly interacting with TagNode the most, or possibly the serialisers, so those would probably be the first to apply updated patterns to.

     
1 2 > >> (Page 1 of 2)

Log in to post a comment.