I've found what I think is a flaw in the deep copy process. Can someone please confirm that this is the way it works?
I have a moderately complex data structure that contains several sets. When I deep copy this structure, some set elements get lost in the copy.
I traced through the code as it seems like the following is happening...
Items that get added to the cloneMap are potentially only partially filled in at the time they get added. So some fields can still be null. When a collection is processed, the add() function is called to add an item to a set. This add() function will call equals() on the current objects in the set to see if the new item needs to be added or is already there. I believe this will cause problems when you have a custom equals() implementation. The problem is, several of the items in the set are only partially filled in, so calling the equals() function is not valid at that time. For example, my equals() function for object calls equals() on one of its components, object B. But object B may not exist yet in the cloneMap entry, or maybe some of its fields haven't been filled in and are still null. So the equals() cannot work properly.
I've tried this on 3.2.10 and the latest 3.3.0 beta 8. Same problem.
Thanks in advance for your comments.
Rob
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thanks for your response. I'll see if I can try to reproduce it in such a way that I don't give away any work secrets...
The pattern of data in the area that is affected is as follows:
public class F {
private Set<A> fa;
...
}
public class A {
private C c;
...
}
public class C {
private Set<A> ca = new HashSet<A>();
...
}
The problem occurs when I copy an F. The original F contained a Set 'fa' with 10 elements. After the copy, the resulting F has a Set 'fa' with only 7 elements. I have traced the following code inside replicateCollection().
I can see that, when some members get added to the set, the size of the set does not increase by 1. I looked at the internals of the A objects that are being added to the set, and they have some fields as null, including some of the fields I am using as my "business key". So the set 'add' operation is not working properly.
I tried an experiment and removed the 'ca' field of C. This field is not strictly necessary. (In our app we only really traverse the links in the direction F -> A -> C, so we regard the Set of A's inside C as 'backward' links. However, the deep copy doesn't really know about 'forward' and 'backward' links - it processes links as it encounters them.) Removing these links fixed the problem.
Rob
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
>I looked at the internals of the A objects that are
>being added to the set, and they have some fields as null, including some of
>the fields I am using as my "business key".
What are the types of those fields in A that are found null but
shouldn't when an A is added to the set ?
An A object is fully replicated before being added to the set. So if
some fields are found null but shouldn't, the root cause would be
related to why those fields are set to null after the replication in
the first place.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Mirroring your data structure, I ran the following using junit 4.1 and the replication seemed just fine.
////////////// A.java /////////////
public class A {
private C c;
private String name;
public A(String name) { this.name = name; }
public A() {}
public C getC() {return c;}
public void setC(C c) {this.c = c;}
public String getName() {return name;}
public void setName(String name) {this.name = name;}
@Override
public String toString() {return name;}
}
public class C {
private Set<A> ca = new HashSet<A>();
private String name;
public C() {}
public C(String name) {this.name = name;}
public Set<A> getCa() {return ca;}
public void setCa(Set<A> ca) {this.ca = ca;}
public String getName() {return name;}
public void setName(String name) {this.name = name;}
}
public class F {
private Set<A> fa;
private String name;
public F() {}
public F(String name) {this.name = name;}
public Set<A> getFa() {return fa;}
public void setFa(Set<A> fa) {this.fa = fa;}
public String getName() {return name;}
public void setName(String name) {this.name = name;}
}
>An A object is fully replicated before being added to the set.
This looks like the crux of it. This is definitely not happening. I've examined the set just before something gets added to it, and several of the members have null fields that haven't been filled in.
It seems to me that items can be added to the cloneMap before they are fully populated: is this true, because that it certainly what I am seeing?
So, in my data structure I must have a pattern of associations that causes something to be found in the cloneMap and added to the set before it's been fully populated. I only showed a fragment of our data structures. In reality, the classes have many more fields, and it's the case that, during processing of an object, replicating one of its fields ultimately causes that object to be visited again, so there are indirect circular references.
Thanks for looking at this, anyway, and maybe we can get to the bottom of it.
As I said, at the moment, I have a workaround that involves removing one of the backward links (and hence removing some of the circularity?). If I get the time, I'll try to product a complete code sample that you can run to see the problem.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
>several of the members have null fields that haven't been filled in.
For the fields that are null, do they have a pair of public getter/setter methods ? If not, they won't get populated. (Remember the Java Bean convention ?)
>It seems to me that items can be added to the cloneMap before they are fully populated: is this true
This is true, but it's irrelevant. cloneMap is used to record the identities of the objects that are being cloned. It doesn't matter what the objects contain or whether they are fully populated.
The best way to demonstrate/prove the issue is to provide a little/minimal test case that exhibits the problematic behavior.
A problem is usually half solved if it can be repeated.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
To explain further, there is a difference between adding to the cloneMap vs adding to the Set member fields defined by your classes.
cloneMap is internally used by Beanlib, and it doesn't matter whether the objects added to it are fully populated beforehand. Indeed, in most cases they won't be, but that doesn't matter.
Objects being added to the Set or other Collection member fields defined by your classes, however, are always fully populated (with a best attempt approach) beforehand. So null member fields result only if the best attempt failed (such as missing getter/setter).
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I've checked all the getters and setters, and they are all there (and public).
The reason I'm focusing on clonedMap is because an item in clonedMap is picked up and added to the faulty result set, and that item is only partially filled in (in fact, in the debug session I just ran, *none* of its fields were filled in).
Here's my data structure:
class ProductClassSchemeService {
private ProductClass productClass;
private ServiceScheme serviceScheme;
// some other fields
// public getters and setters
}
class ServiceScheme {
private Set<ProductClassSchemeService> productClassSchemeServices = new HashSet<ProductClassSchemeService>();
// some other fields
// public getters and setters
}
My equals() method for ProductClassSchemeService uses both the "productClass" and "serviceScheme" properties.
During my clone operation it gets to a ServiceScheme (ServiceScheme is quite deep in the object hierarchy that I'm cloning), and I reach the beanlib method processSetterMethod() for property "productClassSchemeServices". My source object has 10 ProductClassSchemeService objects in this Set.
I let it run and it gets to replicateCollection().
// recursively populate member objects.
for (Object fromMember : fromCollection) {
Object toMember = replicate(fromMember);
toCollection.add(toMember);
}
I trace through the processing of the first Set element (which is a ProductClassSchemeService object).
Let's focus on the line
Object toMember = replicate(fromMember);
If I examine 'fromMember' before executing this line I can see that its 'productClass' and 'serviceScheme' members are populated - they are not null.
I step through and trace the execution of 'replicate'.
What happens now is that it gets to the function
private <T> T replicate(Object from, Class<T> toClass)
and it gets to the line
T to = (T)clonedMap.get(from);
At this point, it is finding the item in clonedMap, so 'to' gets set to a ProductClassSchemeService object from clonedMap. But the item it finds in clonedMap has its 'productClass' and 'serviceScheme' members set to null instead of being populated. If an item is found in clonedMap, as in this case, the code returns straight away back to replicateCollection() and adds that item to the 'toCollection' set.
So this illustrates the problem. When replicating a source Set member it finds that member is already in clonedMap, and adds it to the destination Set. But that item has not yet had its fields populated, so the item that gets added to the destination Set is invalid (at that point). When it comes to add the second member of the Set, and performs the 'add' function, the first member is still invalid (still has null fields), so 'add' surely cannot work properly in this case. Maybe you could say that by the time that the whole replication has been done, the item will have had all its fields populated. But by that time it's too late - the fields of the first member need to be fully populated before the second member gets added to the set.
So maybe I can generalise the problem to say:
"If, when replicating a member of a set, it finds that that member is already in clonedMap (it presumably got into clonedMap because there was a reference to it elsewhere in the object graph), it will just add that value to the set straight from clonedMap. But at that point, the object in clonedMap might not have been populated, so the set element is invalid, and this can cause problems when another member is added to the set, because equals() might not work properly."
Having just done the debugging session and written this down I actually understand more about the problem myself, so I'll try to replicate it using a small amount of test code.
Rob
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I managed to simulate the problem, and may have a solution for your particular case. The solution is to plug in a different setter method collector, such that the collection/map member properties will be populated last.
Thanks for your help on this. I'm glad you managed to produce a small test to illustrate the problem.
I've just got round to testing your fix, and I've hit a snag with it. It doesn't seem to work in a Hibernate environment. At the moment, my code is just
Hibernate3BeanReplicator replicator = new Hibernate3BeanReplicator();
T result = replicator.deepCopy(from);
I've tried replacing this with your fix, but I get a Hibernate lazy initialisation error due to all the fields of 'from' not yet being loaded.
How do I get it to work with Hibernate?
Thanks,
Rob
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Sorry Hanson, ignore that last message. I was being a bit daft, and not realising that I could call initSetterMethodCollector() on my replicator object.
Rob
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I'm running it on 3.3.0beta9. The pattern of access to the methods has now changed, but the same issue is occurring, and for the same reason (it's picking up partially filled in items from clonedMap and adding them to the destination set).
I added a check to the end of replicateCollection() to see if the 'from' set size is the same as the 'to' set size, and there are several cases where they differ.
Rob
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I think I can understand why you may still have the problem, when the level of copying is nested with multiple level of object member fields. The solution would be not just changing the ordering of setter/copying to just non-collection members first and then collection members, but in the order of
1) primitive member fields
2) non-collection member fields
3) collection member fields
You can try constructing such setter collector to see if that works. Otherwise, I will quickly build one over the weekend.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Also, (as explained in this forum earlier) if you are using the initSetterMethodCollector on Hibernate3BeanReplicator, you cannot use deepCopy which would override what you set with the defaults. Use the "copy" method instead which is a lower level API that allows fine grained control over the copying process. You can still achieve the deep copy effect with a custom setter method collector. Have a peek inside the deepCopy implementation to see how the other parameters are set.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Sample usage (deep copy) with Hibernate3BeanReplicator:
HibernateBeanReplicator replicator = new Hibernate3BeanReplicator()
.initSetterMethodCollector(new OrderedMethodCollector());
Object from = ...
Object to = replicator.copy(from);
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I've found what I think is a flaw in the deep copy process. Can someone please confirm that this is the way it works?
I have a moderately complex data structure that contains several sets. When I deep copy this structure, some set elements get lost in the copy.
I traced through the code as it seems like the following is happening...
Items that get added to the cloneMap are potentially only partially filled in at the time they get added. So some fields can still be null. When a collection is processed, the add() function is called to add an item to a set. This add() function will call equals() on the current objects in the set to see if the new item needs to be added or is already there. I believe this will cause problems when you have a custom equals() implementation. The problem is, several of the items in the set are only partially filled in, so calling the equals() function is not valid at that time. For example, my equals() function for object calls equals() on one of its components, object B. But object B may not exist yet in the cloneMap entry, or maybe some of its fields haven't been filled in and are still null. So the equals() cannot work properly.
I've tried this on 3.2.10 and the latest 3.3.0 beta 8. Same problem.
Thanks in advance for your comments.
Rob
Any chance you can provide a test case to demonstrate the problem ?
Note that "cloneMap" is an identity map (ie the object comparision is performed on the object identity, not affected by the equal method.)
Thanks for your response. I'll see if I can try to reproduce it in such a way that I don't give away any work secrets...
The pattern of data in the area that is affected is as follows:
public class F {
private Set<A> fa;
...
}
public class A {
private C c;
...
}
public class C {
private Set<A> ca = new HashSet<A>();
...
}
The problem occurs when I copy an F. The original F contained a Set 'fa' with 10 elements. After the copy, the resulting F has a Set 'fa' with only 7 elements. I have traced the following code inside replicateCollection().
for (Object fromMember : fromCollection) {
Object toMember = replicate(fromMember);
toCollection.add(toMember);
}
I can see that, when some members get added to the set, the size of the set does not increase by 1. I looked at the internals of the A objects that are being added to the set, and they have some fields as null, including some of the fields I am using as my "business key". So the set 'add' operation is not working properly.
I tried an experiment and removed the 'ca' field of C. This field is not strictly necessary. (In our app we only really traverse the links in the direction F -> A -> C, so we regard the Set of A's inside C as 'backward' links. However, the deep copy doesn't really know about 'forward' and 'backward' links - it processes links as it encounters them.) Removing these links fixed the problem.
Rob
>I looked at the internals of the A objects that are
>being added to the set, and they have some fields as null, including some of
>the fields I am using as my "business key".
What are the types of those fields in A that are found null but
shouldn't when an A is added to the set ?
An A object is fully replicated before being added to the set. So if
some fields are found null but shouldn't, the root cause would be
related to why those fields are set to null after the replication in
the first place.
Mirroring your data structure, I ran the following using junit 4.1 and the replication seemed just fine.
////////////// A.java /////////////
public class A {
private C c;
private String name;
public A(String name) { this.name = name; }
public A() {}
public C getC() {return c;}
public void setC(C c) {this.c = c;}
public String getName() {return name;}
public void setName(String name) {this.name = name;}
@Override
public String toString() {return name;}
}
////////////// C.java /////////////
import java.util.HashSet;
import java.util.Set;
public class C {
private Set<A> ca = new HashSet<A>();
private String name;
public C() {}
public C(String name) {this.name = name;}
public Set<A> getCa() {return ca;}
public void setCa(Set<A> ca) {this.ca = ca;}
public String getName() {return name;}
public void setName(String name) {this.name = name;}
}
////////////// F.java /////////////
import java.util.Set;
public class F {
private Set<A> fa;
private String name;
public F() {}
public F(String name) {this.name = name;}
public Set<A> getFa() {return fa;}
public void setFa(Set<A> fa) {this.fa = fa;}
public String getName() {return name;}
public void setName(String name) {this.name = name;}
}
////////////// FCopyTest.java /////////////
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNotSame;
import static org.junit.Assert.assertSame;
import static org.junit.Assert.assertTrue;
import java.util.HashSet;
import java.util.Set;
import junit.framework.JUnit4TestAdapter;
import net.sf.beanlib.provider.replicator.BeanReplicator;
import org.junit.Test;
public class FCopyTest {
@Test
public void test() {
Set<A> aSet = new HashSet<A>();
F f1 = new F("f1");
C c1 = new C("c1");
for (int i=0; i < 10; i++) {
A a = new A("a" + i);
a.setC(c1);
aSet.add(a);
}
f1.setFa(aSet);
c1.setCa(aSet);
F f2 = new BeanReplicator().replicateBean(f1);
assertNotNull(f2);
assertNotSame(f1, f2);
assertEquals(f1.getName(), f2.getName());
assertTrue(f2.getFa().size() == 10);
C c2 = null;
for (A a: f2.getFa()) {
if (c2 == null) {
c2 = a.getC();
assertNotSame(c1, c2);
}
else
assertSame(c2, a.getC());
}
System.out.println(f1.getFa());
System.out.println(f2.getFa());
}
public static junit.framework.Test suite() {
return new JUnit4TestAdapter(FCopyTest.class);
}
}
////////// System.out ////////////////
[a4, a3, a9, a1, a8, a5, a0, a6, a2, a7]
[a0, a3, a9, a6, a5, a8, a1, a7, a2, a4]
>An A object is fully replicated before being added to the set.
This looks like the crux of it. This is definitely not happening. I've examined the set just before something gets added to it, and several of the members have null fields that haven't been filled in.
It seems to me that items can be added to the cloneMap before they are fully populated: is this true, because that it certainly what I am seeing?
So, in my data structure I must have a pattern of associations that causes something to be found in the cloneMap and added to the set before it's been fully populated. I only showed a fragment of our data structures. In reality, the classes have many more fields, and it's the case that, during processing of an object, replicating one of its fields ultimately causes that object to be visited again, so there are indirect circular references.
Thanks for looking at this, anyway, and maybe we can get to the bottom of it.
As I said, at the moment, I have a workaround that involves removing one of the backward links (and hence removing some of the circularity?). If I get the time, I'll try to product a complete code sample that you can run to see the problem.
>several of the members have null fields that haven't been filled in.
For the fields that are null, do they have a pair of public getter/setter methods ? If not, they won't get populated. (Remember the Java Bean convention ?)
>It seems to me that items can be added to the cloneMap before they are fully populated: is this true
This is true, but it's irrelevant. cloneMap is used to record the identities of the objects that are being cloned. It doesn't matter what the objects contain or whether they are fully populated.
The best way to demonstrate/prove the issue is to provide a little/minimal test case that exhibits the problematic behavior.
A problem is usually half solved if it can be repeated.
To explain further, there is a difference between adding to the cloneMap vs adding to the Set member fields defined by your classes.
cloneMap is internally used by Beanlib, and it doesn't matter whether the objects added to it are fully populated beforehand. Indeed, in most cases they won't be, but that doesn't matter.
Objects being added to the Set or other Collection member fields defined by your classes, however, are always fully populated (with a best attempt approach) beforehand. So null member fields result only if the best attempt failed (such as missing getter/setter).
I've checked all the getters and setters, and they are all there (and public).
The reason I'm focusing on clonedMap is because an item in clonedMap is picked up and added to the faulty result set, and that item is only partially filled in (in fact, in the debug session I just ran, *none* of its fields were filled in).
Here's my data structure:
class ProductClassSchemeService {
private ProductClass productClass;
private ServiceScheme serviceScheme;
// some other fields
// public getters and setters
}
class ServiceScheme {
private Set<ProductClassSchemeService> productClassSchemeServices = new HashSet<ProductClassSchemeService>();
// some other fields
// public getters and setters
}
My equals() method for ProductClassSchemeService uses both the "productClass" and "serviceScheme" properties.
During my clone operation it gets to a ServiceScheme (ServiceScheme is quite deep in the object hierarchy that I'm cloning), and I reach the beanlib method processSetterMethod() for property "productClassSchemeServices". My source object has 10 ProductClassSchemeService objects in this Set.
I let it run and it gets to replicateCollection().
// recursively populate member objects.
for (Object fromMember : fromCollection) {
Object toMember = replicate(fromMember);
toCollection.add(toMember);
}
I trace through the processing of the first Set element (which is a ProductClassSchemeService object).
Let's focus on the line
Object toMember = replicate(fromMember);
If I examine 'fromMember' before executing this line I can see that its 'productClass' and 'serviceScheme' members are populated - they are not null.
I step through and trace the execution of 'replicate'.
What happens now is that it gets to the function
private <T> T replicate(Object from, Class<T> toClass)
and it gets to the line
T to = (T)clonedMap.get(from);
At this point, it is finding the item in clonedMap, so 'to' gets set to a ProductClassSchemeService object from clonedMap. But the item it finds in clonedMap has its 'productClass' and 'serviceScheme' members set to null instead of being populated. If an item is found in clonedMap, as in this case, the code returns straight away back to replicateCollection() and adds that item to the 'toCollection' set.
So this illustrates the problem. When replicating a source Set member it finds that member is already in clonedMap, and adds it to the destination Set. But that item has not yet had its fields populated, so the item that gets added to the destination Set is invalid (at that point). When it comes to add the second member of the Set, and performs the 'add' function, the first member is still invalid (still has null fields), so 'add' surely cannot work properly in this case. Maybe you could say that by the time that the whole replication has been done, the item will have had all its fields populated. But by that time it's too late - the fields of the first member need to be fully populated before the second member gets added to the set.
So maybe I can generalise the problem to say:
"If, when replicating a member of a set, it finds that that member is already in clonedMap (it presumably got into clonedMap because there was a reference to it elsewhere in the object graph), it will just add that value to the set straight from clonedMap. But at that point, the object in clonedMap might not have been populated, so the set element is invalid, and this can cause problems when another member is added to the set, because equals() might not work properly."
Having just done the debugging session and written this down I actually understand more about the problem myself, so I'll try to replicate it using a small amount of test code.
Rob
Hi Rob,
I managed to simulate the problem, and may have a solution for your particular case. The solution is to plug in a different setter method collector, such that the collection/map member properties will be populated last.
Can you try to use PublicSetterCollectionLastMethodCollector to see if it works ? Code available from
http://beanlib.svn.sourceforge.net/viewvc/beanlib/trunk/beanlib/src/net/sf/beanlib/provider/collector/PublicSetterCollectionLastMethodCollector.java?revision=215&view=markup
Sample usage:
A fromBean = ...
BeanTransformerSpi beanTransformer = BeanTransformer.newBeanTransformer();
beanTransformer.initSetterMethodCollector(new PublicSetterCollectionLastMethodCollector());
A toBean = BeanReplicator.newBeanReplicatable(beanTransformer)
.replicateBean(fromBean, fromBean.getClass());
Or see:
http://beanlib.svn.sourceforge.net/viewvc/beanlib/trunk/beanlib-test/src/net/sf/beanlib/provider/collector/PublicSetterCollectionLastMethodCollectorTest.java?view=markup
Hello Hanson,
Thanks for your help on this. I'm glad you managed to produce a small test to illustrate the problem.
I've just got round to testing your fix, and I've hit a snag with it. It doesn't seem to work in a Hibernate environment. At the moment, my code is just
Hibernate3BeanReplicator replicator = new Hibernate3BeanReplicator();
T result = replicator.deepCopy(from);
I've tried replacing this with your fix, but I get a Hibernate lazy initialisation error due to all the fields of 'from' not yet being loaded.
How do I get it to work with Hibernate?
Thanks,
Rob
Sorry Hanson, ignore that last message. I was being a bit daft, and not realising that I could call initSetterMethodCollector() on my replicator object.
Rob
Hello Hanson,
Unfortunately I'm still seeing the same problem.
I'm running it on 3.3.0beta9. The pattern of access to the methods has now changed, but the same issue is occurring, and for the same reason (it's picking up partially filled in items from clonedMap and adding them to the destination set).
I added a check to the end of replicateCollection() to see if the 'from' set size is the same as the 'to' set size, and there are several cases where they differ.
Rob
I think I can understand why you may still have the problem, when the level of copying is nested with multiple level of object member fields. The solution would be not just changing the ordering of setter/copying to just non-collection members first and then collection members, but in the order of
1) primitive member fields
2) non-collection member fields
3) collection member fields
You can try constructing such setter collector to see if that works. Otherwise, I will quickly build one over the weekend.
Also, (as explained in this forum earlier) if you are using the initSetterMethodCollector on Hibernate3BeanReplicator, you cannot use deepCopy which would override what you set with the defaults. Use the "copy" method instead which is a lower level API that allows fine grained control over the copying process. You can still achieve the deep copy effect with a custom setter method collector. Have a peek inside the deepCopy implementation to see how the other parameters are set.
Hi Rob,
Can you try the OrderedMethodCollector to see if it works:
http://beanlib.svn.sourceforge.net/viewvc/beanlib/trunk/beanlib/src/net/sf/beanlib/provider/collector/OrderedMethodCollector.java?view=markup
Sample usage (deep copy) with Hibernate3BeanReplicator:
HibernateBeanReplicator replicator = new Hibernate3BeanReplicator()
.initSetterMethodCollector(new OrderedMethodCollector());
Object from = ...
Object to = replicator.copy(from);