From: Carsten N. <car...@gm...> - 2010-08-18 21:07:36
Attachments:
att.container.ref.count.diff
|
Hello Gerrit, we are having one more problem with our app. The setup is like this: editor: Aspect 0: multiple threads implementing the editor logic Aspect 1: one thread sending changes to our cave cave: Aspect 0: one thread rendering Aspect 1: one thread receiving changes The cave part crashes under certain (reproducible) circumstances in AttachmentContainer::resolveLinks on the unlinkParent call. It looks like the attachment is already gone. Looking through the ref counting for the attachments I've noticed that there is the isMTLocal() check and depending on that a recorded or unrecorded ref change is made. I don't think that is correct, since the attachment map essentially behaves like a field and we never record the ref count changes made by a field, so I believe these should all be unrecorded. Indeed the attached patch fixes our problem. What do you think? The isMTLocal() change was done in r1357, maybe the commit message "record attachment refcounts if container is a bundle to fix multi aspect mem leak" allows you to recall what the problem addressed by it was? Thanks & Cheers, Carsten |
From: Gerrit V. <vo...@vo...> - 2010-08-19 00:08:07
|
Hi, On Wed, 2010-08-18 at 16:07 -0500, Carsten Neumann wrote: > Hello Gerrit, > > we are having one more problem with our app. The setup is like this: > > editor: > Aspect 0: multiple threads implementing the editor logic > Aspect 1: one thread sending changes to our cave > > cave: > Aspect 0: one thread rendering > Aspect 1: one thread receiving changes > > The cave part crashes under certain (reproducible) circumstances in > AttachmentContainer::resolveLinks on the unlinkParent call. It looks > like the attachment is already gone. > Looking through the ref counting for the attachments I've noticed that > there is the isMTLocal() check and depending on that a recorded or > unrecorded ref change is made. I don't think that is correct, since the > attachment map essentially behaves like a field and we never record the > ref count changes made by a field, so I believe these should all be > unrecorded. > Indeed the attached patch fixes our problem. What do you think? > The isMTLocal() change was done in r1357, maybe the commit message > "record attachment refcounts if container is a bundle to fix multi > aspect mem leak" allows you to recall what the problem addressed by it was? short question, so I have the full picture, what was the reason behind r2472 because that one changed what was recorded and what was not ? kind regards gerrit |
From: Carsten N. <car...@gm...> - 2010-08-19 00:45:59
|
Hello Gerrit, On 18.08.2010 19:07, Gerrit Voß wrote: > On Wed, 2010-08-18 at 16:07 -0500, Carsten Neumann wrote: > short question, so I have the full picture, what was the reason behind > r2472 because that one changed what was recorded and what was not ? on the cave side in our app attachments were dying when RemoteAspect::receiveSync() cleared the newContainers vector at the end. The change in r2472 helped for that case and seemed to simply correct a mixup (it does not seem to make sense to use recorded ref count changes on the attachments if the AttachmentContainer itself is MTLocal). Having thought a bit more about this now I believe AttachmentContainer::_sfAttachments should behave like any other field, i.e. only make unrecorded changes - if that is not the correct behavior all other containers would have the same issue. Cheers, Carsten |
From: Gerrit V. <vo...@vo...> - 2010-08-19 12:26:37
|
Hi, On Wed, 2010-08-18 at 19:46 -0500, Carsten Neumann wrote: > Hello Gerrit, > > On 18.08.2010 19:07, Gerrit Voß wrote: > > On Wed, 2010-08-18 at 16:07 -0500, Carsten Neumann wrote: > > short question, so I have the full picture, what was the reason behind > > r2472 because that one changed what was recorded and what was not ? > > on the cave side in our app attachments were dying when > RemoteAspect::receiveSync() cleared the newContainers vector at the end. ok, what kind of attachments were these ? All or only a variant (MTLocal, Global) ? > The change in r2472 helped for that case and seemed to simply correct a > mixup (it does not seem to make sense to use recorded ref count changes > on the attachments if the AttachmentContainer itself is MTLocal). hmm, MTLocal does not mean ClusterLocal but I guess in most cases we actually use the combination. I'll try to track down why I changed this, my current guess would be named local containers, because it looks like it was for containers which themselves have multiple aspects but the carrying one does not. > Having thought a bit more about this now I believe > AttachmentContainer::_sfAttachments should behave like any other field, > i.e. only make unrecorded changes - if that is not the correct behavior > all other containers would have the same issue. Not unlikely but sfAttachments is the field where local and non local combinations will be most likely mixed, so finding problems there is far more likely. It is also something most other containers just derive, independent of their own global/local nature so I expect special cases in there. I'm not sure if pushing them up to every single pointer field is necessary so I'm not so worried about having special cases in there for the time being. My current plan would be to rebuild you aspect setup, somewhere in the changed/transmit pipleline something seems to get lost. Short question, is the server side one of the standard servers or a modified version ? kind regards gerrit |
From: Carsten N. <car...@gm...> - 2010-08-19 14:09:20
|
Hello Gerrit, Gerrit Voß wrote: > On Wed, 2010-08-18 at 19:46 -0500, Carsten Neumann wrote: >> On 18.08.2010 19:07, Gerrit Voß wrote: >>> On Wed, 2010-08-18 at 16:07 -0500, Carsten Neumann wrote: >>> short question, so I have the full picture, what was the reason behind >>> r2472 because that one changed what was recorded and what was not ? >> on the cave side in our app attachments were dying when >> RemoteAspect::receiveSync() cleared the newContainers vector at the end. > > ok, what kind of attachments were these ? All or only a variant > (MTLocal, Global) ? it was NameAttachments, they were all global. >> The change in r2472 helped for that case and seemed to simply correct a >> mixup (it does not seem to make sense to use recorded ref count changes >> on the attachments if the AttachmentContainer itself is MTLocal). > > hmm, MTLocal does not mean ClusterLocal but I guess in most cases we > actually use the combination. ok, but I thought MTLocal implies ClusterLocal, no? Otherwise it seems even more inconsistent, the container would exist locally in only one aspect, but still is sent over the network? > I'll try to track down why I changed this, > my current guess would be named local containers, because it looks like > it was for containers which themselves have multiple aspects but the > carrying one does not. > >> Having thought a bit more about this now I believe >> AttachmentContainer::_sfAttachments should behave like any other field, >> i.e. only make unrecorded changes - if that is not the correct behavior >> all other containers would have the same issue. > > Not unlikely but sfAttachments is the field where local and non local > combinations will be most likely mixed, so finding problems there is > far more likely. It is also something most other containers just derive, > independent of their own global/local nature so I expect special cases > in there. I'm not sure if pushing them up to every single pointer field > is necessary so I'm not so worried about having special cases in there > for the time being. hm, ok. > My current plan would be to rebuild you aspect setup, somewhere in the > changed/transmit pipleline something seems to get lost. Short question, > is the server side one of the standard servers or a modified version ? the server is in fact a VRJuggler program, that runs a separate thread to receive the CL and merges it to the juggler render thread in postFrame(). Let me back out r2472 and the AttCon change locally and give you more details on what happens. Cheers, Carsten |
From: Gerrit V. <vo...@vo...> - 2010-08-22 08:17:06
|
Hi, On Thu, 2010-08-19 at 11:10 -0500, Carsten Neumann wrote: > Hello Gerrit, > > small update: > > Carsten Neumann wrote: > > Gerrit Voß wrote: > >> My current plan would be to rebuild you aspect setup, somewhere in the > >> changed/transmit pipleline something seems to get lost. Short question, > >> is the server side one of the standard servers or a modified version ? > > > > the server is in fact a VRJuggler program, that runs a separate thread > > to receive the CL and merges it to the juggler render thread in postFrame(). > > Let me back out r2472 and the AttCon change locally and give you more > > details on what happens. > > hm, interesting, reverting both changes still allows our app to work, > but on the server side I see ok. > WARNING: Attachment::unlinkParent: Child <-> Parent link inconsistent. > > whenever we change the type of Manipulator used in the editor, e.g. when > going from translate to rotate. > The message is printed when the render thread (aspect 0) applies the > changes received by our receive thread (aspect 1): hmm, interesting. As it kind of works, I'll stick to the plan and finish to build a similar test environment and have a look. kind regards gerrit |
From: Carsten N. <car...@gm...> - 2011-01-28 23:35:36
|
Hello Gerrit, On 08/22/2010 03:16 AM, Gerrit Voß wrote: > On Thu, 2010-08-19 at 11:10 -0500, Carsten Neumann wrote: >>> the server is in fact a VRJuggler program, that runs a separate thread >>> to receive the CL and merges it to the juggler render thread in postFrame(). >>> Let me back out r2472 and the AttCon change locally and give you more >>> details on what happens. old thread, but his has come back to haunt me ;) I've just reverted r2472, as with the other fixes I made to ref counting in AttachmentContainer, ChangeList and RemoteAspect it actually did hurt. Thinking about it, since the usual case for an AttachmentContainer is isMTLocal() == false that case should act like all other pointer fields and therefore use unrecorded ref counting - why wasn't that obvious to me 6 month ago? ;) Cheers, Carsten |
From: Carsten N. <car...@gm...> - 2010-08-19 16:10:50
|
Hello Gerrit, small update: Carsten Neumann wrote: > Gerrit Voß wrote: >> My current plan would be to rebuild you aspect setup, somewhere in the >> changed/transmit pipleline something seems to get lost. Short question, >> is the server side one of the standard servers or a modified version ? > > the server is in fact a VRJuggler program, that runs a separate thread > to receive the CL and merges it to the juggler render thread in postFrame(). > Let me back out r2472 and the AttCon change locally and give you more > details on what happens. hm, interesting, reverting both changes still allows our app to work, but on the server side I see WARNING: Attachment::unlinkParent: Child <-> Parent link inconsistent. whenever we change the type of Manipulator used in the editor, e.g. when going from translate to rotate. The message is printed when the render thread (aspect 0) applies the changes received by our receive thread (aspect 1): LOG: Stack trace: thread: 8226 0:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::getCallStack()+0x57) [0x7f0a72d1eea5] 1:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::Attachment::unlinkParent(OSG::FieldContainer*, unsigned short)+0x2bb) [0x7f0a72d75537] 2:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::AttachmentContainer::resolveLinks()+0x6d) [0x7f0a72d865f1] 3:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::NodeCoreBase::resolveLinks()+0x15) [0x7f0a72db18f7] 4:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGSystem.so.2.0.0(OSG::GroupBase::resolveLinks()+0x15) [0x7f0a6eb8e765] 5:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGGroup.so.2.0.0(OSG::TransformBase::resolveLinks()+0x15) [0x7f0a6fd02ec1] 6:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGContribGUI.so.2.0.0(OSG::ManipulatorBase::resolveLinks()+0x15) [0x7f0a723bec43] 7:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGContribGUI.so.2.0.0(OSG::Manipulator::resolveLinks()+0x15) [0x7f0a723a7813] 8:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGContribGUI.so.2.0.0(OSG::RotateManipulatorBase::resolveLinks()+0x15) [0x7f0a723bd631] 9:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::FieldContainer::subReferenceUnrecorded()+0x44) [0x7f0a72d66fce] 10:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::ChangeList::commitDelayedSubRefs()+0x54) [0x7f0a72d80d24] 11:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(OSG::ChangeList::doApply(bool)+0x53b) [0x7f0a72d81be3] 12:/store/home/carsten/devel/projects/arl_odt_app.git/build.f10.test/lib/appViewer.so(viewer::opensg_render_server::sync()+0x38) [0x7f0a599685c8] 13:/store/home/carsten/devel/projects/arl_odt_app.git/build.f10.test/lib/appViewer.so(viewer::passive_app::postFrame()+0xa6) [0x7f0a59953536] 14:/opt/vrjuggler/vrjuggler-2.3/usr/lib64/x86_64/libvrj-2_3_18.so(vrj::Kernel::controlLoop()+0x370) [0x7f0a6c2027e0] 15:/opt/vrjuggler/vrjuggler-2.3/usr/lib64/x86_64/libvpr-2_1_15.so(vpr::ThreadPosix::startThread()+0xb3) [0x7f0a6d349403] 16:/store/home/carsten/devel/opensg2/install/OpenSGSandbox/usr/lib64/debug/libOSGBase.so.2.0.0(boost::function0<void>::operator()() const+0x72) [0x7f0a72dac912] 17:/opt/vrjuggler/vrjuggler-2.3/usr/lib64/x86_64/libvpr-2_1_15.so(vprThreadFunctorFunction+0x26) [0x7f0a6d2f9ce6] 18:/lib64/libpthread.so.0 [0x3fb02073da] 19:/lib64/libc.so.6(clone+0x6d) [0x3faf6e62bd] pParent (the RotateManipulator) has this in its AspectStore: RC : 1 AS 0 : 0x7f0a5cedd9d0 AS 1 : (nil) and the NameAttachment: RC : 2 AS 0 : 0x7f0a5cef44a0 AS 1 : 0x19850b0 Apparently the RotateManipulator has already been removed from the parents of the NameAttachment (_mfParents is empty), but has it as one of its children, strange. Cheers, Carsten |