Creating a pointer to a ProfileShiftFit and then calling compute() causes the ProfileShiftFit object to be erroniously deleted (i.e. memory freed), which causes a segfault when later used.
This is because during the compute() call, the this pointer is passed to create a Functor, which then creates a reference to the current object using the Reference::To framework, which almost immediately ends up with zero references, and hence is freed at the end of the call to compute().
Of course, one solution is to create a Reference::To the profileshiftfit object in the external code, but I'm not sure this should be nessicary? In any case, the bug really appears because the python interface generated by swig does not create "Reference::To" wrappers when creating a ProfileShiftFit object (I'm not sure it does this for any objects created with the python interface?), and so this causes a segfault in python code
This is a fairly recent issue - works as of 2024-03-01, seems broken by 2024-06-07. I'm not sure what was different in old versions. If I had to guess, it would be the changes around 0dd890 79722e where changes are made to the deletion of references in Reference::Able. I suspect that whatever bug was fixed was actually mistakenly allowing the references to persist, so I doubt that it's a matter of just reverting that change.
I think either the swig python code wrap new objects as a Reference::To, which I've no idea how to do; OR I think the solution is to avoid wrapping the this pointer in a Referece::To if it is not currently reference counted. I guess this has to be somehow in the Functor code? I'm not sure if there is a generic way to solve it?
Example C++ snippet to trigger the bug:
Pulsar::ProfileShiftFit *shiftFit = new Pulsar::ProfileShiftFit();
shiftFit->set_standard(std_profile);
shiftFit->set_Profile(archive_profile);
shiftFit->compute();
Estimate<double> shift = shiftFit->get_shift();
Example python snippet to trigger the bug:
shiftFit = psrchive.ProfileShiftFit()
shiftFit.set_standard(std_profile)
shiftFit.set_Profile(archive_profile)
shift = shiftFit.get_shift()
In both cases std_profile and archive_profile are Profile objects (well pointers to Pulsar::Profile objects for the C++ code) - I think any profile from any archive will do the trick.
Diff:
Surprisingly, the reference tracking feature already exists in the python interface, but it doesn't track the reference to all objects. Explicitly telling it to track ProfileShiftFit fixes the immediate bug, though likely other objects also need to be explicitly tracked.
Fixed in 209acda05
Thanks Mike. It has been quite a long time since I looked at this stuff, but yes the idea of the "pointer_tracker" stuff in psrchive.i is to automatically create Reference::To instances for python variables when relevant. I think this should happen automatically for any class that inherits Reference::Able (including ProfileShiftFit), and we should not have to explicitly name all such classes (it's basically all of psrchive). This at least used to work, I guess we should look closely at those commits you mentioned and think about how it interacts with this scheme.
Right - I agree that it should work, but I checked the swig generated
wrap_psrchive.cxx file and it really just never made the call. Checking a
few other calls to constructors it also seemed missing. It does appear in
the "factory methods" like load_Archive() etc.
I don't think it's a change in swig though as I'm pretty sure it failed
with one built using an old swig version.
Anyway at least our immediate issue is solved (i.e. we can run our
pipelines again) but for sure it would be good to solve it "properly".
Mike
On Wed, 12 Feb 2025, 15:30 Paul Demorest, demorest@users.sourceforge.net
wrote:
Related
Bugs:
#495