The short program below causes a segmentation violation. If you un-comment the first pmol->EndModify(), the program doesn't crash. The crash occurs on kekulize.cpp line 164, which is being called from the SMARTS matching because the particular SMARTS wants to know the number of bonds ("[v1]") to each atom. The problem seems to be that the LSSR being used isn't valid because of the deleted atom. Calling EndModify() before applying the SMARTS apparently resets something so that the segfault doesn't occur.
~~~~~~~~~~~~~~~~~~~
using namespace std;
using namespace OpenBabel;
int main(int argc, char **argv) {
OBMol *pmol = new OBMol;
OBConversion conv(&cin, &cout);
conv.SetInAndOutFormats("smi", "smi");
conv.ReadString(pmol, "Cl.OC1CN2CCC1CC2");
OBAtom *atom = pmol->GetAtom(1); // Cl atom
pmol->BeginModify();
pmol->DeleteAtom(atom);
OBSmartsPattern smarts;
smarts.Init("[v1]");
// pmol->EndModify(); // uncomment this and the segfault disappears
smarts.Match(*pmol);
pmol->EndModify();
conv.Write(pmol);
}
~~~~~~~~~~~~~~~~~~~~~~
The code as presented works for me under MSVC. What Open Babel codebase are you running against?
It's a stock OB 2.3.2 running on Ubuntu.
If you're running on a different operating system, it's possible that the
invalid pointers just happen to not cause a segfault. Maybe something like
valgrind?
Thanks,
Craig
On Fri, May 10, 2013 at 10:30 AM, Noel O'Boyle baoilleach@users.sf.netwrote:
Related
Bugs: #873
Actually, even without reproducing the segfault, I think I can see the problem. And I don't think it's with OB.
You shouldn't put anything inside a Begin/EndModify block that relies on perceived data. The whole point of the block is to postpone the reperception until the final EndModify. Within the block, if the perception code becomes invalid, it is not updated and so you get segfaults if you use it.
The specific cause here is either:
(a) the fact that DeleteAtom marks the LSSR and SSSR as needed to be reperceived. This would be reperceived automatically next time the user asks for the SSSR but here you do this within a Modify block, and so the recalculation of the SSSR is suppressed.
or
(b) DeleteAtom contains its Begin/EndModify block, indicating that various perceived information has been invalidated. Since you access the information before the outer EndModify is reached, the vengeful Furies descend.
Typo: "contains its own Begin/EndModify block"
I partially agree. I see your point about not relying on perceived data once you've made a modification. (Although even there ... what, exactly, invalidates perceived data? It's not clear at all.)
On the other hand, getting a segfault for code that looks perfectly reasonable is always bad. Was it reasonable for me to guess that doing a SMARTS match for "[Fe]" was going to use the (invalidated) LSSR to recompute the aromaticity of every atom? It cost me a couple days to track this down. (The real code was much more complex than the example posted above ... the atom.Delete() was far, far removed in both time and space from the pattern.Match() call.)
More documentation is needed I guess:
What invalidates an LSSR: DeleteAtom, ..., ...
What does DeleteAtom invalidate: LSSR, ...
This shouldn't be that difficult and it would help improve the code too probably.
Documenting what relies on perceived data is not so easy. To my mind, pretty much any function that does something useful perceives almost everything. The SMARTS matcher probably shouldn't need to do that, it's true - Roger likes to talk about this topic quite a bit :-).