Thread: [Rdkit-discuss] Strange core dump with Morgan fingerprints with Java
From: Tim D. <tdu...@gm...> - 2021-01-12 12:56:04
I'm struggling to work out a strange core dump I'm getting when calculating Morgan fingerprints from Java. This seems to happen with the Release_2020_09 releases but not with the Release_2019_09 ones. It does not happen when calculating RDKit fingerprints. The exact Java code involved is:

RDKFuncs.MorganFingerprintMol(mol, 2);

More precisely, this happens when running inside a Docker container which is running the code as a Tomcat webapp, but a simple test of running that same function inside the container directly from Java (i.e. not when running in Tomcat) works OK and does not core dump.
Building an otherwise identical container with the Release_2019_09 code does not core dump from Tomcat.

The core dump looks like this:

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ff9edc00518, pid=1, tid=111
#
# JRE version: OpenJDK Runtime Environment (11.0.9.1+1) (build 11.0.9.1+1-post-Debian-1deb10u2)
# Java VM: OpenJDK 64-Bit Server VM (11.0.9.1+1-post-Debian-1deb10u2, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# [thread 145 also had an error]
[thread 149 also had an error]
[thread 113 also had an error]
[thread 117 also had an error]
C [libGraphMolWrap.so+0xa20518] void RDKit::MorganFingerprints::calcFingerprint<RDKit::SparseIntVect<unsigned int> >(RDKit::ROMol const&, unsigned int, std::vector<unsigned int, std::allocator<unsigned int> >*, std::vector<unsigned int, std::allocator<unsigned int> > const*, bool, bool, bool, bool, std::map<unsigned int, std::vector<std::pair<unsigned int, unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> > >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, std::vector<std::pair<unsigned int, unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> > > > > >*, bool, RDKit::SparseIntVect<unsigned int>&)+0x148

It's difficult to know what's wrong, but I thought it might be worth asking whether anything in the Morgan fingerprint code has changed over that timeframe.
It might be related to threading, as the fingerprint generation is being done inside Java streams.

Tim
From: Francois B. <ml...@li...> - 2021-01-14 08:44:54
Hello,

Please tell me if you understand why the code below is not working and if you know how to change it so that it does.

Thanks a lot! :)
F.

---
#!/usr/bin/env python3

# try to construct a molecule with a Z stereo double bond using RWMol

import rdkit
from rdkit import Chem

wanted_smi = 'C/N=C\\S'

rwmol = Chem.RWMol()
# create the atoms
a0 = Chem.Atom(6)
a1 = Chem.Atom(7)
a2 = Chem.Atom(6)
a3 = Chem.Atom(16)
# add the atoms
rwmol.AddAtom(a0)
rwmol.AddAtom(a1)
rwmol.AddAtom(a2)
rwmol.AddAtom(a3)
# add the bonds
rwmol.AddBond(0, 1, rdkit.Chem.rdchem.BondType.SINGLE)
rwmol.AddBond(1, 2, rdkit.Chem.rdchem.BondType.DOUBLE)
rwmol.AddBond(2, 3, rdkit.Chem.rdchem.BondType.SINGLE)
# let's see what we have so far
print(Chem.MolToSmiles(rwmol)) # --> 'CN=CS'; so far so good
# try to specify a Z stereo bond
db = rwmol.GetBondWithIdx(1)
assert(db.GetBondType() == rdkit.Chem.rdchem.BondType.DOUBLE) # just checking
db.SetStereo(rdkit.Chem.rdchem.BondStereo.STEREOZ)
db.SetStereoAtoms(0, 3)
# let's see what we have now
print(Chem.MolToSmiles(rwmol)) # --> 'CN=CS'; not good enough
Chem.SanitizeMol(rwmol) # just checking
print(Chem.MolToSmiles(rwmol)) # --> 'CN=CS'; not getting better
---
From: Greg L. <gre...@gm...> - 2021-01-14 09:37:23
Hi Francois,

I would do this by setting the stereo to either STEREOCIS or STEREOTRANS and then calling Chem.AssignStereochemistry():

In [6]: rwmol = Chem.RWMol()
   ...: # create the atoms
   ...: a0 = Chem.Atom(6)
   ...: a1 = Chem.Atom(7)
   ...: a2 = Chem.Atom(6)
   ...: a3 = Chem.Atom(16)
   ...: # add the atoms
   ...: rwmol.AddAtom(a0)
   ...: rwmol.AddAtom(a1)
   ...: rwmol.AddAtom(a2)
   ...: rwmol.AddAtom(a3)
   ...: # add the bonds
   ...: rwmol.AddBond(0, 1, rdkit.Chem.rdchem.BondType.SINGLE)
   ...: rwmol.AddBond(1, 2, rdkit.Chem.rdchem.BondType.DOUBLE)
   ...: rwmol.AddBond(2, 3, rdkit.Chem.rdchem.BondType.SINGLE)
Out[6]: 3

In [7]: db = rwmol.GetBondWithIdx(1)

In [8]: db.SetStereoAtoms(0,3)

In [9]: db.SetStereo(Chem.BondStereo.STEREOCIS)

In [10]: Chem.MolToSmiles(rwmol)
Out[10]: 'CN=CS'

In [11]: Chem.AssignStereochemistry(rwmol)

In [12]: Chem.MolToSmiles(rwmol)
Out[12]: 'C/N=C\\S'
From: Francois B. <ml...@li...> - 2021-01-15 01:48:42
Hi Greg,

Thanks a lot for the working example!

Indeed, the 'Chem.AssignStereochemistry(rwmol)' call missing from my code was the key.
I did not know about this function.

Regards,
F.
From: Francois B. <ml...@li...> - 2021-01-15 02:49:10
On 14/01/2021 18:36, Greg Landrum wrote:
> Hi Francois,
>
> I would do this by setting the stereo to either STEREOCIS or
> STEREOTRANS and then calling Chem.AssignStereochemistry():
>
> In [6]: rwmol = Chem.RWMol()
>    ...: # create the atoms
>    ...: a0 = Chem.Atom(6)
>    ...: a1 = Chem.Atom(7)
>    ...: a2 = Chem.Atom(6)
>    ...: a3 = Chem.Atom(16)
>    ...: # add the atoms
>    ...: rwmol.AddAtom(a0)
>    ...: rwmol.AddAtom(a1)
>    ...: rwmol.AddAtom(a2)
>    ...: rwmol.AddAtom(a3)
>    ...: # add the bonds
>    ...: rwmol.AddBond(0, 1, rdkit.Chem.rdchem.BondType.SINGLE)
>    ...: rwmol.AddBond(1, 2, rdkit.Chem.rdchem.BondType.DOUBLE)
>    ...: rwmol.AddBond(2, 3, rdkit.Chem.rdchem.BondType.SINGLE)
> Out[6]: 3
>
> In [7]: db = rwmol.GetBondWithIdx(1)
>
> In [8]: db.SetStereoAtoms(0,3)
>
> In [9]: db.SetStereo(Chem.BondStereo.STEREOCIS)
>
> In [10]: Chem.MolToSmiles(rwmol)
> Out[10]: 'CN=CS'
>
> In [11]: Chem.AssignStereochemistry(rwmol)
>
> In [12]: Chem.MolToSmiles(rwmol)
> Out[12]: 'C/N=C\\S'

Here is the fun part:

Chem.SanitizeMol(rwmol)
print(Chem.MolToSmiles(rwmol)) # --> CN=CS

"Sanitization" of the rwmol got rid of the stereo info that we just inserted.

Is this a "feature" of SanitizeMol?

I was being a good kid: I thought that one must always sanitize an RWMol before extracting the final molecule (in the end I want a SMILES).

Regards,
F.
From: Greg L. <gre...@gm...> - 2021-01-15 07:22:28
Calling AssignStereochemistry() after you have called SanitizeMol() should resolve this.
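The order of calls suggested in this reply can be sketched as below. This is only a minimal sketch, assuming an RDKit Python build similar to the one used in the thread; the helper name `build_cis_cncs` is made up for illustration. The printed SMILES is deliberately not stated here, since stereo handling may differ between RDKit releases.

```python
from rdkit import Chem


def build_cis_cncs():
    """Build C-N=C-S atom by atom and mark the double bond as cis.

    Hypothetical helper, not part of RDKit.
    """
    rwmol = Chem.RWMol()
    for atomic_num in (6, 7, 6, 16):  # C, N, C, S
        rwmol.AddAtom(Chem.Atom(atomic_num))
    rwmol.AddBond(0, 1, Chem.BondType.SINGLE)
    rwmol.AddBond(1, 2, Chem.BondType.DOUBLE)
    rwmol.AddBond(2, 3, Chem.BondType.SINGLE)

    # Mark the N=C bond (bond index 1) as cis relative to atoms 0 and 3.
    db = rwmol.GetBondWithIdx(1)
    db.SetStereoAtoms(0, 3)
    db.SetStereo(Chem.BondStereo.STEREOCIS)

    # Sanitize first ...
    Chem.SanitizeMol(rwmol)
    # ... and only then assign stereochemistry, as the last step
    # before writing the molecule out.
    Chem.AssignStereochemistry(rwmol)
    return rwmol


smiles = Chem.MolToSmiles(build_cis_cncs())
print(smiles)
```

The only point of the sketch is the ordering: SanitizeMol() comes before AssignStereochemistry(), and nothing else touches the molecule between the latter and MolToSmiles().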
From: Tim D. <tdu...@gm...> - 2021-02-02 18:12:52
Wondering if anyone had any thoughts on this core dump from Java.
What other info would be useful?

Tim
From: Greg L. <gre...@gm...> - 2021-02-03 12:36:44
Hi Tim,

I haven't seen this particular problem myself, nor have we gotten any reports of crashes from the Morgan fingerprinting code.
Comparing the fingerprinting code itself across the 2019.09 and 2020.09 branches, I also don't see anything which is likely to cause problems, but one never knows.

One thing that might help to know is how you construct the molecules you're generating fingerprints for: are these from one of the RDKit file parsers? Have they been sanitized?

Another thing you might have already tried, but it's worth checking anyway: can you force your web app to only run a single thread at a time? That shouldn't be a problem with the Morgan fingerprinting code, but it's still worth the experiment.

-greg
From: Tim D. <tdu...@gm...> - 2021-02-03 15:26:06
Hi Greg,

The problem seems to happen whether running in a single thread or not.
It does not seem to depend on the molecule. It seems to always happen, AFAICT.
The molecules are created from SMILES using RWMol.MolFromSmiles(smiles), which I believe sanitizes by default.

Tim
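For reference, the workflow described in this message (parse a SMILES with default sanitization, then compute a radius-2 Morgan fingerprint) looks roughly like this in Python. This is a sketch only: it goes through the Python fingerprint generator API rather than the Java RDKFuncs.MorganFingerprintMol call from the thread, so it illustrates the steps and is not a reproduction of the crash. The example SMILES is arbitrary.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# MolFromSmiles sanitizes by default (sanitize=True).
mol = Chem.MolFromSmiles("CN=CS")
if mol is None:
    raise ValueError("SMILES could not be parsed")

# Morgan fingerprint generator with radius 2, matching the radius
# used in the Java call quoted in the thread.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2)

# Sparse count fingerprint: maps feature ids to occurrence counts.
fp = generator.GetSparseCountFingerprint(mol)
nonzero = fp.GetNonzeroElements()
print(len(nonzero), "distinct Morgan features")
```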
From: Greg L. <gre...@gm...> - 2021-02-03 12:57:26
Given the fun that threading is, this isn't necessarily conclusive, but I just created a small C++ multi-threading test for the Morgan fingerprinting code and everything looks fine. That remains true when the code is run under valgrind (which is quite good at picking up the usual types of memory corruption that cause threading issues).

-greg
From: Stephen R. <s.d...@go...> - 2021-02-03 14:36:27
Hi Tim,

You mentioned the calculation is done using Java streams. Have you tried calling #sequential() on your stream to force it to run single-threaded from the Java side?

Steve
From: Tim D. <tdu...@gm...> - 2021-02-03 16:34:23
|
Steve,

It happens whether running in multiple threads or a single thread.

Tim

On Wed, Feb 3, 2021 at 2:36 PM Stephen Roughley <s.d...@go...> wrote:

> Hi Tim,
>
> You mentioned the calculation is done using Java streams. Have you tried
> calling #sequential() on your stream to force it to run single threaded
> from the Java side?
>
> Steve
>
> On Wed, 3 Feb 2021, 12:58 Greg Landrum, <gre...@gm...> wrote:
>
>> Given the fun that threading is, this isn't necessarily conclusive, but I
>> just created a small C++ multi-threading test for the Morgan fingerprinting
>> code and everything looks fine. That remains true when the code is run
>> under valgrind (which is quite good at picking up the usual types of memory
>> corruption that cause threading issues).
>>
>> -greg
>>
>>
>> On Wed, Feb 3, 2021 at 1:36 PM Greg Landrum <gre...@gm...>
>> wrote:
>>
>>> Hi Tim,
>>>
>>> I haven't seen this particular problem myself, nor have we gotten any
>>> reports of crashes from the Morgan fingerprinting code.
>>> Comparing the fingerprinting code itself across the 2019.09 and 2020.09
>>> branches I also don't see anything which is likely to cause problems, but
>>> one never knows.
>>>
>>> One thing that might help to know is how you construct the molecules
>>> you're generating fingerprints for: are these from one of the RDKit file
>>> parsers? Have they been sanitized?
>>>
>>> Another thing you might have already tried, but it's worth checking
>>> anyway: can you force your web app to only run a single thread at a time?
>>> That shouldn't be a problem with the Morgan fingerprinting code, but it's
>>> still worth the experiment.
>>>
>>> -greg
>>>
>>>
>>> On Tue, Feb 2, 2021 at 7:14 PM Tim Dudgeon <tdu...@gm...>
>>> wrote:
>>>
>>>> Wondering if anyone had any thoughts on this core dump from Java.
>>>> What other info would be useful?
>>>>
>>>> Tim
>>>>
>>>> On Tue, Jan 12, 2021 at 12:55 PM Tim Dudgeon <tdu...@gm...>
>>>> wrote:
>>>>
>>>>> I'm struggling to work out a strange core dump I'm getting when
>>>>> calculating Morgan fingerprints from Java. This seems to happen with the
>>>>> Release_2020_09 releases but not with the Release_2019_09 ones. It does not
>>>>> happen when calculating RDKit fingerprints. The exact Java code involved is:
>>>>>
>>>>> RDKFuncs.MorganFingerprintMol(mol, 2);
>>>>>
>>>>> More precisely this is happening when running inside a Docker
>>>>> container which is running the code as a Tomcat webapp, but a simple test
>>>>> of running that same function inside the container directly from Java (i.e.
>>>>> not when running in Tomcat) works OK and does not core dump.
>>>>> Building an otherwise identical container with the Release_2019_09
>>>>> code does not core dump from Tomcat.
>>>>>
>>>>> The core dump looks like this:
>>>>>
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> # SIGSEGV (0xb) at pc=0x00007ff9edc00518, pid=1, tid=111
>>>>> #
>>>>> # JRE version: OpenJDK Runtime Environment (11.0.9.1+1) (build
>>>>> 11.0.9.1+1-post-Debian-1deb10u2)
>>>>> # Java VM: OpenJDK 64-Bit Server VM (11.0.9.1+1-post-Debian-1deb10u2,
>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>> # Problematic frame:
>>>>> # [thread 145 also had an error]
>>>>> [thread 149 also had an error]
>>>>> [thread 113 also had an error]
>>>>> [thread 117 also had an error]
>>>>> C [libGraphMolWrap.so+0xa20518] void
>>>>> RDKit::MorganFingerprints::calcFingerprint<RDKit::SparseIntVect<unsigned
>>>>> int> >(RDKit::ROMol const&, unsigned int, std::vector<unsigned int,
>>>>> std::allocator<unsigned int> >*, std::vector<unsigned int,
>>>>> std::allocator<unsigned int> > const*, bool, bool, bool, bool,
>>>>> std::map<unsigned int, std::vector<std::pair<unsigned int, unsigned int>,
>>>>> std::allocator<std::pair<unsigned int, unsigned int> > >,
>>>>> std::less<unsigned int>, std::allocator<std::pair<unsigned int const,
>>>>> std::vector<std::pair<unsigned int, unsigned int>,
>>>>> std::allocator<std::pair<unsigned int, unsigned int> > > > > >*, bool,
>>>>> RDKit::SparseIntVect<unsigned int>&)+0x148
>>>>>
>>>>> It's difficult to know what's wrong, but thought it might be worth
>>>>> asking if anything in the Morgan fingerprint code has changed over that
>>>>> timeframe?
>>>>> It might be related to threading as the fingerprint generation is
>>>>> being done inside Java streams.
>>>>>
>>>>> Tim
|
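For readers following along, here is a minimal sketch of the single-threaded experiment Steve suggests in the message above: calling `sequential()` on a stream before the terminal operation. It assumes the fingerprints are computed in a `forEach` over a stream, which the thread does not show. `fakeFingerprint` is a hypothetical stand-in for `RDKFuncs.MorganFingerprintMol(mol, 2)` so that the example runs without the RDKit Java wrappers, and the class and method names are illustrative only.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

class SequentialStreamSketch {

    // Hypothetical stand-in for RDKFuncs.MorganFingerprintMol(mol, 2), so the
    // sketch runs without the RDKit Java wrappers on the classpath.
    static int fakeFingerprint(String smiles) {
        return smiles.hashCode();
    }

    // Runs the stand-in over the inputs and returns the names of the threads
    // that executed it.
    static Set<String> threadsUsed(List<String> inputs, boolean forceSequential) {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        Stream<String> stream = inputs.parallelStream();
        if (forceSequential) {
            // sequential() switches the whole pipeline to sequential mode,
            // so the terminal operation runs on the calling thread.
            stream = stream.sequential();
        }
        stream.forEach(s -> {
            threadNames.add(Thread.currentThread().getName());
            fakeFingerprint(s);
        });
        return threadNames;
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("CCO", "c1ccccc1", "CC(=O)O", "CCN");
        System.out.println("parallel:   " + threadsUsed(inputs, false));
        System.out.println("sequential: " + threadsUsed(inputs, true));
    }
}
```

This only rules out parallelism introduced by the stream itself. It does not limit how many request threads a servlet container such as Tomcat runs at once, so several sequential streams could still execute concurrently in a web app.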