From: Rafal Roszak <rmrmg.chem@gm...>  2017-01-18 09:28:15

On Tue, 17 Jan 2017 16:52:36 +0000 Chris Arthur <Chris.Arthur@...> wrote:

> ValueError: Sanitization error: Can't kekulize mol

In most cases I see the 'Can't kekulize mol' error for heterocycles that carry a hydrogen on a ring nitrogen when the SMILES does not state that hydrogen explicitly. For example:

>>> Chem.MolFromSmiles('c1ccnc1')
[10:23:21] Can't kekulize mol
>>> Chem.MolFromSmiles('c1cc[nH]c1')
<rdkit.Chem.rdchem.Mol object at 0x7f8d6a30a6e0>

> I can generate a smiles string from it (I had thought of doing a smiles to
> molecule conversion)

So if this is the issue, you can convert your Mol object to SMILES, add the missing H, and build a new Mol from the corrected SMILES.

Regards,
Rafał
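The difference between 'c1ccnc1' and 'c1cc[nH]c1' can be illustrated with a toy model. The sketch below is not RDKit's kekulization algorithm. It assumes a single isolated ring, no charges, no fused systems and no exocyclic double bonds, and the names `NEEDS_DOUBLE` and `ring_can_kekulize` are hypothetical. Under those assumptions, every aromatic `c` or `n` must share exactly one ring double bond with a neighbour, while `[nH]`, `s` or `o` contribute a lone pair instead, so the ring can be kekulized only if the atoms needing a double bond pair up with adjacent partners.

```python
# Toy model only: one isolated ring, no charges, no fused rings,
# no exocyclic double bonds. This is NOT how RDKit kekulizes molecules.

# Aromatic atoms assumed to need exactly one double bond inside the ring.
NEEDS_DOUBLE = {"c", "n"}
# Any other token ("[nH]", "s", "o") is assumed to donate a lone pair instead.


def ring_can_kekulize(ring):
    """ring: aromatic atom tokens in ring order, e.g. ["c", "c", "[nH]", "c", "c"]."""
    needs = [tok in NEEDS_DOUBLE for tok in ring]
    if all(needs):
        # Every atom needs a partner: alternating bonds only close up in an even ring.
        return len(ring) % 2 == 0
    # Rotate so the list ends with a lone-pair donor; runs then never wrap around.
    start = needs.index(False) + 1
    rotated = needs[start:] + needs[:start]
    run = 0
    for flag in rotated:
        if flag:
            run += 1
        else:
            if run % 2:  # an odd run leaves one atom without a partner
                return False
            run = 0
    return True


pyrrole_no_h = ["c", "c", "c", "n", "c"]    # ring atoms of 'c1ccnc1'
pyrrole = ["c", "c", "[nH]", "c", "c"]      # ring atoms of 'c1cc[nH]c1'
thiazole = ["n", "c", "c", "s", "c"]        # ring atoms of 'n1ccsc1'

print(ring_can_kekulize(pyrrole_no_h))  # False
print(ring_can_kekulize(pyrrole))       # True
print(ring_can_kekulize(thiazole))      # True
```

In this simplified picture the thiazole ring poses no problem, because the sulfur plays the role that `[nH]` plays in pyrrole, which is consistent with the replies in this thread saying that thiazoles sanitize normally.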
From: Greg Landrum <greg.landrum@gm...>  2017-01-18 04:48:41

I don't have anything to add to this other than to agree with Curt: I think that the existing code should work fine with thiazoles.

@Curt: thanks for providing this detailed and thought-through answer!

-greg
From: Curt Fischer <curt.r.fischer@gm...>  2017-01-17 18:01:42

To troubleshoot your sanitization problems, I think it would be helpful if you could share your SMARTS reaction string and the RDKit version you are using.

I just simulated the Hantzsch thiazole synthesis shown on Wikipedia, and everything worked normally for me. Admittedly, my reaction definition is overly tailored toward these two reactants, but I think it shows that RDKit can *Sanitize()* thiazoles correctly.

from rdkit import Chem
from rdkit.Chem import AllChem

# Hantzsch thiazole synthesis
thiourea = Chem.MolFromSmiles('CN(C)C(=S)N')
haloketone = Chem.MolFromSmiles('c1ccccc1C(=O)C(C)Cl')
rxn_smarts = '[NH2:1][C:2](=[S:3])[NH0:4].[C:5](=[O:6])[C:7][Cl:8]>>[N:4][c:2]1[s:3][c:5][c:7][n:1]1'
rxn = AllChem.ReactionFromSmarts(rxn_smarts)
# take the first product of the first product set
product = rxn.RunReactants((thiourea, haloketone))[0][0]
Chem.SanitizeMol(product)
Chem.MolToSmiles(product)

Out[33]: 'Cc1nc(N(C)C)sc1-c1ccccc1'
From: Curt Fischer <curt.r.fischer@gm...>  2017-01-17 17:29:30

I can't answer your root question, but if you want to go to SMILES and then back, I think you want *Chem.MolFromSmiles()*, not *Chem.MolToSmiles()*.

Curt
From: Chris Arthur <Chris.Arthur@br...>  2017-01-17 17:22:12

Dear all

I have a molecule containing a thiazole ring which has been generated by a reaction in RDKit.

Sanitising the molecule gives a kekulization error...

Chem.SanitizeMol(forwardProduct_)
Traceback (most recent call last):

  File "<ipython-input-29-649525efe840>", line 1, in <module>
    Chem.SanitizeMol(forwardProduct_)

ValueError: Sanitization error: Can't kekulize mol

I can generate a smiles string from it (I had thought of doing a smiles to molecule conversion)

# RDKit-generated smiles that started us down this rabbit hole
temp = Chem.MolToSmiles('CC(=O)c1sc(C2CCOCC2)nc1C')

But this fails....

ArgumentError: Python argument types in
    rdkit.Chem.rdmolfiles.MolToSmiles(str)
did not match C++ signature:
    MolToSmiles(class RDKit::ROMol mol, bool isomericSmiles=False, bool kekuleSmiles=False, int rootedAtAtom=-1, bool canonical=True, bool allBondsExplicit=False, bool allHsExplicit=False)

So I thought I would try with simpler thiazoles....

# ChemDraw's smiles representation
temp = Chem.MolToSmiles('C1=CN=CS1')

# Wikipedia's smiles for thiazole
temp = Chem.MolToSmiles('n1ccsc1')

These however also fail.

Can anyone suggest how I can proceed in order to sanitize such molecules?

Thanks

Chris

--
Dr Christopher J. Arthur
School of Chemistry
University of Bristol
BRISTOL, BS8 1TS, UK
Email: chris.arthur@...
Office: (+44 117) 331 7192
Mass Spectrometry Lab: (+44 117) 331 7358
FAX: (+44 117) 927 7985
WWW URL: http://www.chm.bris.ac.uk/staff/carthur.htm
LinkedIn Profile: https://www.linkedin.com/in/drchrisarthur
From: Guillaume GODIN <Guillaume.GODIN@fi...>  2017-01-17 13:24:04

Thanks Brian,

PBF = 0 <=> 2D and PBF > 0 <=> 3D. I forgot that point.

BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R&D DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8

**********************************************************************
DISCLAIMER
This email and any files transmitted with it, including replies and forwarded copies (which may contain alterations) subsequently transmitted from Firmenich, are confidential and solely for the use of the intended recipient. The contents do not represent the opinion of Firmenich except to the extent that it relates to their official business.
**********************************************************************
From: Brian Kelley <fustigator@gm...>  2017-01-17 13:06:35

In the inertial frame this is trivial. However, with the current RDKit, can't you just use the plane of best fit (PBF) here to separate planar from 3D? For a linear molecule, you can use the PMI descriptors.

See PBF in RDKit: http://pubs.acs.org/doi/abs/10.1021/ci300293f

Cheers,
Brian
From: Guillaume GODIN <Guillaume.GODIN@fi...>  2017-01-17 12:58:42

Great! I also noticed confusing usage of "moment of inertia" in those descriptors.

For example, in the WHIM case we need to know whether the molecule is linear, planar or 3D in order to compute the descriptors.

I have not found an easy way to determine this yet.

BR,

Dr. Guillaume GODIN
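The linear/planar/3D question can also be answered directly from the coordinates. The sketch below is a plain geometric test and only an illustration: it is not the PBF descriptor mentioned in this thread, not part of WHIM, and not an RDKit function. The name `classify_shape` and the tolerance `tol` (a distance in the same units as the coordinates) are assumptions made for the example. It centres the points, measures the largest distance of any point from the line through the centroid and the farthest point, and then the largest distance from the plane spanned by that line and the most off-line point.

```python
import math


def classify_shape(coords, tol=1e-3):
    """Classify (x, y, z) points as 'point', 'linear', 'planar' or '3D'.

    tol is a distance threshold in the same units as the coordinates
    (an assumption of this sketch, not a value taken from any library).
    """
    n = len(coords)
    centre = [sum(p[k] for p in coords) / n for k in range(3)]
    pts = [tuple(p[k] - centre[k] for k in range(3)) for p in coords]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def norm(a):
        return math.sqrt(dot(a, a))

    # Direction of the point farthest from the centroid.
    u = max(pts, key=norm)
    if norm(u) < tol:
        return "point"

    # |u x p| / |u| is the distance of p from the line through the centroid along u.
    w = max((cross(u, p) for p in pts), key=norm)
    if norm(w) / norm(u) < tol:
        return "linear"

    # w is normal to the plane through u and the most off-line point;
    # |w . p| / |w| is the distance of p from that plane.
    if max(abs(dot(w, p)) for p in pts) / norm(w) < tol:
        return "planar"
    return "3D"


rod = [(-1.16, 0.0, 0.0), (0.0, 0.0, 0.0), (1.16, 0.0, 0.0)]
hexagon = [(1.39 * math.cos(k * math.pi / 3), 1.39 * math.sin(k * math.pi / 3), 0.0)
           for k in range(6)]
tetrahedron = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]

print(classify_shape(rod))          # linear
print(classify_shape(hexagon))      # planar
print(classify_shape(tetrahedron))  # 3D
```

The test uses the geometric centroid rather than the centre of mass, which is enough for a shape classification because collinearity and coplanarity do not depend on the atomic masses.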
From: Brian Kelley <fustigator@gm...>  2017-01-17 12:44:48

I think we agree here. I was talking about the raw moment (M1z), not the moment of inertia (MI1); I should have made the distinction more explicit. Moments are not necessarily moments of inertia. The terminology gets confusing.

After a brief discussion with Greg, the Moments.py does the correct calculation, which indirectly verifies MOE and the newer RDKit implementation.

Cheers,
Brian
From: Chris Earnshaw <cgearnshaw@gm...> - 2017-01-17 12:39:13

The dimensions along one of the axes of a planar molecule in its inertial frame will be zero, but the principal moments of inertia will all be nonzero. The moment of inertia about an axis can only be zero if all the atoms in the molecule are precisely aligned on that axis. That's only possible for linear molecules. There's no way to draw a straight-line axis through all the atoms in a nonlinear molecule, which would be a requirement for the corresponding moment of inertia to be zero.

Chris

On 17 January 2017 at 12:29, Brian Kelley <fustigator@...> wrote:

> Looks like I'm late to the game. I don't know about the PMI descriptors per se, but if a planar molecule is in its inertial frame, one of the axes should be zero (whether it is x, y or z), which means that one of M1x, M1y or M1z should be zero.
>
> We had some good experimentation with multipole expansion of moments (essentially based on the description of electrostatic multipoles) that might be nice to add to the PMI framework.
>
> Greg, I'm assuming that the Moments.py we open-sourced a while back is similarly broken? I'm attaching it here for posterity, but it does appear to match the MOE PMIs.
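Chris's point can be checked with a few lines of arithmetic. The sketch below is not RDKit code: it assumes unit point masses, idealized geometries, and a coordinate frame that already coincides with the principal axes (true by symmetry for both test shapes), so the axis moments can be summed directly without diagonalizing anything. The helper name `axis_moments` is made up for illustration.

```python
import math


def axis_moments(points):
    """Moments of inertia about the x, y and z axes for (mass, x, y, z) points.

    Assumes the centre of mass is at the origin and the coordinate axes are
    already the principal axes, so no diagonalization is needed.
    """
    ix = sum(m * (y * y + z * z) for m, x, y, z in points)
    iy = sum(m * (x * x + z * z) for m, x, y, z in points)
    iz = sum(m * (x * x + y * y) for m, x, y, z in points)
    return ix, iy, iz


# Planar case: six unit masses on a regular hexagon of radius 1.4 in the xy plane.
r = 1.4
hexagon = [(1.0, r * math.cos(k * math.pi / 3), r * math.sin(k * math.pi / 3), 0.0)
           for k in range(6)]

# Linear case: three unit masses on the z axis.
rod = [(1.0, 0.0, 0.0, -1.2), (1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 1.2)]

hex_moments = axis_moments(hexagon)
rod_moments = axis_moments(rod)

print("hexagon:", hex_moments)  # extent along z is zero, yet all three moments are > 0
print("rod:    ", rod_moments)  # only the moment about the molecular axis vanishes
```

For the hexagon the moment about z comes out as twice each in-plane moment, which is the pattern expected for a disk-like shape; for the rod only the moment about its own axis is zero.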
From: Brian Kelley <fustigator@gm...> - 2017-01-17 12:29:16

Looks like I'm late to the game. I don't know about the PMI descriptors per se, but if a planar molecule is in its inertial frame, one of the axes should be zero (whether it is x, y or z), which means that one of M1x, M1y or M1z should be zero.

We had some good experimentation with multipole expansion of moments (essentially based on the description of electrostatic multipoles) that might be nice to add to the PMI framework.

Greg, I'm assuming that the Moments.py we open-sourced a while back is similarly broken? I'm attaching it here for posterity, but it does appear to match the MOE PMIs.

On Tue, Jan 17, 2017 at 4:55 AM, Chris Earnshaw <cgearnshaw@...> wrote:

> The new version looks good to me as far as I can test it. PMI and NPR are still fine, the radius of gyration is right (for an extremely artificial test system) and the asphericity index also seems right (despite my best efforts to confuse things further - sorry about that!). Also highlights even more confusion in the Todeschini article - the approximate asphericity values for prolate and oblate molecules are reversed.
>
> The only (very trivial) thing I've spotted is the comment in the inertialShapeFactor function. 'planar or no coordinates' should be 'linear or no coordinates' to avoid confusion.
>
> Chris
From: Chris Earnshaw <cgearnshaw@gm...> - 2017-01-17 09:55:30

The new version looks good to me as far as I can test it. PMI and NPR are still fine, the radius of gyration is right (for an extremely artificial test system) and the asphericity index also seems right (despite my best efforts to confuse things further - sorry about that!). Also highlights even more confusion in the Todeschini article - the approximate asphericity values for prolate and oblate molecules are reversed.

The only (very trivial) thing I've spotted is the comment in the inertialShapeFactor function. 'planar or no coordinates' should be 'linear or no coordinates' to avoid confusion.

Chris

On 16 January 2017 at 09:30, Greg Landrum <greg.landrum@...> wrote:

> On Mon, Jan 16, 2017 at 10:22 AM, Chris Earnshaw <chris@...> wrote:
>
>> Either way, it makes it rather hard to trust their derivations generally - especially as there appear to be other errors (e.g. the denominator in eq. 16 should be the square root of the given sum of squares, according to their reference).
>
> Indeed. Given the problems encountered, I went back and checked some additional references to find definitions of the descriptors. The results are in this PR, which I'd love feedback on if you have time to take a look:
> https://github.com/rdkit/rdkit/pull/1265
>
> I didn't manage to find any information about "inertial shape factor" and don't have access to the references cited in the Todeschini paper, but I think the others are now reasonably reliable.
>
> greg
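For readers who want to repeat this kind of check on an artificial test system, the mass-weighted radius of gyration is commonly defined as Rg = sqrt(sum(m_i * r_i^2) / sum(m_i)), with r_i measured from the centre of mass. The sketch below assumes that textbook definition (the thread does not spell out which formula the PR uses), and `radius_of_gyration` is an illustrative helper, not an RDKit function.

```python
import math


def radius_of_gyration(points):
    """Mass-weighted radius of gyration of (mass, x, y, z) points."""
    total = sum(p[0] for p in points)
    # Centre of mass, computed explicitly so the input need not be centred.
    com = [sum(p[0] * p[i + 1] for p in points) / total for i in range(3)]
    weighted_sq = sum(
        p[0] * sum((p[i + 1] - com[i]) ** 2 for i in range(3)) for p in points
    )
    return math.sqrt(weighted_sq / total)


# Artificial test system: unit masses at the corners of a 2 x 2 x 2 cube,
# deliberately shifted away from the origin.
cube = [(1.0, 5 + dx, -3 + dy, 2 + dz)
        for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)]

rg = radius_of_gyration(cube)
print(rg)
```

Every corner of this cube lies at the same distance, sqrt(3), from the centre, so that is the value the function should return regardless of where the cube sits.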
From: Markus Sitzmann <markus.sitzmann@gm...> - 2017-01-16 21:20:40

... I just suffered this:
https://github.com/conda/conda/issues/4309

Going back to a previous conda version (4.2.12) helps.

Other than that: Happy New Year (a late one :)
From: Greg Landrum <greg.landrum@gm...> - 2017-01-16 09:31:05

On Mon, Jan 16, 2017 at 10:22 AM, Chris Earnshaw <chris@...> wrote:

> Either way, it makes it rather hard to trust their derivations generally - especially as there appear to be other errors (e.g. the denominator in eq. 16 should be the square root of the given sum of squares, according to their reference).

Indeed. Given the problems encountered, I went back and checked some additional references to find definitions of the descriptors. The results are in this PR, which I'd love feedback on if you have time to take a look:
https://github.com/rdkit/rdkit/pull/1265

I didn't manage to find any information about "inertial shape factor" and don't have access to the references cited in the Todeschini paper, but I think the others are now reasonably reliable.

greg
From: Chris Earnshaw <chris@cg...> - 2017-01-16 09:22:23

Dear Guillaume,

Thanks - looks like we agree about reality (good!) and that Todeschini et al. are wrong in their discussion about planar molecules. Whether this is a simple mistaken assertion, or whether they've mixed up another quantity (e.g. the eigenvalues of the covariance matrix) with the PMIs, is impossible to say.

Either way, it makes it rather hard to trust their derivations generally - especially as there appear to be other errors (e.g. the denominator in eq. 16 should be the square root of the given sum of squares, according to their reference).

Best regards,
Chris

Dr Chris Earnshaw
CGE Computational Chemistry
Phone: +44(0) 1223 426000
Mobile: 07944 707773
Email: chris@...

On 16 January 2017 at 08:54, Guillaume GODIN <Guillaume.GODIN@...> wrote:

> Dear Chris,
>
> No problem, let me explain:
>
> I agree on monatomics: the center of mass is the atom, so for every axis x, Ix = 0.
>
> Now I consider the mathematics only, not the physics.
>
> I suggest that they (Todeschini) are not really computing the "real physical" PMIs on the 3 axes, but arbitrarily state that for 2D molecules the 3rd-axis PMI is zero.
>
> BR
>
> Dr. Guillaume GODIN
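The mix-up Chris suspects is an easy one to make because the two matrices are closely related. For point masses with coordinates taken relative to the centre of mass, the (unnormalized) second-moment matrix is S = sum(m * r r^T) and the inertia tensor is I = trace(S) * 1 - S. They share eigenvectors, and each principal moment equals trace(S) minus the corresponding eigenvalue of S, so a planar molecule has a zero eigenvalue in S but not in I. The sketch below verifies the matrix identity numerically for made-up coordinates; the function names are illustrative, and whether any particular program normalizes S (for example by the total mass) is not something the thread states.

```python
def second_moment_matrix(points):
    """S[a][b] = sum of m * r_a * r_b over (mass, x, y, z) points."""
    return [[sum(p[0] * p[a + 1] * p[b + 1] for p in points) for b in range(3)]
            for a in range(3)]


def inertia_tensor(points):
    """I[a][b] = sum of m * (|r|^2 * delta_ab - r_a * r_b)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, x, y, z in points:
        r = (x, y, z)
        r2 = x * x + y * y + z * z
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((r2 if a == b else 0.0) - r[a] * r[b])
    return tensor


# Made-up planar "molecule" (z = 0 everywhere) with its centre of mass at the origin.
points = [(12.0, 1.0, 0.5, 0.0), (12.0, -1.0, -0.5, 0.0),
          (1.0, 2.0, -1.0, 0.0), (1.0, -2.0, 1.0, 0.0)]

S = second_moment_matrix(points)
inertia = inertia_tensor(points)
trace_S = S[0][0] + S[1][1] + S[2][2]

# Identity to check: inertia = trace(S) * identity - S
from_S = [[(trace_S if a == b else 0.0) - S[a][b] for b in range(3)]
          for a in range(3)]
max_diff = max(abs(inertia[a][b] - from_S[a][b])
               for a in range(3) for b in range(3))

print("max |I - (tr(S) * 1 - S)| =", max_diff)
print("S_zz =", S[2][2], " I_zz =", inertia[2][2])
```

In this planar example z is a principal axis of both matrices: the second-moment matrix has a zero entry there, while the inertia tensor has its largest value, equal to trace(S).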
From: Greg Landrum <greg.landrum@gm...> - 2017-01-16 08:57:56

On Mon, Jan 16, 2017 at 9:36 AM, Chris Earnshaw <cgearnshaw@...> wrote:

> Apologies - I appear to have opened a can of worms here...

No need whatsoever to apologize. You identified and pointed out a bug in the implementation of the new 3D descriptors, which is very much appreciated. The fact that I picked a seemingly unreliable source for the definitions of those descriptors, and that it's turning out to be more difficult than I might like to find reliable definitions for some of them, is just the way things are.

I'll have an updated version checked in (hopefully) in the next couple of hours. It would be great if you could take a look at it and let me know if it looks right.

greg
From: Guillaume GODIN <Guillaume.GODIN@fi...> - 2017-01-16 08:54:33

Dear Chris,

No problem, let me explain:

I agree on monatomics: the center of mass is the atom, so for every axis x, Ix = 0.

Now I consider the mathematics only, not the physics.

I suggest that they (Todeschini) are not really computing the "real physical" PMIs on the 3 axes, but arbitrarily state that for 2D molecules the 3rd-axis PMI is zero.

BR

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R&D DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8

________________________________
From: Chris Earnshaw <cgearnshaw@...>
Sent: Monday, 16 January 2017 09:36
To: Guillaume GODIN
Cc: Greg Landrum; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

On 16 January 2017 at 06:25, Guillaume GODIN <Guillaume.GODIN@...> wrote:

> reading carefully the Todeschini article, they said that Ic, Ib, Ia are determined as the max & min values of I over all 3D axes passing through the center of mass!

I don't quite understand this comment. The inequality Ia <= Ib <= Ic is one of the errors in the Todeschini article pointed out by Greg yesterday. By definition, the Principal Moment of Inertia axes pass through the centre of mass.

> The "global PM" is never zero (sum of mi*ri*ri), idem for PMi, even for a planar molecule.

The global Moment of Inertia is only zero for monatomics.

> But when you have a planar molecule, the matrix is no longer 3D but 2D! So it's normal to consider that the 3rd PM is zero.

I really don't understand this - it's simply wrong. The molecule may be 2D but the three principal moments of inertia are most definitely nonzero for a planar structure. For a fully symmetrical molecule like benzene the largest PMI is around the axis perpendicular to the plane of the molecule and there are two equivalent, smaller, PMIs perpendicular to each other in the plane of the molecule. For a less symmetrical molecule like naphthalene, the largest PMI is again around the axis perpendicular to the plane, the intermediate PMI is along the fusion bond between the rings and the smallest PMI is around the long axis of the molecule. There's no way it can be correct to consider the 3rd PMI as zero in any planar molecule - it's never equal to zero and is only degenerate with the 2nd PMI for fully symmetric molecules. Only in the special case of a completely linear molecule (e.g. acetylene, HCN) is the 3rd PMI (along the axis of the molecule) equal to zero.

Apologies - I appear to have opened a can of worms here...

Chris

________________________________
From: Greg Landrum <greg.landrum@...>
Sent: Sunday, 15 January 2017 17:42
To: Guillaume GODIN; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

Thanks Guillaume!

On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <Guillaume.GODIN@...> wrote:

> Here are the Dragon results for the 3 molecules: I've included both WHIM and 3D descriptors, but I don't have access to PMI!
>
> I found the second document in agreement with Peter's answer...
>
> BR,
> Dr. Guillaume GODIN

________________________________
From: Peter Gedeck <peter.gedeck@...>
Sent: Sunday, 15 January 2017 15:07
To: Greg Landrum; RDKit Discuss; Guillaume GODIN
Subject: Re: [Rdkit-discuss] PMI API

According to this:
https://en.wikipedia.org/wiki/List_of_moments_of_inertia
the moments of inertia of a disk (something like benzene) are:

    Iz = mr^2/2
    Ix = Iy = mr^2/4

None of them is zero. The smallest moment of inertia of a rod-like molecule (e.g. C#C) is zero.

Best,
Peter

On Sun, Jan 15, 2017 at 8:15 AM, Greg Landrum <greg.landrum@...> wrote:

> Hi Guillaume,
>
> I think in this case it's something else. According to the Todeschini article the smallest moment of inertia of a planar molecule like benzene should be zero. The eigenvalues of the inertia matrix for benzene, however, are definitely not zero (and not close enough that it's likely to be round-off error).
> It would be very nice if you could run the three files I mention through Dragon and let me know what it calculates for those descriptors.
>
> greg

_____________________________
From: Guillaume GODIN <guillaume.godin@...>
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum <greg.landrum@...>, RDKit Discuss <rdkit-discuss@...>, Chris Earnshaw <cgearnshaw@...>

Dear Greg,

I suspect that it's a precision error or an eigen-algorithm difference between RDKit C++ and Dragon.

To obtain good values, I suggest implementing a test on the eigenvalues like I did in the gateway.cpp implementation.

    #include <Eigen/Dense>
    using namespace Eigen;

    JacobiSVD<MatrixXd> getSVD(MatrixXd A) {
      // thin U and V are sufficient to build the pseudo-inverse
      JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
      return mysvd;
    }

    // get the A^-1 matrix (pseudo-inverse) using the SVD
    MatrixXd GetPinv(MatrixXd A) {
      JacobiSVD<MatrixXd> svd = getSVD(A);
      double pinvtoler = 1.e-2;  // choose your tolerance wisely!
      VectorXd vs = svd.singularValues();
      VectorXd vsinv = svd.singularValues();

      // invert only the singular values above the tolerance; zero the rest
      for (Eigen::Index i = 0; i < vs.size(); ++i) {
        if (vs(i) > pinvtoler)
          vsinv(i) = 1.0 / vs(i);
        else
          vsinv(i) = 0.0;
      }

      MatrixXd S = vsinv.asDiagonal();
      MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
      return Ap;
    }

If that doesn't solve the problem, I would like to test it in Matlab. Can you provide me with the 3 (3D xyz) matrices of your examples, please?

I also have Dragon 6.

best regards,
Dr. Guillaume GODIN

________________________________
From: Greg Landrum <greg.landrum@...>
Sent: Sunday, 15 January 2017 11:50
To: Chris Earnshaw; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

I managed to make some time to look into this this weekend and I've found a bug and something I don't understand. Hopefully the community can help out here.

On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearnshaw@...> wrote:

> 4) The big one! The returned results look very odd. They appear to relate more to the dimensions of the molecule than the moments of inertia.
> For a rod-like molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 6.61651 PMI2: 150.434 PMI3: 150.434 NPR1: 0.0439828 NPR2: 0.999998) but actually get PMI1: 0.061647 PMI2: 0.061652 PMI3: 25.3699 NPR1: 0.002430 NPR2: 0.002430.
> For disk-like (benzene) the result should be one large and two medium (e.g. PMI1: 89.1448 PMI2: 89.1495 PMI3: 178.294 NPR1: 0.499987 NPR2: 0.500013) but get PMI1: 2.37457e-10 PMI2: 11.0844 PMI3: 11.0851 NPR1: 2.14213e-11 NPR2: 0.999933.
> Finally for a roughly spherical molecule (neopentane) the NPR values look reasonable (no great surprise) but the absolute PMI values may be too small: old program - PMI1: 114.795 PMI2: 114.797 PMI3: 114.799 NPR1: 0.999966 NPR2: 0.999988, new program - PMI1: 6.59466 PMI2: 6.59488 PMI3: 6.59531 NPR1: 0.999902 NPR2: 0.999935

Your expectations are correct: the current RDKit implementation is wrong. The corresponding github entry is here:
https://github.com/rdkit/rdkit/issues/1262

This is due to a mistake in the way the principal moments are calculated (which is due to the fact that I don't spend a lot of time working with/thinking about 3D descriptors). Instead of using the eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the RDKit is currently using the covariance matrix. There's some more on the relationship between these two here:
http://number-none.com/blow/inertia/deriving_i.html

The problem is easy to fix (and I have something working here: https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up the values of the descriptors that are derived from here:
Todeschini and Consonni, "Descriptors from Molecular Geometry", Handbook of Chemoinformatics, http://dx.doi.org/10.1002/9783527618279.ch37
These include the radius of gyration, inertial shape factor, etc.

Within that article they state that Ic = 0 for planar molecules. Ignoring the inequality on page 1010, which says that Ic is the largest moment and is contradicted by the rest of the text (particularly the inequalities on page 1011), Ic corresponds to the smallest principal moment: PMI1.

So now I'm confused, but I'm hoping this is obvious to someone versed in the field: I'd like to reproduce the descriptors described in the Todeschini article, but I clearly can't do that using the actual moments of inertia. I could keep using the eigenvalues of the covariance matrix there, but that doesn't match what's described in the text.

Two things that would be extremely helpful:
1) An explanation of the disconnect here from someone who knows this stuff. I would guess that it's pretty simple.
2) The results of running the files github1262_1.mol, github1262_2.mol, and github1262_3.mol from here: https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data through Dragon and calculating the radius of gyration, inertial shape factor, eccentricity, molecular asphericity, and spherocity index.

Best,
greg
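The normalized PMI ratios in Chris's numbers follow from the principal moments as NPR1 = PMI1/PMI3 and NPR2 = PMI2/PMI3, with the moments sorted so that PMI1 <= PMI2 <= PMI3. As a quick sanity check, the sketch below recomputes the ratios from the reference PMI values he quotes for benzene and dimethylacetylene; `npr` is an illustrative helper, not an RDKit function.

```python
def npr(pmis):
    """Return (NPR1, NPR2) from three principal moments of inertia."""
    i1, i2, i3 = sorted(pmis)
    if i3 == 0:
        raise ValueError("all principal moments are zero (single atom or no coordinates)")
    return i1 / i3, i2 / i3


# Reference PMI values quoted in the thread.
benzene = npr((89.1448, 89.1495, 178.294))            # disk-like: both ratios near 0.5
dimethylacetylene = npr((6.61651, 150.434, 150.434))  # rod-like: NPR1 small, NPR2 near 1

print("benzene:", benzene)
print("dimethylacetylene:", dimethylacetylene)
```

The recomputed ratios agree with the quoted NPR values to within the rounding of the printed PMIs, which is a cheap way to confirm that a set of PMIs and NPRs came from the same calculation.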
From: Chris Earnshaw <cgearnshaw@gm...> - 2017-01-16 08:37:02

On 16 January 2017 at 06:25, Guillaume GODIN <Guillaume.GODIN@...> wrote:

> Reading the Todeschini article carefully, they say that Ic, Ib, Ia are
> determined as the max & min values of I over all 3D axes passing through
> the center of mass!

I don't quite understand this comment. The inequality Ia <= Ib <= Ic is one of the errors in the Todeschini article pointed out by Greg yesterday. By definition, the Principal Moment of Inertia axes pass through the centre of mass.

> The "global PM" is never zero (sum of mi*ri*ri); the same holds for the
> PMi, even for a planar molecule.

The global Moment of Inertia is only zero for monatomics.

> But when you have a planar molecule, the matrix is no longer 3D but 2D,
> so it's normal to consider that the 3rd PM is zero.

I really don't understand this - it's simply wrong. The molecule may be 2D but the three principal moments of inertia are most definitely non-zero for a planar structure.

For a fully symmetrical molecule like benzene the largest PMI is around the axis perpendicular to the plane of the molecule and there are two equivalent, smaller, PMIs perpendicular to each other in the plane of the molecule. For a less symmetrical molecule like naphthalene, the largest PMI is again around the axis perpendicular to the plane, the intermediate PMI is along the fusion bond between the rings and the smallest PMI is around the long axis of the molecule.

There's no way it can be correct to consider the 3rd PMI as zero in any planar molecule - it's never equal to zero and is only degenerate with the 2nd PMI for fully symmetric molecules. Only in the special case of a completely linear molecule (e.g. acetylene, HCN) is the 3rd PMI (along the axis of the molecule) equal to zero.

Apologies - I appear to have opened a can of worms here...

Chris

>
> *From:* Greg Landrum <greg.landrum@...>
> *Sent:* Sunday, 15 January 2017 17:42
> *To:* Guillaume GODIN; RDKit Discuss
> *Subject:* Re: [Rdkit-discuss] PMI API
>
> Thanks Guillaume!
>
> On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <Guillaume.GODIN@...> wrote:
>
>> Here, Dragon results for the 3 molecules: I've included both WHIM and 3D
>> descriptors but I don't have access to PMi!
>>
>> I found the second document in agreement with Peter's answer...
>>
>> BR,
>>
>> *Dr. Guillaume GODIN*
>> Principal Scientist
>> Chemoinformatic & Datamining
>> Innovation
>> CORPORATE R&D DIVISION
>> DIRECT LINE +41 (0)22 780 3645
>> MOBILE +41 (0)79 536 1039
>> Firmenich SA
>> RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8
>>
>> *From:* Peter Gedeck <peter.gedeck@...>
>> *Sent:* Sunday, 15 January 2017 15:07
>> *To:* Greg Landrum; RDKit Discuss; Guillaume GODIN
>> *Subject:* Re: [Rdkit-discuss] PMI API
>>
>> According to this:
>> https://en.wikipedia.org/wiki/List_of_moments_of_inertia
>> The moments of inertia of a disk (something like benzene) are:
>>
>> Iz = mr^2/2
>> Ix = Iy = mr^2/4
>>
>> None of them is zero. The smallest moment of inertia of a rod-like
>> molecule (e.g. C#C) is zero.
>>
>> Best,
>>
>> Peter
>>
>> On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum <greg.landrum@...> wrote:
>>
>>> Hi Guillaume,
>>>
>>> I think in this case it's something else. According to the Todeschini
>>> article the smallest moment of inertia of a planar molecule like benzene
>>> should be zero. The eigenvalues of the inertia matrix for benzene, however,
>>> are definitely not zero (and not close enough that it's likely to be
>>> round-off error).
>>> It would be very nice if you could run the three files I mention through
>>> Dragon and let me know what it calculates for those descriptors.
>>>
>>> greg
>>>
>>> _____________________________
>>> From: Guillaume GODIN <guillaume.godin@...>
>>> Sent: Sunday, January 15, 2017 1:11 PM
>>> Subject: RE: [Rdkit-discuss] PMI API
>>> To: Greg Landrum <greg.landrum@...>, RDKit Discuss <rdkitdiscuss@...>,
>>> Chris Earnshaw <cgearnshaw@...>
>>>
>>> Dear Greg,
>>>
>>> I suspect that it's a precision error or an eigen algorithm shift between
>>> rdkit c++ & dragon.
>>>
>>> To obtain good values, I suggest implementing a test on the eigenvalues
>>> like I did in the gateway.cpp implementation.
>>>
>>> #include <Eigen/Dense>
>>> using namespace Eigen;
>>>
>>> // thin SVD of A (U and V are needed to build the pseudo-inverse)
>>> JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
>>>   JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
>>>   return mysvd;
>>> }
>>>
>>> // get the A^-1 (pseudo-inverse) matrix using the SVD
>>> MatrixXd GetPinv(const MatrixXd &A) {
>>>   JacobiSVD<MatrixXd> svd = getSVD(A);
>>>   double pinvtoler = 1.e-2;  // choose your tolerance wisely!
>>>   VectorXd vs = svd.singularValues();
>>>   VectorXd vsinv = svd.singularValues();
>>>
>>>   // invert singular values above the tolerance, zero out the rest
>>>   for (Eigen::Index i = 0; i < vs.size(); ++i) {
>>>     if (vs(i) > pinvtoler)
>>>       vsinv(i) = 1.0 / vs(i);
>>>     else
>>>       vsinv(i) = 0.0;
>>>   }
>>>
>>>   MatrixXd S = vsinv.asDiagonal();
>>>   MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
>>>   return Ap;
>>> }
>>>
>>> If that doesn't solve the problem, I would like to test it in Matlab. Can
>>> you provide me the 3 (3D xyz matrix) of your example please?
>>>
>>> I also have Dragon 6
>>>
>>> best regards,
>>>
>>> *Dr. Guillaume GODIN*
>>> Principal Scientist
>>> Chemoinformatic & Datamining
>>> Innovation
>>> CORPORATE R&D DIVISION
>>> DIRECT LINE +41 (0)22 780 3645
>>> MOBILE +41 (0)79 536 1039
>>> Firmenich SA
>>> RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8
>>>
>>> *From:* Greg Landrum <greg.landrum@...>
>>> *Sent:* Sunday, 15 January 2017 11:50
>>> *To:* Chris Earnshaw; RDKit Discuss
>>> *Subject:* Re: [Rdkit-discuss] PMI API
>>>
>>> I managed to make some time to look into this this weekend and I've
>>> found a bug and something I don't understand. Hopefully the community can
>>> help out here.
>>>
>>> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearnshaw@...> wrote:
>>>
>>> 4) The big one! The returned results look very odd. They appear to
>>> relate more to the dimensions of the molecule than the moments of inertia.
>>> For a rod-like molecule (dimethylacetylene) I'd expect two large and one
>>> small PMI (e.g. PMI1: 6.61651 PMI2: 150.434 PMI3: 150.434 NPR1:
>>> 0.0439828 NPR2: 0.999998) but actually get PMI1: 0.061647 PMI2: 0.061652
>>> PMI3: 25.3699 NPR1: 0.002430 NPR2: 0.002430.
>>> For disk-like (benzene) the result should be one large and two medium
>>> (e.g. PMI1: 89.1448 PMI2: 89.1495 PMI3: 178.294 NPR1: 0.499987 NPR2:
>>> 0.500013) but get PMI1: 2.37457e-10 PMI2: 11.0844 PMI3: 11.0851 NPR1:
>>> 2.14213e-11 NPR2: 0.999933.
>>> Finally for a roughly spherical molecule (neopentane) the NPR values
>>> look reasonable (no great surprise) but the absolute PMI values may be too
>>> small: old program - PMI1: 114.795 PMI2: 114.797 PMI3: 114.799
>>> NPR1: 0.999966 NPR2: 0.999988, new program - PMI1: 6.59466 PMI2:
>>> 6.59488 PMI3: 6.59531 NPR1: 0.999902 NPR2: 0.999935
>>>
>>> Your expectations are correct: the current RDKit implementation is
>>> wrong.
>>> The corresponding github entry is here:
>>> https://github.com/rdkit/rdkit/issues/1262
>>> This is due to a mistake in the way the principal moments are calculated
>>> (which is due to the fact that I don't spend a lot of time working
>>> with/thinking about 3D descriptors). Instead of using the
>>> eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the
>>> RDKit is currently using the covariance matrix. There's some more on the
>>> relationship between these two here:
>>> http://numbernone.com/blow/inertia/deriving_i.html
>>>
>>> The problem is easy to fix (and I have something working here:
>>> https://github.com/greglandrum/rdkit/tree/fix/github1262), but it
>>> screws up the values of the descriptors that are derived from here:
>>> Todeschini and Consoni "Descriptors from Molecular Geometry" Handbook of
>>> Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
>>> These include the radius of gyration, inertial shape factor, etc.
>>> Within that article they state that Ic = 0 for planar molecules.
>>> Ignoring the inequality on page 1010, which says that Ic is the largest
>>> moment and is contradicted by the rest of the text (particularly the
>>> inequalities on page 1011), Ic corresponds to the smallest principal
>>> moment: PMI1.
>>>
>>> So now I'm confused, but I'm hoping this is obvious to someone versed in
>>> the field: I'd like to reproduce the descriptors described in the
>>> Todeschini article, but I clearly can't do that using the actual moments of
>>> inertia. I could keep using the eigenvalues of the covariance matrix there,
>>> but that doesn't match what's described in the text.
>>>
>>> Two things that would be extremely helpful:
>>> 1) an explanation of the disconnect here from someone who knows this
>>> stuff, I would guess that it's pretty simple
>>> 2) The results of running the files github1262_1.mol, github1262_2.mol,
>>> and github1262_3.mol from here:
>>> https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data
>>> through Dragon and calculating the radius of gyration, inertial shape
>>> factor, eccentricity, molecular asphericity, and spherocity index.
>>>
>>> Best,
>>> greg
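The point about planar versus linear molecules can be checked with a small calculation. The following is a minimal Python sketch, not RDKit code: the helper name `inertia_tensor_entries` and both geometries (a carbon-only regular hexagon of radius 1.39 Å in the xy plane, and three equal masses on the z axis) are assumptions chosen for illustration. In these symmetric frames the products of inertia vanish, so the diagonal entries are the principal moments.

```python
import math


def inertia_tensor_entries(atoms):
    """Return ((Ixx, Iyy, Izz), largest |off-diagonal|) about the centre of mass.

    atoms is a list of (mass, x, y, z) tuples.
    """
    total = sum(m for m, _, _, _ in atoms)
    cx = sum(m * x for m, x, _, _ in atoms) / total
    cy = sum(m * y for m, _, y, _ in atoms) / total
    cz = sum(m * z for m, _, _, z in atoms) / total
    ixx = iyy = izz = ixy = ixz = iyz = 0.0
    for m, x, y, z in atoms:
        x, y, z = x - cx, y - cy, z - cz
        ixx += m * (y * y + z * z)
        iyy += m * (x * x + z * z)
        izz += m * (x * x + y * y)
        ixy -= m * x * y
        ixz -= m * x * z
        iyz -= m * y * z
    return (ixx, iyy, izz), max(abs(ixy), abs(ixz), abs(iyz))


# Hypothetical planar ring: six carbons on a regular hexagon in the xy plane.
ring = [
    (12.0,
     1.39 * math.cos(math.radians(60 * k)),
     1.39 * math.sin(math.radians(60 * k)),
     0.0)
    for k in range(6)
]
# Hypothetical linear arrangement: three equal masses on the z axis.
rod = [(12.0, 0.0, 0.0, z) for z in (-1.2, 0.0, 1.2)]

ring_moments, ring_offdiag = inertia_tensor_entries(ring)
rod_moments, rod_offdiag = inertia_tensor_entries(rod)

print("planar ring:", ring_moments, "max off-diagonal:", ring_offdiag)
print("linear rod: ", rod_moments, "max off-diagonal:", rod_offdiag)
```

For the ring all three moments are non-zero and the out-of-plane moment equals the sum of the two in-plane ones; only the rod has a zero moment, about its own axis.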
From: Guillaume GODIN <Guillaume.GODIN@fi...> - 2017-01-16 06:25:20

No problem Greg,

Reading the Todeschini article carefully, they say that Ic, Ib, Ia are determined as the max & min values of I over all 3D axes passing through the center of mass!

The "global PM" is never zero (sum of mi*ri*ri); the same holds for the PMi, even for a planar molecule.

But when you have a planar molecule, the matrix is no longer 3D but 2D, so it's normal to consider that the 3rd PM is zero.

BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R&D DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8

________________________________
From: Greg Landrum <greg.landrum@...>
Sent: Sunday, 15 January 2017 17:42
To: Guillaume GODIN; RDKit Discuss
Subject: Re: [Rdkit-discuss] PMI API

Thanks Guillaume!
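One way to see where a zero can appear for a planar molecule is the standard relation between the mass-weighted second-moment (unnormalised covariance) matrix and the inertia tensor. The notation below ($\mathbf{S}$, $\lambda_k$, $I_k$) is introduced here for illustration and is not taken from the Todeschini chapter; whatever normalisation RDKit or Dragon apply to the covariance matrix is not covered.

```latex
\begin{aligned}
\mathbf{S} &= \sum_i m_i\,\mathbf{r}_i\mathbf{r}_i^{\mathsf T}
  && \text{(} \mathbf{r}_i \text{ relative to the centre of mass)} \\
\mathbf{I} &= \sum_i m_i\left(\lVert\mathbf{r}_i\rVert^2\,\mathbf{1}
  - \mathbf{r}_i\mathbf{r}_i^{\mathsf T}\right)
  = \operatorname{tr}(\mathbf{S})\,\mathbf{1} - \mathbf{S} \\
I_k &= (\lambda_1 + \lambda_2 + \lambda_3) - \lambda_k
  && \text{(} \lambda_k \text{: eigenvalues of } \mathbf{S} \text{, same eigenvectors)} \\
\text{planar: } & \lambda_1 = 0 \;\Rightarrow\;
  \{I_k\} = \{\lambda_2 + \lambda_3,\ \lambda_3,\ \lambda_2\} \\
\text{linear: } & \lambda_1 = \lambda_2 = 0 \;\Rightarrow\;
  \{I_k\} = \{\lambda_3,\ \lambda_3,\ 0\}
\end{aligned}
```

Read this way, the zero eigenvalue of a planar molecule belongs to the second-moment matrix; the inertia tensor only acquires a zero for a linear arrangement, which is consistent with Peter's and Chris's replies.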
From: Greg Landrum <greg.landrum@gm...> - 2017-01-15 16:42:32

Thanks Guillaume!

On Sun, Jan 15, 2017 at 5:01 PM, Guillaume GODIN <Guillaume.GODIN@...> wrote:

> Here, Dragon results for the 3 molecules: I've included both Whim and 3D
> descriptors but I don't have access to PMi!
>
> I found the second document in agreement with Peter answer...
>
> BR,
>
> *Dr. Guillaume GODIN*
> Principal Scientist
> Chemoinformatic & Datamining
> Innovation
> CORPORATE R&D DIVISION
> DIRECT LINE +41 (0)22 780 3645
> MOBILE +41 (0)79 536 1039
> Firmenich SA
> RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8
From: Greg Landrum <greg.landrum@gm...> - 2017-01-15 16:42:02

On Sun, Jan 15, 2017 at 5:15 PM, Chris Earnshaw <cgearnshaw@...> wrote:

> I've built a version of RDKit with fixes from
> https://github.com/greglandrum/rdkit/tree/fix/github1262 and can confirm
> that it gives exactly the same values of PMI and NPR that I got with the
> RDKit fork by 'hahnda6'. I can't say for certain that the PMI values are
> correct in absolute terms, but the NPR values are certainly what would be
> expected for those test molecules.

Glad to hear it.

> I'm worried about the Todeschini paper - I think there are errors in some
> of the equations and inconsistencies in the discussion, some of which may
> involve mixing up PMIs with eigenvalues of the covariance matrix.
> Unfortunately I don't have access to the original references so can't check
> in detail, but I'd be disinclined to take any of the equations at face
> value.

Ok. I'm going to have to see if I can track down some additional references and work from there.

<sigh>

greg
From: Chris Earnshaw <cgearnshaw@gm...> - 2017-01-15 16:16:06

Thanks Greg

I've built a version of RDKit with the fixes from https://github.com/greglandrum/rdkit/tree/fix/github1262 and can confirm that it gives exactly the same PMI and NPR values that I got with the RDKit fork by 'hahnda6'. I can't say for certain that the PMI values are correct in absolute terms, but the NPR values are certainly what would be expected for those test molecules.

I'm worried about the Todeschini paper: I think there are errors in some of the equations and inconsistencies in the discussion, some of which may involve mixing up PMIs with eigenvalues of the covariance matrix. Unfortunately I don't have access to the original references so can't check in detail, but I'd be disinclined to take any of the equations at face value.

Chris

On 15 January 2017 at 10:50, Greg Landrum <greg.landrum@...> wrote:

> I managed to make some time to look into this this weekend and I've found a bug and something I don't understand. Hopefully the community can help out here.
>
> On Sun, Jan 8, 2017 at 11:17 AM, Chris Earnshaw <cgearnshaw@...> wrote:
>
>> 4) The big one! The returned results look very odd. They appear to relate more to the dimensions of the molecule than the moments of inertia.
>> For a rod-like molecule (dimethylacetylene) I'd expect two large and one small PMI (e.g. PMI1: 6.61651 PMI2: 150.434 PMI3: 150.434 NPR1: 0.0439828 NPR2: 0.999998) but actually get PMI1: 0.061647 PMI2: 0.061652 PMI3: 25.3699 NPR1: 0.002430 NPR2: 0.002430.
>> For disk-like (benzene) the result should be one large and two medium (e.g. PMI1: 89.1448 PMI2: 89.1495 PMI3: 178.294 NPR1: 0.499987 NPR2: 0.500013) but get PMI1: 2.37457e-10 PMI2: 11.0844 PMI3: 11.0851 NPR1: 2.14213e-11 NPR2: 0.999933.
>> Finally for a roughly spherical molecule (neopentane) the NPR values look reasonable (no great surprise) but the absolute PMI values may be too small:
>> old program - PMI1: 114.795 PMI2: 114.797 PMI3: 114.799 NPR1: 0.999966 NPR2: 0.999988
>> new program - PMI1: 6.59466 PMI2: 6.59488 PMI3: 6.59531 NPR1: 0.999902 NPR2: 0.999935
>
> Your expectations are correct: the current RDKit implementation is wrong. The corresponding github entry is here: https://github.com/rdkit/rdkit/issues/1262
> This is due to a mistake in the way the principal moments are calculated (which is due to the fact that I don't spend a lot of time working with/thinking about 3D descriptors). Instead of using the eigenvectors/eigenvalues of the inertia matrix (the tensor of inertia) the RDKit is currently using the covariance matrix. There's some more on the relationship between these two here: http://number-none.com/blow/inertia/deriving_i.html
>
> The problem is easy to fix (and I have something working here: https://github.com/greglandrum/rdkit/tree/fix/github1262), but it screws up the values of the descriptors that are derived from here:
> Todeschini and Consonni "Descriptors from Molecular Geometry" Handbook of Chemoinformatics http://dx.doi.org/10.1002/9783527618279.ch37
> These include the radius of gyration, inertial shape factor, etc.
> Within that article they state that Ic = 0 for planar molecules. Ignoring the inequality on page 1010, which says that Ic is the largest moment and is contradicted by the rest of the text (particularly the inequalities on page 1011), Ic corresponds to the smallest principal moment: PMI1.
>
> So now I'm confused, but I'm hoping this is obvious to someone versed in the field: I'd like to reproduce the descriptors described in the Todeschini article, but I clearly can't do that using the actual moments of inertia. I could keep using the eigenvalues of the covariance matrix there, but that doesn't match what's described in the text.
>
> Two things that would be extremely helpful:
> 1) an explanation of the disconnect here from someone who knows this stuff, I would guess that it's pretty simple
> 2) The results of running the files github1262_1.mol, github1262_2.mol, and github1262_3.mol from here: https://github.com/greglandrum/rdkit/tree/fix/github1262/Code/GraphMol/MolTransforms/test_data through Dragon and calculating the radius of gyration, inertial shape factor, eccentricity, molecular asphericity, and spherocity index.
>
> Best,
> greg
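The NPR values quoted in this message follow directly from the PMIs as NPR1 = PMI1/PMI3 and NPR2 = PMI2/PMI3, with the moments sorted in ascending order. The following is a minimal sketch of that arithmetic using the "expected" PMI values from the thread; the function name `normalized_pmi_ratios` is made up for this illustration and is not an RDKit call.

```python
def normalized_pmi_ratios(pmis):
    """Return (NPR1, NPR2) = (I1/I3, I2/I3) for three principal moments.

    Hypothetical helper for illustration only; it sorts the moments so
    that I1 <= I2 <= I3 before forming the ratios.
    """
    i1, i2, i3 = sorted(pmis)
    if i3 <= 0.0:
        raise ValueError("the largest principal moment must be positive")
    return i1 / i3, i2 / i3


# "Expected" PMI values as quoted in the thread (PMI1, PMI2, PMI3).
expected_pmis = {
    "dimethylacetylene (rod-like)": (6.61651, 150.434, 150.434),
    "benzene (disk-like)": (89.1448, 89.1495, 178.294),
    "neopentane (roughly spherical)": (114.795, 114.797, 114.799),
}

for name, pmis in expected_pmis.items():
    npr1, npr2 = normalized_pmi_ratios(pmis)
    print(f"{name}: NPR1={npr1:.6f} NPR2={npr2:.6f}")
```

In the usual NPR plot the ideal rod sits near (0, 1), the ideal disk near (0.5, 0.5) and the ideal sphere near (1, 1), which is the pattern these three molecules reproduce.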
From: Guillaume GODIN <Guillaume.GODIN@fi...>  20170115 16:01:37

Here are the Dragon results for the 3 molecules. I've included both the WHIM and 3D descriptors, but I don't have access to the PMIs! I found the second document to be in agreement with Peter's answer...

BR,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Innovation
CORPORATE R&D DIVISION
DIRECT LINE +41 (0)22 780 3645
MOBILE +41 (0)79 536 1039
Firmenich SA
RUE DES JEUNES 1 - CASE POSTALE 239 - CH-1211 GENEVE 8

________________________________
From: Peter Gedeck <peter.gedeck@...>
Sent: Sunday, 15 January 2017 15:07
To: Greg Landrum; RDKit Discuss; Guillaume GODIN
Subject: Re: [Rdkit-discuss] PMI API

According to this: https://en.wikipedia.org/wiki/List_of_moments_of_inertia

The moments of inertia of a disk (something like benzene) are:
Iz = mr^2/2
Ix = Iy = mr^2/4
None of them is zero. The smallest moment of inertia of a rod-like molecule (e.g. C#C) is zero.

Best,
Peter
From: Peter Gedeck <peter.gedeck@gm...>  20170115 14:08:08

According to this: https://en.wikipedia.org/wiki/List_of_moments_of_inertia

The moments of inertia of a disk (something like benzene) are:
Iz = mr^2/2
Ix = Iy = mr^2/4
None of them is zero. The smallest moment of inertia of a rod-like molecule (e.g. C#C) is zero.

Best,
Peter

On Sun, Jan 15, 2017 at 8:15 AM Greg Landrum <greg.landrum@...> wrote:

> Hi Guillaume,
>
> I think in this case it's something else. According to the Todeschini article the smallest moment of inertia of a planar molecule like benzene should be zero. The eigenvalues of the inertia matrix for benzene, however, are definitely not zero (and not close enough that it's likely to be roundoff error).
>
> It would be very nice if you could run the three files I mention through Dragon and let me know what it calculates for those descriptors.
>
> greg
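The disk and rod cases can be checked numerically with a point-mass model. This is only a sketch under stated assumptions: unit masses, a six-point ring standing in for a benzene-like disk, and two points on the x-axis standing in for a rod. The geometry is chosen so that the inertia tensor comes out diagonal, which means the principal moments can be read off the diagonal without an eigenvalue solver. The helper name `inertia_tensor` is illustrative and is not part of RDKit.

```python
import math


def inertia_tensor(masses, coords):
    """3x3 inertia tensor of point masses about their centre of mass."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total
           for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        r = [c[k] - com[k] for k in range(3)]
        r2 = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                # I_ij = sum_k m_k * (|r_k|^2 * delta_ij - r_ki * r_kj)
                tensor[i][j] += m * (r2 * delta - r[i] * r[j])
    return tensor


# Assumed toy geometry: six unit masses on a ring of radius 1.4 in the xy-plane.
radius = 1.4
ring = [(radius * math.cos(k * math.pi / 3),
         radius * math.sin(k * math.pi / 3), 0.0) for k in range(6)]
I_ring = inertia_tensor([1.0] * 6, ring)

# Assumed toy geometry: two unit masses on the x-axis (a linear "molecule").
rod = [(-0.6, 0.0, 0.0), (0.6, 0.0, 0.0)]
I_rod = inertia_tensor([1.0] * 2, rod)

print("ring diagonal:", [I_ring[k][k] for k in range(3)])
print("rod diagonal: ", [I_rod[k][k] for k in range(3)])
```

For the ring the two in-plane moments are equal and the out-of-plane moment is their sum, so none of them vanishes. A thin ring gives M r^2/2 and M r^2 rather than the solid-disk values m r^2/4 and m r^2/2 quoted above, but the 1:1:2 ratio is the same. For the rod the moment about the molecular axis is zero.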
From: Greg Landrum <greg.landrum@gm...>  20170115 13:14:25

Hi Guillaume,

I think in this case it's something else. According to the Todeschini article the smallest moment of inertia of a planar molecule like benzene should be zero. The eigenvalues of the inertia matrix for benzene, however, are definitely not zero (and not close enough that it's likely to be roundoff error).

It would be very nice if you could run the three files I mention through Dragon and let me know what it calculates for those descriptors.

greg

_____________________________
From: Guillaume GODIN <guillaume.godin@...>
Sent: Sunday, January 15, 2017 1:11 PM
Subject: RE: [Rdkit-discuss] PMI API
To: Greg Landrum <greg.landrum@...>, RDKit Discuss <rdkit-discuss@...>, Chris Earnshaw <cgearnshaw@...>

Dear Greg,

I suspect that it's a precision error or an eigen algorithm shift between RDKit C++ and Dragon.

To obtain good values, I suggest trying to implement a test on the eigenvalues like I did in the gateway.cpp implementation.

    #include <Eigen/Dense>
    using namespace Eigen;

    // Compute the SVD of A, keeping the thin U and V matrices.
    JacobiSVD<MatrixXd> getSVD(const MatrixXd &A) {
      JacobiSVD<MatrixXd> mysvd(A, ComputeThinU | ComputeThinV);
      return mysvd;
    }

    // Get the A^-1 (pseudo-inverse) matrix using the SVD.
    MatrixXd GetPinv(const MatrixXd &A) {
      JacobiSVD<MatrixXd> svd = getSVD(A);
      double pinvtoler = 1.e-2;  // choose your tolerance wisely!
      VectorXd vs = svd.singularValues();
      VectorXd vsinv = svd.singularValues();
      // Invert singular values above the tolerance and zero the rest.
      // Loop over vs.size(), not A.cols(): there are min(rows, cols) values.
      for (Eigen::Index i = 0; i < vs.size(); ++i) {
        if (vs(i) > pinvtoler)
          vsinv(i) = 1.0 / vs(i);
        else
          vsinv(i) = 0.0;
      }
      MatrixXd S = vsinv.asDiagonal();
      MatrixXd Ap = svd.matrixV() * S * svd.matrixU().transpose();
      return Ap;
    }

If that doesn't solve the problem, I would like to test it in Matlab. Can you provide me with the 3 (3D xyz) matrices for your examples, please?

I also have Dragon 6.

Best regards,

Dr. Guillaume GODIN
Principal Scientist
Chemoinformatic & Datamining
Firmenich SA
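One possible reading of the discrepancy raised in this message, offered as a sketch and not as a resolution of the thread: the inertia tensor I and the mass-weighted second-moment matrix S = sum m r r^T (an unnormalized, covariance-like matrix) are related by I = trace(S) * 1 - S. They share eigenvectors, and each inertia eigenvalue equals trace(S) minus the corresponding eigenvalue of S. For a planar point set S has a zero eigenvalue while I does not, which would match a text that states a zero value for planar molecules if that value actually came from the covariance matrix. The helper names below are illustrative, and the thread does not say how RDKit or Dragon normalize their covariance matrices.

```python
def second_moment_matrix(masses, coords):
    """Unnormalized second-moment matrix S = sum m * r r^T about the centre of mass."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total
           for k in range(3)]
    S = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        r = [c[k] - com[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                S[i][j] += m * r[i] * r[j]
    return S


def inertia_from_second_moments(S):
    """Inertia tensor from S via I = trace(S) * identity - S."""
    trace = S[0][0] + S[1][1] + S[2][2]
    return [[(trace if i == j else 0.0) - S[i][j] for j in range(3)]
            for i in range(3)]


# Assumed toy geometry: four unit masses at the corners of a 2 x 4 rectangle
# in the xy-plane, so both matrices are diagonal in these axes.
planar = [(1.0, 2.0, 0.0), (1.0, -2.0, 0.0), (-1.0, 2.0, 0.0), (-1.0, -2.0, 0.0)]
S = second_moment_matrix([1.0] * 4, planar)
I = inertia_from_second_moments(S)

cov_eigs = sorted(S[k][k] for k in range(3))      # eigenvalues of S (diagonal case)
inertia_eigs = sorted(I[k][k] for k in range(3))  # principal moments (diagonal case)
print("second-moment eigenvalues:", cov_eigs)
print("principal moments:        ", inertia_eigs)
```

The smallest second-moment eigenvalue is zero because no point leaves the plane, whereas the smallest principal moment is the moment about the long in-plane axis and is non-zero.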