Re: [Rdkit-discuss] RDKit Descriptors
From: Robert D. <rkd...@gm...> - 2008-09-18 14:07:01
Greg,

Thank you for the response. I was able to get PEOE_VSA1 through PEOE_VSA14, SMR_VSA1 through SMR_VSA10, and EState_VSA1 through EState_VSA11 working. Are these the correct limits on the vector components? I was unable, however, to get Slogp_VSA or VSA_EState working with any integer suffix between 1 and 10.

I've also done a correlation analysis on all the descriptors that I've gotten working. After computing descriptors for some 24,000 compounds, I removed those with less than 10% variance and limited correlations between variables to a maximum of 0.85 (using KNIME). I'm happy to send a list of the resulting descriptors or a correlation matrix if you or anyone else is interested.

On Wed, Sep 17, 2008 at 11:36 PM, Greg Landrum <gre...@gm...> wrote:
> Dear Kirk,
>
> On Thu, Sep 18, 2008 at 12:58 AM, Robert DeLisle <rkd...@gm...> wrote:
> > I've finally found time to start using RDKit and started with descriptor
> > calculation. Following the examples on the wiki
> > (http://code.google.com/p/rdkit/wiki/DescriptorsInTheRDKit), I get a
> > KeyError any time I attempt to obtain HeavyAtomCount, RingCount,
>
> HeavyAtomCount and RingCount were introduced after the May release, so
> they're in the subversion version of the code. They will be in the Q3
> release (which will happen sometime in the next couple of weeks,
> hopefully).
>
> > PEOE_VSA,
> > SMR_VSA, Slogp_VSA, EState_VSA, and VSA_EState.
>
> The various X_VSA descriptors are vector-valued and you access them by
> element, so you could ask for PEOE_VSA4 or Slogp_VSA10.
>
> > (BTW, what is the
> > difference between the two last VSA descriptors?)
>
> The "standard" VSA descriptors map summed VSA values into bins
> determined by the other descriptor. So, for example, SMR_VSA uses
> atomic contributions to the VSA and uses bins determined by atomic
> contributions to the SMR. EState_VSA is the same; it just uses atomic
> EState values. VSA_EState is reversed: atomic EState values are put
> into bins determined by the VSA contributions.
>
> Best Regards,
> -greg
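
The difference between EState_VSA and VSA_EState described above can be sketched in a few lines of plain Python. This is only an illustration of the binning idea, not RDKit's implementation: the helper name binned_sum, the per-atom numbers, and the bin edges are all made up for the example, and the choice of bisect_right for boundary handling is an assumption.

```python
from bisect import bisect_right


def binned_sum(values, keys, edges):
    """Sum values[i] into the bin chosen by keys[i].

    `edges` is a sorted list of bin boundaries, giving len(edges) + 1 bins.
    A key equal to an edge falls into the higher bin (an assumption made
    for this sketch).
    """
    totals = [0.0] * (len(edges) + 1)
    for value, key in zip(values, keys):
        totals[bisect_right(edges, key)] += value
    return totals


# Hypothetical per-atom contributions for a four-atom molecule.
vsa = [4.0, 6.0, 10.0, 5.0]      # per-atom surface-area contributions
estate = [-0.5, 1.5, 0.5, 2.5]   # per-atom EState values

# Hypothetical bin edges; the real descriptors use their own fixed edges.
estate_edges = [0.0, 1.0, 2.0]
vsa_edges = [5.0, 8.0]

# EState_VSA-style: sum VSA contributions, binned by EState value.
estate_vsa = binned_sum(vsa, estate, estate_edges)

# VSA_EState-style (reversed): sum EState values, binned by VSA contribution.
vsa_estate = binned_sum(estate, vsa, vsa_edges)

print("EState_VSA-style vector:", estate_vsa)
print("VSA_EState-style vector:", vsa_estate)
```

Each element of the resulting lists corresponds to one numbered component of the descriptor (for example, the first element would play the role of EState_VSA1), which is why the descriptors are requested with an integer suffix.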