[Cdk-devel] Re: [QSAR-devel] thoughts on the current QSAR package

On Tue, 2004-11-23 at 09:45 +0100, E.L. Willighagen wrote:

> Yes, I think so... I did not raise this yet because one step at the time...
> But it would be a good idea to have a number of sub packages...

Regarding Joerg's comments in another posting, I think having a basic
hierarchy of packages - though not necessary for actual calculations -
would make the arrangement of descriptors more manageable, and more tidy
if you will.

Furthermore, if indeed the Reflection API can be used to automatically
discover descriptors and if we go with the namespace feature Egon
mentioned, then I think the hierarchy would allow for an easier
understanding of the API and wouldn't really harm the usability of the
descriptor routines.
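
To make the reflection idea concrete, here is a rough sketch of what
discovery could look like. It assumes the candidate class names are
already known from somewhere (a listing file, a jar scan, ...), since
Java reflection by itself does not enumerate the classes in a package.
The class name ReflectionDiscoverySketch and the method findImplementations
are made up for illustration and are not part of the CDK.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not CDK API: filters a list of candidate class
// names down to the concrete classes that implement a given interface.
final class ReflectionDiscoverySketch {

    static List<Class<?>> findImplementations(List<String> classNames,
                                              Class<?> descriptorInterface) {
        List<Class<?>> found = new ArrayList<>();
        for (String name : classNames) {
            try {
                Class<?> candidate = Class.forName(name);
                boolean concrete = !candidate.isInterface()
                        && !Modifier.isAbstract(candidate.getModifiers());
                // Keep only instantiable classes that implement the interface.
                if (concrete && descriptorInterface.isAssignableFrom(candidate)) {
                    found.add(candidate);
                }
            } catch (ClassNotFoundException e) {
                // Assumption: names that cannot be loaded are simply skipped.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // JDK classes stand in for descriptor classes in this demo.
        List<String> names = new ArrayList<>();
        names.add("java.util.ArrayList");
        names.add("java.lang.String");
        for (Class<?> c : findImplementations(names, List.class)) {
            System.out.println(c.getName());
        }
    }
}
```

With a package hierarchy in place, the package part of each discovered
class name would already tell you roughly what kind of descriptor it is.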

> > An alternative to the hierarchy is to have the Descriptor interface have
> > a function: getType() which would return some constant (which would be
> > defined by the CDK framework) indicating descriptor type. This way we
> > could also have descriptors classified as multiple types (by OR'ing
> > constants).
> 
> Example... ?

Now that I think of it, this feature could be used in conjunction with the
hierarchy: the actual package hierarchy would benefit somebody looking
at the API. The getType() functionality would be of use when trying to
determine available descriptor types algorithmically.

Example: say we define some constants in CDKConstants such as

DESCRIPTORTYPE_TOPOLOGICAL = 1 << 0
DESCRIPTORTYPE_GEOMETRIC   = 1 << 1
DESCRIPTORTYPE_ELECTRONIC  = 1 << 2
DESCRIPTORTYPE_INFORMATION = 1 << 3

So then in a descriptor class (say electrotopological state), the
getType() function could look like:

int getType() {
  return( DESCRIPTORTYPE_TOPOLOGICAL | DESCRIPTORTYPE_ELECTRONIC );
}

Then a function that calls the E-state descriptor can use the return
value of the getType() method to determine the descriptor type by
AND'ing it with the descriptor type constants.
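
A minimal sketch of that AND'ing check, assuming the constants and
getType() as proposed above. The names DescriptorTypeConstants,
TypedDescriptor, EStateSketch and hasType are placeholders for
illustration only; none of this exists in the CDK at this point.

```java
// Hypothetical stand-in for the constants proposed for CDKConstants.
final class DescriptorTypeConstants {
    static final int DESCRIPTORTYPE_TOPOLOGICAL = 1 << 0;
    static final int DESCRIPTORTYPE_GEOMETRIC   = 1 << 1;
    static final int DESCRIPTORTYPE_ELECTRONIC  = 1 << 2;
    static final int DESCRIPTORTYPE_INFORMATION = 1 << 3;

    private DescriptorTypeConstants() {}
}

// Hypothetical interface carrying only the proposed getType() method.
interface TypedDescriptor {
    int getType();
}

// An E-state-like descriptor that is both topological and electronic.
final class EStateSketch implements TypedDescriptor {
    public int getType() {
        // Bitwise OR combines the two type flags into one int.
        return DescriptorTypeConstants.DESCRIPTORTYPE_TOPOLOGICAL
             | DescriptorTypeConstants.DESCRIPTORTYPE_ELECTRONIC;
    }
}

final class DescriptorTypeDemo {
    // Bitwise AND isolates one flag; a nonzero result means it is set.
    static boolean hasType(TypedDescriptor d, int typeFlag) {
        return (d.getType() & typeFlag) != 0;
    }

    public static void main(String[] args) {
        TypedDescriptor estate = new EStateSketch();
        System.out.println("topological: " + hasType(estate,
                DescriptorTypeConstants.DESCRIPTORTYPE_TOPOLOGICAL));
        System.out.println("geometric:   " + hasType(estate,
                DescriptorTypeConstants.DESCRIPTORTYPE_GEOMETRIC));
    }
}
```

Since each constant occupies its own bit, a descriptor can belong to any
combination of types and the caller can test for each one independently.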

> > Related to this, currently the calculate() function of the Descriptor
> > interface returns an Object. Would it be a good idea to consider
> > returning a hash that would have (key,value) pairs corresponding to
> > (descriptor name, descriptor value)?
> 
> You already know the descriptor name before calling calculate()...

Actually I was thinking of the situation where the descriptor name would
simply be AtomTypeCount or Chi - as a result each of these descriptor
types is a class of descriptors rather than a single descriptor. The
AtomTypeCount descriptor can return an array of numbers corresponding to
counts for different atom types - obviously one way to find out which
number is which is to look at the documentation. Alternatively, the
descriptor routine could bundle the naming information with the values.
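
As a rough illustration of bundling names with values, the sketch below
returns a map instead of a bare array. To keep it self-contained it
takes a plain array of element symbols rather than a CDK molecule, and
the class name AtomTypeCountSketch and the "AtomTypeCount.<symbol>" key
format are assumptions of mine, not an existing convention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not CDK API: a descriptor "class" that returns
// (descriptor name, descriptor value) pairs rather than a bare array.
final class AtomTypeCountSketch {

    static Map<String, Integer> calculate(String[] atomSymbols) {
        // LinkedHashMap keeps the order in which atom types were first seen.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String symbol : atomSymbols) {
            // One named entry per atom type, incremented on every occurrence.
            counts.merge("AtomTypeCount." + symbol, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Heavy atoms of ethanol, given directly as element symbols.
        String[] heavyAtoms = {"C", "C", "O"};
        for (Map.Entry<String, Integer> e : calculate(heavyAtoms).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

The caller then never has to consult the documentation to learn which
position in the result corresponds to which atom type.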

> BTW, you mentioned a number of implemented descriptors... will those be 
> submitted, or can you at least say which you have done sofar and which you 
> are working on, on the cdk-devel@ (and possible the qsar-devel) list(s)...

The ones that don't overlap with what's been added to the CDK so far are:

Gravitational Indices
Carbon Type counts

Also working on a routine to evaluate atomic logP values.

-------------------------------------------------------------------
Rajarshi Guha <rx...@ps...> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Disembowelling takes guts.