Hi all,
I agree with Rajarshi that the classification is often subjective, and I
can't see a reason for having two classes for Burden, BCUT and WHIM.
In fact, Burden uses one atom label, BCUT uses six Burden matrices with
six atom labels, and WHIM applies a PCA on the 3D coordinates before
calculating the Burdens, so all of those features are eigenvalues anyway.
I would prefer an algorithm-driven classification over a chemical one.
A chemical classification is often misleading, because the same internal
coding is used.
We have three main classes:
1. Transformations on unlabelled graphs with nodes and edges
2. Transformations on labelled graphs with several atom and bond labels.
3. Pre- and postprocessing routines before applying 1 or 2, e.g. PCA, 2D
or 3D structure generation, stochastic generation of 3D structures, and
so on.
Class 1 contains, for example, complexity features and
information-theoretic features.
Class 2 contains:
2.1. Eigenvalues, e.g. Burden. BCUT is a special case: it simply calls
Burden six times, or once for each atom label you have (e.g. 60 in
JOELib2).
2.2. Radial Distribution Function (RDF), which is a generalized
autocorrelation function; the Moreau-Broto autocorrelation is a
special case.
2.3. Hash values, meaning one value per molecule. This is analogous to
the complexity and information-theoretic features, but we are allowed
to use atom labels to increase the discriminatory ability.
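As a rough illustration of what "transformation on a labelled graph"
means for 2.2, here is a minimal sketch of a Moreau-Broto style
autocorrelation over topological distances. It is not the JOELib2 or
CDK code; the function names, the adjacency-list input and the use of
the atomic number as atom label are my assumptions for the example, and
each unordered atom pair is counted once (conventions differ on this).

```python
from collections import deque


def topological_distances(adjacency):
    """All-pairs shortest path lengths (in bonds), one BFS per atom."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nbr in adjacency[atom]:
                if dist[start][nbr] is None:
                    dist[start][nbr] = dist[start][atom] + 1
                    queue.append(nbr)
    return dist


def autocorrelation(adjacency, labels, max_distance):
    """AC(d) = sum of label[i] * label[j] over atom pairs at distance d.

    Returns a list indexed by d = 0 .. max_distance. Pairs with i == j
    contribute to d = 0; each unordered pair is counted once.
    """
    dist = topological_distances(adjacency)
    n = len(adjacency)
    ac = [0.0] * (max_distance + 1)
    for i in range(n):
        for j in range(i, n):
            d = dist[i][j]
            if d is not None and d <= max_distance:
                ac[d] += labels[i] * labels[j]
    return ac


# Heavy-atom graph of ethanol, C-C-O, labelled with atomic numbers
# (a hypothetical label choice; any atom property would do).
ethanol = [[1], [0, 2], [1]]
print(autocorrelation(ethanol, [6, 6, 8], 2))  # [136.0, 84.0, 48.0]
```

Swapping the label list (e.g. all 1.0) turns the same routine into a
transformation on the unlabelled graph, which is why I would group
these features by algorithm rather than by chemistry.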
We can also distinguish by the output for one molecule: a single
feature value, feature values limited to the minimal distance in the
molecule (Burden, autocorrelation), and a constant-length feature
vector (RDF).
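To make the "constant feature vector" case concrete, here is a small
sketch of an RDF-style code sampled on a fixed grid, so the vector
length does not depend on the molecule size. Again this is only a
sketch under assumptions: the function name, the default smoothing
parameter, the grid range and the number of grid points are made up for
the example, and the usual scaling factor is left out.

```python
import math


def rdf_code(coords, labels, smoothing=25.0, r_max=4.0, n_points=8):
    """g(r) = sum over i < j of w_i * w_j * exp(-B * (r - r_ij)^2).

    coords: list of (x, y, z) tuples, labels: one atom label per atom.
    g is sampled at n_points equally spaced r values up to r_max, so the
    result always has n_points entries.
    """
    step = r_max / n_points
    grid = [step * (k + 1) for k in range(n_points)]
    values = [0.0] * n_points
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = math.dist(coords[i], coords[j])  # 3D interatomic distance
            weight = labels[i] * labels[j]
            for k, r in enumerate(grid):
                values[k] += weight * math.exp(-smoothing * (r - r_ij) ** 2)
    return values


# Two toy "molecules" of different size give vectors of the same length.
two_atoms = rdf_code([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [1.0, 1.0])
three_atoms = rdf_code(
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)], [1.0, 1.0, 1.0]
)
print(len(two_atoms), len(three_atoms))  # 8 8
```

With topological instead of 3D distances and a very sharp smoothing
this reduces to counting label products per distance, which is the
sense in which the autocorrelation is a special case.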
Have a nice and sunny weekend.
Kind regards, Joerg
Egon Willighagen wrote:
> On Thursday 26 May 2005 16:17, Rajarshi Guha wrote:
>
>>>>- do people agree that we follow Dragon's breakdown?
>>>
>>>It would be a start at least... but I would encourage other
>>>classifications too, if there are other schemes...
>>
>>Is this possible with the design of the dictionary?
>
>
> Yes. It uses pointers, so we can connect things any way we want.
>
>
>>I agree that the current 5 class breakdown we have is very broad. In
>>addition we are missing some classes, like information indices (but
>>that's because we don't have any such descriptors implemented, I think).
>>
>>My line of thought for the current classification, from a user's point of
>>view, is that rather than provide too much detail at the top level, it
>>would be less of an information overload if we provide broad categories
>>that suggest the nature of the descriptors and within each category, the
>>detailed description would be available.
>
>
> Sounds good.
>
> Egon
>
--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
(E. Hemingway)
Never mistake action for meaningful action.
(Hugo Kubinyi,2004)