From: Joerg K. W. <we...@in...> - 2005-05-27 10:22:27
Hi all,

I agree with Rajarshi that the classification is often subjective, and I can't see a reason for having two classes for Burden, BCUT, WHIM. In fact, Burden uses one atom label, BCUT uses six Burden matrices with six atom labels, and WHIM applies a PCA on the 3D coordinates before calculating the Burdens, so all those features are eigenvalues anyway.

I would prefer an algorithm-driven classification over a chemical one. A chemical classification is often misleading, because the same internal coding is used.

We have three main classes:

1. Transformations on unlabelled graphs with nodes and edges.
2. Transformations on labelled graphs with several atom and bond labels.
3. Pre- and postprocessing routines before applying 1 or 2, e.g. PCA, 2D or 3D structure generation, stochastic generation of 3D structures, and so on.

Class 1 contains, for example, complexity features and information-theoretic features.

Class 2 contains:

2.1. Eigenvalues, e.g. Burden. BCUT is a special case: it simply calls Burden six times, or as many times as you have atom labels, e.g. 60 in JOELib2.
2.2. Radial Distribution Function (RDF), which is a generalized autocorrelation function; the Moreau-Broto autocorrelation is a special case.
2.3. Hash values, which means one value for one molecule. This is analogous to complexity features and information-theoretic features, but we are allowed to use atom labels to increase the discriminatory ability.

We can also distinguish between: one feature value; a feature value limited to the minimal distance in the molecule (Burden, autocorrelation); and a constant feature vector (RDF) for one molecule.

Have a nice and sunny weekend.

Kind regards, Joerg

Egon Willighagen wrote:
> On Thursday 26 May 2005 16:17, Rajarshi Guha wrote:
>
>>>>- do people agree that we follow Dragon's breakdown?
>>>
>>>It would be a start at least... but I would encourage other
>>>classifications too, if there are other schemes...
>>
>>Is this possible with the design of the dictionary?
>
>
> Yes.
> It uses pointers, so we can connect things any way we want.
>
>
>>I agree that the current 5 class breakdown we have is very broad. In
>>addition we are missing some classes, like information indices (but
>>that's because we don't have any such descriptors implemented, I think).
>>
>>My line of thought for the current classification, from a user's point of
>>view, is that rather than provide too much detail at the top level, it
>>would be less of an information overload if we provide broad categories
>>that suggest the nature of the descriptors and within each category, the
>>detailed description would be available.
>
>
> Sounds good.
>
> Egon
>

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
(E. Hemingway)

Never mistake action for meaningful action.
(Hugo Kubinyi,2004)
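
Illustrative sketch (not part of the original message, and not JOELib2 or CDK code): the algorithm-driven view in the message can be made concrete with a Moreau-Broto style topological autocorrelation, where the value at distance d is the sum of label_i * label_j over all atom pairs that are d bonds apart. The same routine run with all labels set to 1 is a transformation on an unlabelled graph (class 1), and running it once per atom label mirrors the "calling Burden six times" remark about BCUT. All function names, the adjacency-list input, and the choice of atomic numbers as a label are assumptions made for this example.

```python
from collections import deque


def topological_distances(adjacency):
    """All-pairs shortest path lengths (in bonds), via BFS from each atom.

    adjacency[i] is the list of atom indices bonded to atom i.
    Unreachable pairs keep the value None.
    """
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def autocorrelation(adjacency, labels, max_distance):
    """Sum of labels[i] * labels[j] over atom pairs at each distance 0..max_distance."""
    dist = topological_distances(adjacency)
    n = len(adjacency)
    result = [0.0] * (max_distance + 1)
    for i in range(n):
        for j in range(i, n):  # each unordered pair once; i == j gives distance 0
            d = dist[i][j]
            if d is not None and d <= max_distance:
                result[d] += labels[i] * labels[j]
    return result


def autocorrelation_per_label(adjacency, label_sets, max_distance):
    """Call the same routine once per atom label (hypothetical helper)."""
    return {
        name: autocorrelation(adjacency, labels, max_distance)
        for name, labels in label_sets.items()
    }


# Heavy-atom graph of ethanol, C-C-O, as an adjacency list.
ethanol = [[1], [0, 2], [1]]
label_sets = {
    "unit": [1, 1, 1],           # no chemistry: unlabelled graph
    "atomic_number": [6, 6, 8],  # one possible atom label
}
features = autocorrelation_per_label(ethanol, label_sets, max_distance=2)
print(features["unit"])           # [3.0, 2.0, 1.0]
print(features["atomic_number"])  # [136.0, 84.0, 48.0]
```

Under this view the "chemical" class of a descriptor is only the choice of label vector; the algorithm stays the same.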