From: E.L. W. <e.w...@sc...> - 2004-11-24 09:16:43
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tuesday 23 November 2004 22:07, Rajarshi Guha wrote:
> I have just added the source for the gravitational indices.
>
> While coding this I was wondering about some things:
>
> * Regarding description of the descriptor, I'm placing it in the Javadoc
> and adding references to doc/ref/cheminf.bibx. This makes sense from the
> API documentation point of view. We do need to make some decisions
> regarding how descriptor routines should return some textual information
> regarding themselves and what type of information that should be.
What about this:
/**
* Returns a <code>Map</code> which specifies which descriptor
* is implemented by this class. These fields are used in the map:
* <ul>
* <li>Specification-Reference: refers to an entry in a unique dictionary
* <li>Implementation-Title: anything
* <li>Implementation-Identifier: a unique identifier for this version of
* this class
* <li>Implementation-Vendor: CDK, JOELib, or anything else
* </ul>
*/
public Map getDescriptorSpecification() {
    Hashtable specs = new Hashtable();
    specs.put("Specification-Reference", "<some-namespace>:bla");
    specs.put("Implementation-Title", this.getClass().getName());
    specs.put("Implementation-Identifier", "$Id:$"); // taken from CVS
    specs.put("Implementation-Vendor", "The Chemistry Development Kit");
    return specs;
}
The dictionary entry would contain all information about the literature, an
explanation, examples, whatever... As a reminder, we could use this format:
  <entry id="cdk:AtomCount" term="Atom Count">
    <annotation xmlns="http://www.xml-cml.org/schema/stmml"
                xmlns:custom="something">
      <documentation>
        <metadata name="dc:creator" content="mfe4"/>
        <!-- the next line is important to always be able to trace back how
             the value was calculated. The CVS version is important here,
             as the actual algorithm might change over time.
             (Idea from Joerg Wegner) -->
        <metadata name="dc:identifier" content="$Id:$"/>
        <metadata name="dc:date" content="2003-03-13"/>
      </documentation>
      <!-- there is also stuff to denote algorithmic stuff; that should define
           the available parameters and sorts.... -->
    </annotation>
    <definition>
      Descriptor that gives the number of atoms of a certain type.
      The algorithm was first proposed in <custom:cite ref="BLA"/>.
    </definition>
    <custom:bibliography>
      <bibtex:entry id="BLA"><!-- etc -->
      </bibtex:entry>
    </custom:bibliography>
    <!-- what other important info do we need to add? -->
  </entry>
(See my email around 12:10 yesterday)
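To make the link between the map and the dictionary concrete, here is a minimal sketch of how client code might read the specification and split the reference into a namespace and an entry id. The names SpecifiedDescriptor, AtomCountDescriptorSketch and SpecificationLookupDemo are hypothetical and not part of the CDK; the "cdk:AtomCount" value is simply borrowed from the dictionary example above, and the generics are modern Java rather than what the proposal uses.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface: anything that can describe which descriptor it implements. */
interface SpecifiedDescriptor {
    Map<String, String> getDescriptorSpecification();
}

/** Hypothetical descriptor that only reports its specification (no calculation here). */
class AtomCountDescriptorSketch implements SpecifiedDescriptor {
    public Map<String, String> getDescriptorSpecification() {
        Map<String, String> specs = new HashMap<>();
        // Assumed reference, matching the id of the dictionary entry in the example.
        specs.put("Specification-Reference", "cdk:AtomCount");
        specs.put("Implementation-Title", getClass().getName());
        specs.put("Implementation-Identifier", "$Id:$"); // would be filled in by CVS
        specs.put("Implementation-Vendor", "The Chemistry Development Kit");
        return specs;
    }
}

class SpecificationLookupDemo {
    /** Splits "namespace:entry" into { namespace, entry }. */
    static String[] splitReference(String reference) {
        int colon = reference.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("No namespace in reference: " + reference);
        }
        return new String[] { reference.substring(0, colon), reference.substring(colon + 1) };
    }

    public static void main(String[] args) {
        SpecifiedDescriptor descriptor = new AtomCountDescriptorSketch();
        String reference = descriptor.getDescriptorSpecification().get("Specification-Reference");
        String[] parts = splitReference(reference);
        System.out.println("dictionary namespace: " + parts[0]);
        System.out.println("dictionary entry:     " + parts[1]);
    }
}
```

With the reference split like this, a tool could look up the matching entry in the dictionary file without knowing anything else about the implementing class.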
> * Currently the return value for the calculate() method any descriptor
> routine implements is Object. So for the gravitational index which
> calculates 6 descriptors I return an ArrayList.
>
> However this allows another author to return any other type of data
> structure that is a subclass of Object. How can an end user of the
> routines uniformly handle return values from descriptor routines? I
> don't think using instanceof for various types is very elegant.
>
> One possible solution is to create a class, say, DescriptorReturnValue
> which could contain a few fields (which may be expanded as time goes
> on). I have already mentioned just using a double[] as a return value.
I think JOELib does this too: each descriptor is of a specific type and
returns a specific type of value. That would at least clearly limit the
number of returnable types. I think this is a good thing to do.
I'll update the API and stuff.
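As a rough illustration of the idea, here is a minimal sketch of a small, closed set of result types that a caller can handle uniformly without instanceof checks. All names (DescriptorResult, DoubleResult, DoubleArrayResult, DescriptorResultDemo) are hypothetical and only one possible shape for this; nothing here is the actual CDK or JOELib API.

```java
/** Hypothetical common type for every value a descriptor may return. */
interface DescriptorResult {
    /** Number of numeric values carried by this result. */
    int size();

    /** The value at the given position, 0 <= index < size(). */
    double get(int index);
}

/** A descriptor that yields a single number. */
final class DoubleResult implements DescriptorResult {
    private final double value;

    DoubleResult(double value) {
        this.value = value;
    }

    public int size() {
        return 1;
    }

    public double get(int index) {
        if (index != 0) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return value;
    }
}

/** A descriptor that yields a vector, e.g. six gravitational indices. */
final class DoubleArrayResult implements DescriptorResult {
    private final double[] values;

    DoubleArrayResult(double[] values) {
        this.values = values.clone(); // defensive copy
    }

    public int size() {
        return values.length;
    }

    public double get(int index) {
        return values[index];
    }
}

class DescriptorResultDemo {
    /** Works for any result type, with no instanceof checks. */
    static double sum(DescriptorResult result) {
        double total = 0.0;
        for (int i = 0; i < result.size(); i++) {
            total += result.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        DescriptorResult single = new DoubleResult(12.0);
        DescriptorResult six = new DoubleArrayResult(new double[] {1, 2, 3, 4, 5, 6});
        System.out.println(sum(single)); // prints 12.0
        System.out.println(sum(six));    // prints 21.0
    }
}
```

A richer result shape could be added later as another implementation of the same interface, which keeps the set of returnable types explicit.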
> Joerg pointed out that some descriptors might return more complex data
> structures. However I have always thought that descriptor values would
> be single numbers or vectors of numbers. Can anybody correct me here in
> this respect?
What about CoMFA then? That's a 3D matrix...
> * Would it be a good idea to have a DescriptorException class and have
> classes implementing the Descriptor interface throw this exception? I
> raise this issue because it is possible that a descriptor might hit a
> condition, specific to the descriptor algorithm, for which it should
> abort. Using the current CDK exceptions might not be as informative.
>
> Would this be a good idea (I'm tending to yes, given that the qsar
> package will grow in size and will expand in functionality, so it might
> be useful to have specific set of Exceptions for qsar related
> functionality)? If so, I have a version of such a class that I can add to
> CVS
The CDK had decided to remove all exceptions except CDKException, but this
might be something we need to discuss again, because you do indeed lose
specificity. So, please add it. I'll start a discussion on the devel list...
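One way to keep both positions is to make the new exception a subclass of the general one, so existing code that catches CDKException keeps working while QSAR code can be more specific. The sketch below assumes exactly that; the CDKException class here is only a stand-in declared so the example compiles on its own, and DescriptorException with its descriptorName field is a hypothetical shape, not the class Rajarshi has written.

```java
/** Stand-in for the CDK's general exception, declared here only for self-containment. */
class CDKException extends Exception {
    CDKException(String message) {
        super(message);
    }
}

/** Hypothetical QSAR-specific exception that records which descriptor failed. */
class DescriptorException extends CDKException {
    private final String descriptorName;

    DescriptorException(String descriptorName, String message) {
        super(descriptorName + ": " + message);
        this.descriptorName = descriptorName;
    }

    String getDescriptorName() {
        return descriptorName;
    }
}

class DescriptorExceptionDemo {
    /**
     * Hypothetical precondition check: a 3D descriptor aborts when no
     * 3D coordinates are available. Declared with the general type so
     * callers are free to catch either level.
     */
    static void calculate(String descriptorName, boolean has3DCoordinates) throws CDKException {
        if (!has3DCoordinates) {
            throw new DescriptorException(descriptorName, "atoms have no 3D coordinates");
        }
    }

    public static void main(String[] args) {
        try {
            calculate("GravitationalIndex", false);
        } catch (DescriptorException e) {
            // Specific handling: we know which descriptor aborted.
            System.out.println("descriptor failed: " + e.getDescriptorName());
        } catch (CDKException e) {
            // Generic handling still works for everything else.
            System.out.println("CDK error: " + e.getMessage());
        }
    }
}
```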
Egon
- --
e.w...@sc...
PhD-student on Molecular Representation in Chemometrics
Radboud University Nijmegen
http://www.cac.science.ru.nl/people/egonw/
GPG: 1024D/D6336BA6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (SunOS)
iD8DBQFBpFFpd9R8I9Yza6YRAu0wAJ9MCueUvQGJ1r9xOE3Sl7ICmcoBqgCfZkTQ
UDeSzVInE9/VXu8K0F+HsIc=
=P9rH
-----END PGP SIGNATURE-----