[Joelib-devel] Re: [Cdk-devel] QSAR

On Saturday 17 April 2004 15:02, Peter Murray-Rust wrote:
> I suggest starting not with deciding what program to write but with what
> the components of a QSAR system are, and then deciding who wants to be
> involved and what we have got, and setting some realistic scope for what is
> achievable.

Agreed. This is why we need to set up an SF project where we can write these
things down.

Here's a list:

- building a molecule database
	- read from file/internet
	- draw them yourself one by one, or insert from SMILES
- browsing the database with 2D and possibly 3D structures
- associate activities/properties with those molecules
	- preprocessing
- get mathematical (or other) descriptions of the molecules in the database
	- selection of wanted descriptions
	- ability to use external programs for this
	- descriptor value preprocessing
- statistical analysis of the database (outliers, diversity, etc.)
- model building
	- choosing the method and its parameters
- model validation
	- visual validation -> plots
	- statistical validation
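
To make the steps in this list a bit more concrete, here is a minimal sketch
of how they could chain together. Everything in it is an assumption for
illustration only: the toy database, the made-up activity values, and the
helper names (carbon_count, autoscale, fit_line, loo_q2) are hypothetical and
are not part of CDK, JOELib or Weka. A real descriptor would come from one of
those libraries or from an external program.

```python
from statistics import mean, pstdev

# Toy molecule database: SMILES strings with made-up activities
# (illustrative values only, not measured data).
database = [
    ("C", 1.0),
    ("CC", 2.1),
    ("CCC", 2.9),
    ("CCCC", 4.2),
    ("CCCCC", 4.8),
]


def carbon_count(smiles):
    """Hypothetical toy descriptor: number of 'C' characters in the SMILES."""
    return float(smiles.count("C"))


def autoscale(values):
    """Descriptor preprocessing: centre to mean 0 and scale to unit variance."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def fit_line(xs, ys):
    """Model building: ordinary least squares for y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def loo_q2(xs, ys):
    """Statistical validation: leave-one-out cross-validated q^2."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without molecule i, then predict molecule i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        press += (ys[i] - (a * xs[i] + b)) ** 2
    my = mean(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Descriptor calculation and preprocessing, model building, validation.
descriptors = autoscale([carbon_count(s) for s, _ in database])
activities = [a for _, a in database]
slope, intercept = fit_line(descriptors, activities)
q2 = loo_q2(descriptors, activities)
```

The point of the sketch is only that each list item maps onto a small,
replaceable component, so that a descriptor source or a modelling method can
be swapped without touching the rest.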

I requested a new SF project ('qsar') yesterday, after getting positive
reactions to my earlier proposal.

Joerg, I have not contacted you personally yet, because I vaguely remembered
you saying you would be on holiday (?), but I might very well be confused here...

I see JOELib as an important part of the new program: it has many descriptors 
implemented, already uses CML2 for storing results, and has an interface to 
Weka.

I also see an important role for CDK: 2D editing/display is a very important
feature here. And, I expect, some descriptors will be implemented in CDK
later this year, though this will likely not conflict with those in JOELib.
The reason why I propose using CDK's core classes should be obvious.

Hopefully, the QSAR SF project will be approved early next week. I will then
start adding requirements, analyses, etc. to the documentation, ideally
together with the others who are interested. Then we will see how the
available open-source parts fit together.

Egon