[Joelib-devel] QSAR models


Dear Nina Nikolova,
Dear All,

Yes, of course, it is definitely recommended to introduce a way to
validate models, as already discussed in my last two papers.
Agrafiotis has also been heavily critical on this point.

Model comparison without data comparison is not really useful, so both the
models and the data must be stored. I would prefer a benchmark database, or
at least a public web page. I also discussed this topic with the JCICS
editor, but ... you know chemists and their data.

So, for a useful model validation we first need benchmark data sets,
because we can only compare the hypotheses if the data sets are the same.

Furthermore, a basic 'guideline' must be available to avoid
over-/underfitted models, especially when applying feature selection.
See the feature selection papers:
http://www-ra.informatik.uni-tuebingen.de/software/joelib/users.html
The first paper contains two benchmark data sets with nearly 3000
descriptors.

For models I recommend Weka, because its models can be stored as
Java objects, which is transparent enough; or, if possible, an XML mapping
tool can be used.
For our JavaNNS interface there is still a text export, and likewise for
the libSVM interface, ...
For Matlab, things can be stored in Matlab objects.
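
To make the "stored as Java objects" idea concrete, here is a minimal sketch using only standard Java serialization. The class LinearModel and the helpers toBytes/fromBytes are hypothetical names made up for this illustration; they are not part of Weka or JOELib. As far as I know, Weka classifiers implement java.io.Serializable, so a trained classifier could probably take the place of LinearModel here, but treat that as an assumption to check against your Weka version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ModelStoreSketch {

    /** Hypothetical stand-in for a trained model; not a Weka or JOELib class. */
    static class LinearModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights;
        final double bias;

        LinearModel(double[] weights, double bias) {
            this.weights = weights.clone();
            this.bias = bias;
        }

        /** Weighted sum of the descriptor values plus the bias. */
        double predict(double[] descriptors) {
            double sum = bias;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * descriptors[i];
            }
            return sum;
        }
    }

    /** Serializes any Serializable object (here: a model) into a byte array. */
    static byte[] toBytes(Serializable model) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores an object from bytes produced by toBytes. */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Model class not on the classpath", e);
        }
    }

    public static void main(String[] args) {
        LinearModel original = new LinearModel(new double[] {0.5, -1.0}, 2.0);
        LinearModel restored = (LinearModel) fromBytes(toBytes(original));

        // The restored model should give the same prediction on the same input.
        double[] descriptors = {2.0, 1.0};
        System.out.println("original: " + original.predict(descriptors));
        System.out.println("restored: " + restored.predict(descriptors));
    }
}
```

Writing the same bytes to a file instead of a byte array gives a model file that can be published next to the benchmark data. One caveat of this approach: the stored object can only be read back when a compatible version of the model class is on the classpath, which is one reason a text or XML export can be the more transparent choice.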

No, sorry, I'm not at the ADMET conference, but I'm going to:
- Chemoinformatics, Sheffield, just to talk to others
and
- Analytica, Munich, lecture: 'Model quality' !!!

Kind regards, Joerg

Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW:    http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
                                    (E. Hemingway)

Never mistake action for meaningful action.
                               (Hugo Kubinyi, 2004)