From: Joerg K. W. <we...@in...> - 2004-08-24 13:53:23
Dear Frédéric,

1. basic and slow way:

1.1. calculate all descriptors:
sh calculateDescriptors.sh +jcc inFile.sdf outFile.sdf

1.2. calculate descriptor statistics to check descriptor occurrence:
sh statistic.sh -iSDF outFile.sdf
this will generate
outFile.sdf.statistic
outFile.sdf.binning

Because BCUT depends on the molecule size, large matrix entries, e.g.
Burden_modified_eigenvalues:Graph_potentials:18
(number=inner topological distance)
do not occur for small molecules.
You should only use BCUTs where every molecule has this value; otherwise
you should use a 'missing value' replacement strategy. See also one of my
papers for more references:
http://www-ra.informatik.uni-tuebingen.de/publikationen/2004/wegner04jcicsA.html

Programs like Dragon hide this problem: they select only a specific
number of eigenvalues and often use 0 for missing values, because the
eigenvalues are sorted in decreasing order. But you cannot expect that
this will give really good models if the descriptor contains a lot of
zeros.

1.3. select descriptors
sh descSelection.sh -iSDF outFile.sdf -oSDF selected.sdf descs.txt normal " "
where descs.txt contains all descriptors you're interested in, in the form:
Burden_modified_eigenvalues:Graph_potentials:4
Burden_modified_eigenvalues:Graph_potentials:3
Burden_modified_eigenvalues:Graph_potentials:2
Burden_modified_eigenvalues:Graph_potentials:1
Burden_modified_eigenvalues:Graph_potentials:0
...

2. developer way:
joelib.test.DescriptorCalculation uses a helper class for this kind of
descriptor, which is
joelib.desc.util.AtomPropertyDescriptors
If you use:
setDescriptors2Calculate(new String[]{"Burden_modified_eigenvalues"});
instead of the default settings, you will get only the BCUT entries.
You can use this class on your own, or you can modify the default-jcc
descriptor set in the variable
defDescNames
in line -185- in DescriptorCalculation.
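
The advice in 1.2 (use only BCUT entries that every molecule has) can be sketched roughly as follows. This is only an illustration and not part of JOELib: the class name CommonDescriptors, the Map-per-molecule representation, and the numeric values are all assumptions, and reading the values from the SDF file is left out. It needs Java 9 or later because of Map.of and List.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not a JOELib class.
public class CommonDescriptors {

    // Returns the descriptor names that occur in every molecule,
    // sorted lexicographically.
    public static List<String> presentInAll(List<Map<String, Double>> molecules) {
        if (molecules.isEmpty()) {
            return new ArrayList<>();
        }
        // Start with the names of the first molecule, then drop every
        // name that is missing from any other molecule.
        Set<String> common = new TreeSet<>(molecules.get(0).keySet());
        for (Map<String, Double> molecule : molecules) {
            common.retainAll(molecule.keySet());
        }
        return new ArrayList<>(common);
    }

    public static void main(String[] args) {
        String prefix = "Burden_modified_eigenvalues:Graph_potentials:";
        // Made-up values: the small molecule has no entry 18.
        Map<String, Double> small = Map.of(prefix + "0", 1.9, prefix + "1", 1.2);
        Map<String, Double> large =
                Map.of(prefix + "0", 2.4, prefix + "1", 1.7, prefix + "18", 0.3);
        // Prints one name per line, the same layout as descs.txt in 1.3.
        for (String name : presentInAll(List.of(small, large))) {
            System.out.println(name);
        }
    }
}
```

The printed names could be saved as descs.txt for the selection step in 1.3; entry 18 is dropped because the small molecule does not have it.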
The arrays generated by the AtomPropertyDescriptors class are
automatically mapped to single value descriptors for which you can
calculate the statistics.

So, with some 'playing around' you should be able to store such results
directly in a database via JDBC, avoiding the overhead of
calculating all the other descriptors and storing them in files.

Kind regards, Joerg

> Dear Joerg,
> It's me again. Could you tell me the way to compute only BCUT descriptors using the CalculateDescriptors.sh and more generally how can I select the descriptors for computation ?
> Thanks for your help
> Fred
>
> -----Original Message-----
> From: Joerg K. Wegner [mailto:we...@in...]
> Sent: Tuesday, August 24, 2004 12:08 PM
> To: JOELib help
> Cc: Frédéric Ooms
> Subject: Re: Convert help
>
>
> Dear Frédéric,
>
> beside the detailed bug report you can also try the extended descriptor
> calculation method provided by
>
> calculateDescriptors.sh -- help
>
> I recommend this one anyway. Why ?
> 1. The primitive calculation in convert calculates the descriptors by
> using default values (only one atom property).
> 2. The extended class
> joelib.test.DescriptorCalculation
> calculates the descriptors for all available atom properties.
> You can get a list by using:
> sh convert.sh +sad
>
> So this affects especially the BCUT, RDF and GTCI descriptors.
> Furthermore for BCUT and RDF you have additional parameters, like the
> smoothing factor for RDF and the k-parameter in BCUT.
> For RDF three different smoothing parameters are used, when using the
> extended calculation class.
>
> Then these features can be rescaled using the one dimensional
> Mahalanobis distance (z-score) normalization with mean=0 and
> standardDeviation=1.
>
> Kind regards, Joerg
>
>
>>Dear Frédéric,
>>
>>please add a bug request with detailed message, system, and example
>>file
>>to the tracking system:
>>
>>http://sourceforge.net/tracker/?group_id=39708&atid=425969
>>
>>Kind regards, Joerg
>>
>>
>>>Dear colleagues,
>>>I am a new JOELIB user as well as a Java newbie... I am using the
>>>following command line to compute descriptors on a molecule
>>>Convert -h +d test.mol
>>>But I get the following error message:
>>>Exception in thread "main" java.lang.NullPointerException at
>>>joelib.Util.JOEPropertyHelper.getProperty....
>>>Could you help me ?
>>>Regards,
>>>Fred Ooms
>>>---------------------------------------------------------------
>>>Frédéric Ooms, Ph.D.
>>>Chemistry Project Manager, Euroscreen S.A.
>>>Rue Adrienne Bolland, 47
>>>B-6041 Gosselies
>>>Tel. +32 71 348 500
>>>Fax: +32 71 348 519
>>>
>>
>

-- 
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action.
(E. Hemingway)

Never mistake action for meaningful action.
(Hugo Kubinyi,2004)
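
As a footnote to the rescaling mentioned in the quoted message: a minimal sketch of z-score normalization of one descriptor column to mean 0 and standard deviation 1. It is not taken from JOELib. The class name ZScore is hypothetical, and dividing by n (population standard deviation) is an assumption, since the thread does not say which variant is used.

```java
// Hypothetical helper for illustration only; not a JOELib class.
public class ZScore {

    // Rescales the values of one descriptor to mean 0 and standard
    // deviation 1. A constant column is mapped to all zeros.
    public static double[] normalize(double[] values) {
        int n = values.length;
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;

        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        // Assumption: population standard deviation (divide by n).
        double sd = Math.sqrt(sumSquares / n);

        double[] scaled = new double[n];
        for (int i = 0; i < n; i++) {
            scaled[i] = (sd == 0.0) ? 0.0 : (values[i] - mean) / sd;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Made-up descriptor values for three molecules.
        for (double z : normalize(new double[] {1.0, 2.0, 3.0})) {
            System.out.println(z);
        }
    }
}
```

A descriptor that is 0 for most molecules, as discussed for missing BCUT values, stays dominated by one repeated value after this rescaling, so normalization does not repair it.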