From: Joerg K. W. <we...@in...> - 2004-06-15 15:49:31
Hi naji,

I will forward this to the mailing list as well; I hope you don't mind. I think this is a general question that is also interesting for other users/developers.

> 1) >NH (participating in a ring)
> [#7;r;H1;$(*-*);$(*-*)] matches NH in C1CNC=CC=C1, but
> it doesn't match an NH of a heteroaromatic ring, for example cytosine
> (what could I do?)

As said, aromatic means aromatic bonds (:) and not aliphatic bonds (-). So use $(*:*) instead of $(*-*), or try another representation, e.g. a nitrogen in a ring with two heavy-atom neighbours whose implicit + explicit number of hydrogens is still one:

[#7;r;H1;Q2]

> 2) [c+0X3&H0](:n)(:a) with this pattern I wanted to match 2*(ring >C=)
> (participating in two fused non-benzene rings)
> in adenosine (heteroaromatic ring)

The number of rings an atom participates in is [R]. The ring size is [r]. So what about

[c;R2](:n)

I think (:a) will always match if you use [c], so I would skip it. Aromatic [c,n] or non-aromatic [C,N]? I think I have still not completely understood your pattern request.

> [c+0X3&H0](:n)(A) with this pattern I wanted to match 1*(ring >C=)
> (two single bonds participating in a non-benzene ring) in cytosine.
> There is a problem when I run the program (the same program as
> joelib/src/joelib/desc/types/logP). Although *(ring >C=) (adenosine) and
> *(ring >C=) (cytosine) are different (with regard to the contribution
> group), I can't manage to distinguish them. How could I do that? In fact,
> I would like to estimate standard Gibbs energy changes of biochemical
> compounds, and I am trying to convert the different groups (nitrogen,
> oxygen, etc.) of a group contribution table into SMARTS patterns. For
> small molecules this is not very difficult, but for the rings (benzene,
> heteroaromatic rings…) I encounter some problems distinguishing the
> different molecular fragments of those rings.
I recommend using the SMARTS matching code that was added to CVS, so you can generate and test different SMARTS patterns much faster. Just add the single-molecule MDL SD file with a .mol extension to

joelib/src/resource/smartsEvaluation/*.mol

and add your test patterns to

joelib/src/resource/smartsEvaluation/evaluation.txt

Then you can try different things and also see the internal representation of these patterns in the result file, so you can work in a kind of 'debugging mode'.

> 3) >C= (participating in two fused non-benzene rings)
> [r;!a]~[#6]~[r;!a] two non-aromatic rings connected via
> aliphatic/aromatic via any bonds
> but how could I find the SMARTS pattern if I have adenosine as molecule?

Is the adenine part really non-aromatic? If you only want to exclude aromatic six-membered rings, I would use

[r6;!a]~[#6]~[r6;!a]

But I'm not sure if this will match, because the 5-ring really looks imidazole-like :-)

> 4) >C= (the formal double bond and a formal single bond participating
> in a non-benzene ring)
> difference to -CH= ? For example, in cytosine there is 1 fragment
> of >C= (the formal double bond and a formal single bond participating in
> a non-benzene ring) and another fragment -CH= (participating in a
> non-benzene ring)

Definitely aromatic in cytosine, because of keto-enol tautomerism! So I would say

n:c:n

> 5) how could I distinguish (with a SMARTS pattern) the aromatic cH of
> benzene, -CH= (participating in a benzene ring), from the aromatic cH of
> tyrosine (cH of the phenol fragment)? The contribution of those fragments
> is different.

From the SMARTS point of view there is no -CH= in benzene, because the 'atom typer' expert systems (chemical kernel) in JOELib assign 'c:c' to all carbon atoms connected by aromatic bonds. So the only difference to the phenol part of tyrosine is 'c-OH', the aliphatic bond to the hydroxyl group.
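To make the [R] versus [r] distinction from point 2 concrete, here is a minimal Python sketch. It does not use JOELib at all: `ring_descriptors` and `PURINE_BONDS` are hypothetical names, the graph is a hand-written purine ring skeleton (the fused 6+5 system of adenine), and the ring set is simply "the smallest cycles", which I assume is good enough for a simple fused bicycle but is not a general ring perception algorithm.

```python
# Hypothetical helper, not part of JOELib: ring descriptors on a bare graph.

def ring_descriptors(bonds):
    """Return (ring_count, smallest_ring_size) dicts keyed by atom label.

    Assumes one connected ring system in which the smallest cycles form a
    valid ring set (true for simple fused bicycles, not for every molecule).
    """
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    cycles = set()

    def walk(start, current, path):
        # Depth-first search that records every simple cycle through `start`.
        for neighbour in adjacency[current]:
            if neighbour == start and len(path) >= 3:
                cycles.add(frozenset(path))
            elif neighbour not in path:
                walk(start, neighbour, path + [neighbour])

    for atom in adjacency:
        walk(atom, atom, [atom])

    # Number of independent rings in a connected graph: bonds - atoms + 1.
    n_rings = len(bonds) - len(adjacency) + 1
    rings = sorted(cycles, key=len)[:n_rings]

    # Analogue of [R<n>]: in how many rings of the set the atom takes part.
    ring_count = {atom: sum(atom in ring for ring in rings) for atom in adjacency}
    # Analogue of [r<n>]: size of the smallest ring containing the atom.
    smallest = {
        atom: min((len(ring) for ring in rings if atom in ring), default=0)
        for atom in adjacency
    }
    return ring_count, smallest


# Ring skeleton of purine, heavy atoms only, bond orders ignored.
PURINE_BONDS = [
    ("N1", "C2"), ("C2", "N3"), ("N3", "C4"), ("C4", "C5"), ("C5", "C6"),
    ("C6", "N1"), ("C5", "N7"), ("N7", "C8"), ("C8", "N9"), ("N9", "C4"),
]

ring_count, ring_size = ring_descriptors(PURINE_BONDS)
for atom in ("C2", "C4", "C5", "C8"):
    print(atom, "R =", ring_count[atom], "r =", ring_size[atom])
```

In this sketch only the two fusion carbons C4 and C5 take part in two rings, which is the property that [c;R2] asks for; all other ring atoms have a ring count of one.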
Again, please use the matching module submitted to CVS, so you can try the different options. Furthermore, you can have a look at

http://www-ra.informatik.uni-tuebingen.de/software/joelib/tutorial/atomtyper.html

and you can see that aromaticity is the first step of the atom typing process, followed by the hybridization and the implicit valence. The SMARTS matching and also the real 'atom typer' work on the previously assigned 'expert systems' of the chemical kernel.

Kind regards, Joerg

-- 
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi, 2004)