JOELib Tutorial: A Java based cheminformatics/computational chemistry package
Chapter 3. Molecule operation methods and classes
The SMiles ARbitrary Target Specification (SMARTS) [smarts] is based on the SMILES notation [wei88,wei89].
Example 3-3. SMARTS substructure search
    // benzene
    String smartsPattern = "c1ccccc1";
    JOESmartsPattern smarts = new JOESmartsPattern();

    // parse, initialize and generate SMARTS pattern
    // to allow fast pattern matching
    if (!smarts.init(smartsPattern)) {
        System.err.println("Invalid SMARTS pattern.");
    }

    // find substructures (mol is a previously loaded molecule)
    smarts.match(mol);
    Vector matchList = smarts.getUMapList();
    System.out.println("Pattern found " + matchList.size() + " times.");
The standard SMARTS definition [smarts] can be obtained directly from the Daylight homepage. The extended SMARTS definitions that are not covered in the Daylight tutorial are explained here. Some of these definitions are analogous to the definitions in MOE, to stay as close to common practice as possible.
Table 3-2. Substructure search expressions
SMARTS entry | description |
---|---|
G<n> | Atom of periodic group n |
Q<n> | n explicit bonds to heavy atoms |
D<n> | n explicit bonds (including H atoms). This is the standard definition; the old OELib, by contrast, counts only bonds to heavy atoms. |
X<n> | n bonds in total (including implicit and explicit H atoms). This is the standard definition; the old OELib, by contrast, counts only implicit hydrogen bonds. |
^<n> | Hybridisation: sp^n. See joelib/data/plain/atomtype.txt for more information. |
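The difference between Q<n>, D<n> and X<n> is easiest to see on a single atom. The following is a minimal sketch in plain Java, not JOELib code: the class `BondCountSketch` and its toy atom model (a list of explicit neighbour element symbols plus an implicit hydrogen count) are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model, not the JOELib API: counts bonds the way Q, D and X are described above. */
final class BondCountSketch {

    /** Q<n>: explicit bonds to heavy (non-hydrogen) atoms. */
    static int q(List<String> explicitNeighbours) {
        int count = 0;
        for (String element : explicitNeighbours) {
            if (!element.equals("H")) {
                count++;
            }
        }
        return count;
    }

    /** D<n>: all explicit bonds, including bonds to explicit H atoms. */
    static int d(List<String> explicitNeighbours) {
        return explicitNeighbours.size();
    }

    /** X<n>: all bonds, i.e. explicit bonds plus implicit hydrogens. */
    static int x(List<String> explicitNeighbours, int implicitHydrogens) {
        return explicitNeighbours.size() + implicitHydrogens;
    }

    public static void main(String[] args) {
        // A methyl carbon with one hydrogen written explicitly:
        // explicit neighbours are C and H, two more hydrogens are implicit.
        List<String> neighbours = Arrays.asList("C", "H");
        System.out.println("Q" + q(neighbours));    // Q1
        System.out.println("D" + d(neighbours));    // D2
        System.out.println("X" + x(neighbours, 2)); // X4
    }
}
```

For this atom the three expressions give different counts, which is why a pattern written with D may stop matching once it is rewritten with Q or X.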
To understand the substructure search and the chirality search functionality, it can be useful to have a look at the atom type assignment process in the Section called Assigning atom types, aromatic flags, hybridization and hydrogens.
For assigning atom types using a geometry-based algorithm, have a look at the paper of Meng and Lewis [ml91].
For a pattern-based approach, see the Programmable Atom Typer (PATTY) [bs93].
Transformations of chemical structures (joelib.data.JOEChemTransformation) can be used for a pH value correction (joelib.data.JOEPhModel). The TRANSFORM patterns are defined in joelib/data/plain/phmodel.txt. An important property of SMARTS atoms is the vector binding number. A vector binding number is written inside a bracketed SMARTS atom expression as ':' followed by a number, for example [O:1]. Do not confuse it with the aromatic bond symbol ':', which must stand between two SMARTS atom expressions. Here are some abstract examples that illustrate the transformation definitions:
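To make the distinction between a vector binding number and an aromatic bond concrete, here is a small sketch in plain Java that pulls the binding numbers out of a pattern string with a regular expression. It is not how JOELib parses SMARTS: the class `VectorBindingSketch` is hypothetical, and the regular expression is a simplification that ignores nested brackets such as recursive SMARTS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: finds ":<number>" directly before the closing ']' of an atom expression. */
final class VectorBindingSketch {

    // '[' , any characters that are not brackets, ':' , digits, ']'
    private static final Pattern BINDING =
            Pattern.compile("\\[[^\\[\\]]*:(\\d+)\\]");

    /** Returns the vector binding numbers in order of appearance. */
    static List<Integer> bindingNumbers(String smarts) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = BINDING.matcher(smarts);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Two bracketed atoms with binding numbers 1 and 2.
        System.out.println(bindingNumbers("[O:1]=[C:2]O"));
        // The ':' between the brackets is an aromatic bond, not a binding number.
        System.out.println(bindingNumbers("[c:1]:[c:2]"));
        // No brackets, so ':' can only be an aromatic bond here.
        System.out.println(bindingNumbers("c:c"));
    }
}
```

A ':' only counts as a vector binding number when it sits inside the square brackets; outside the brackets it is read as a bond.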
Delete atoms:
TRANSFORM O=CO[#1:1] >> O=CO
Delete all atoms that have a vector binding number on the left side and no equivalent vector binding number on the right side. If the right side were O=CO[#1:1], the H atom would not be deleted.
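The deletion rule amounts to a set difference between the binding numbers on the two sides of '>>'. The sketch below shows that idea in plain Java; `DeleteRuleSketch` is a hypothetical helper, not part of JOELib, and it uses a simplified regular expression instead of a real SMARTS parser.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: which binding numbers does a TRANSFORM line delete? */
final class DeleteRuleSketch {

    private static final Pattern BINDING =
            Pattern.compile("\\[[^\\[\\]]*:(\\d+)\\]");

    /** Collects the vector binding numbers that occur in one side of a rule. */
    static Set<Integer> bindings(String smarts) {
        Set<Integer> numbers = new TreeSet<>();
        Matcher m = BINDING.matcher(smarts);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group(1)));
        }
        return numbers;
    }

    /** Binding numbers present left of ">>" but missing on the right. */
    static Set<Integer> deletedBindings(String rule) {
        String[] sides = rule.replaceFirst("^\\s*TRANSFORM\\s+", "").split(">>", 2);
        if (sides.length != 2) {
            throw new IllegalArgumentException("Missing '>>' in rule: " + rule);
        }
        Set<Integer> deleted = bindings(sides[0]);
        deleted.removeAll(bindings(sides[1]));
        return deleted;
    }

    public static void main(String[] args) {
        // Atom 1 (the hydrogen) has no counterpart on the right: it is deleted.
        System.out.println(deletedBindings("TRANSFORM O=CO[#1:1] >> O=CO"));
        // Atom 1 appears on both sides: nothing is deleted.
        System.out.println(deletedBindings("TRANSFORM O=CO[#1:1] >> O=CO[#1:1]"));
    }
}
```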
Change atom type:
TRANSFORM O=C[O:1] >> O=C[N:1]
Replace the mapped oxygen atom of a carboxyl group with a nitrogen atom, giving an amide. Only atoms that have the same vector binding number on both sides but different atom types are changed.
Change atom charge:
TRANSFORM O=C[O:1] >> O=C[O-:1]
Change the mapped oxygen atom of a carboxyl group to an oxygen atom with a negative charge. Only atoms that have the same vector binding number on both sides but different atom charges are changed.
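Both the atom type rule and the atom charge rule follow the same scheme: atoms are paired by vector binding number, and a pair whose atom expressions differ is changed. The following plain Java sketch shows that pairing step under simplifying assumptions; `AtomChangeSketch` is a hypothetical class, not JOELib code, and it compares the raw text of the atom expressions rather than parsed atom types or charges.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: pairs mapped atoms across ">>" and reports the ones that differ. */
final class AtomChangeSketch {

    // Group 1: atom expression without the binding number; group 2: the number.
    private static final Pattern MAPPED_ATOM =
            Pattern.compile("\\[([^\\[\\]:]*):(\\d+)\\]");

    /** Maps each vector binding number to the atom expression it is attached to. */
    static Map<Integer, String> mappedAtoms(String smarts) {
        Map<Integer, String> atoms = new TreeMap<>();
        Matcher m = MAPPED_ATOM.matcher(smarts);
        while (m.find()) {
            atoms.put(Integer.parseInt(m.group(2)), m.group(1));
        }
        return atoms;
    }

    /** Lists "number: before -> after" for mapped atoms whose expression changes. */
    static List<String> changedAtoms(String rule) {
        String[] sides = rule.replaceFirst("^\\s*TRANSFORM\\s+", "").split(">>", 2);
        if (sides.length != 2) {
            throw new IllegalArgumentException("Missing '>>' in rule: " + rule);
        }
        Map<Integer, String> left = mappedAtoms(sides[0]);
        Map<Integer, String> right = mappedAtoms(sides[1]);
        List<String> changes = new ArrayList<>();
        for (Map.Entry<Integer, String> atom : left.entrySet()) {
            String after = right.get(atom.getKey());
            if (after != null && !after.equals(atom.getValue())) {
                changes.add(atom.getKey() + ": " + atom.getValue() + " -> " + after);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        System.out.println(changedAtoms("TRANSFORM O=C[O:1] >> O=C[N:1]"));  // atom type
        System.out.println(changedAtoms("TRANSFORM O=C[O:1] >> O=C[O-:1]")); // atom charge
    }
}
```

Atoms without a vector binding number, such as the carbonyl O and C in these rules, only take part in matching and are never reported as changed.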
Change bond type:
TRANSFORM [O:1]=[C:2]O >> [O:1]-[C:2]O
Change the double bond of a carboxyl group to a single bond. The bond to be changed must lie between two SMARTS atom expressions that carry the same vector binding numbers on both sides.
If you want to use another pH value model, you can simply change the joelib.data.JOEPhModel.resourceFile=joelib/data/plain/phmodel.txt entry in the joelib.properties file.
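As a rough illustration of how such an entry can be read, the sketch below loads properties text with the standard `java.util.Properties` class and looks up the key named above. It does not use JOELib's own property handling; the class `PhModelPropertySketch`, the fallback to joelib/data/plain/phmodel.txt, and the path `my/models/phmodel.txt` are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative only: reads the pH model entry from properties text. */
final class PhModelPropertySketch {

    static final String KEY = "joelib.data.JOEPhModel.resourceFile";
    // Assumed fallback: the value shown in the tutorial text.
    static final String DEFAULT_MODEL = "joelib/data/plain/phmodel.txt";

    /** Returns the configured pH model resource, or the assumed default. */
    static String phModelResource(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(KEY, DEFAULT_MODEL);
    }

    public static void main(String[] args) {
        // Hypothetical override; "my/models/phmodel.txt" is a made-up path.
        String custom = KEY + "=my/models/phmodel.txt\n";
        System.out.println(phModelResource(custom));
        // Without an entry the sketch falls back to the assumed default.
        System.out.println(phModelResource(""));
    }
}
```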