SMiles ARbitrary Target Specification (SMARTS)-substructure search

SMARTS basics

The SMiles ARbitrary Target Specification (SMARTS) [smarts] is based on the SMILES notation [wei88,wei89].

Example 3-3. SMARTS substructure search

// benzene
String smartsPattern = "c1ccccc1";
JOESmartsPattern smarts = new JOESmartsPattern();

// parse, initialize and generate SMARTS pattern
// to allow fast pattern matching
if (!smarts.init(smartsPattern))
{
  System.err.println("Invalid SMARTS pattern.");
}

// find substructures in mol, a molecule that has already been loaded
smarts.match(mol);
Vector matchList = smarts.getUMapList();

System.out.println("Pattern found "+matchList.size()+" times.");
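A symmetric pattern such as benzene can be laid onto the same ring in several ways, so the raw number of mappings and the number of reported matches may differ. The following standalone sketch does not use JOELib. It assumes that getUMapList() reports every matched set of atoms only once, which is how the method name reads, and it shows what such a filter could look like. The class UniqueMatches and the helper ringMappings are hypothetical names introduced for this illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UniqueMatches {
    // Keeps the first mapping for every distinct set of matched atom indices.
    // Assumption: two mappings count as the same match if they cover the same atoms.
    static List<int[]> unique(List<int[]> matches) {
        Set<String> seen = new HashSet<>();
        List<int[]> result = new ArrayList<>();
        for (int[] match : matches) {
            int[] sorted = match.clone();
            Arrays.sort(sorted);
            if (seen.add(Arrays.toString(sorted))) {
                result.add(match);
            }
        }
        return result;
    }

    // Hypothetical helper: every way to lay a six-membered ring pattern
    // onto ring atoms 1..6 (six start atoms, two directions each).
    static List<int[]> ringMappings() {
        List<int[]> all = new ArrayList<>();
        for (int start = 0; start < 6; start++) {
            int[] forward = new int[6];
            int[] backward = new int[6];
            for (int i = 0; i < 6; i++) {
                forward[i] = (start + i) % 6 + 1;
                backward[i] = (start - i + 6) % 6 + 1;
            }
            all.add(forward);
            all.add(backward);
        }
        return all;
    }

    public static void main(String[] args) {
        List<int[]> all = ringMappings();
        List<int[]> uniq = unique(all);
        // prints "12 raw mappings, 1 unique"
        System.out.println(all.size() + " raw mappings, " + uniq.size() + " unique");
    }
}
```

Under this assumption, the benzene pattern from Example 3-3 would be counted once per aromatic six-membered ring, not twelve times.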

SMARTS definition

The standard SMARTS definition [smarts] can be obtained directly from the Daylight homepage. The extended SMARTS definitions that are not covered by the Daylight tutorial are explained here. Some of these definitions are analogous to the definitions in MOE, to stay as close to common practice as possible.

Table 3-2. Substructure search expressions

SMARTS entry   Description
G<n>           Atom of periodic group n.
Q<n>           n explicit bonds to heavy atoms.
D<n>           n explicit bonds (including bonds to H atoms). This is the standard definition; only the old OELib counts bonds to heavy atoms only.
X<n>           n bonds in total (including bonds to implicit and explicit H atoms). This is the standard definition; only the old OELib counts implicit hydrogen bonds only.
^<n>           Hybridisation: sp<n>. See joelib/data/plain/atomtype.txt for more information.
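The difference between Q, D and X is easiest to see on a single atom. The following standalone sketch does not use JOELib. It models an atom only by three counts and derives the values that the table describes. The class BondCounts and its fields are hypothetical names chosen for this illustration.

```java
final class BondCounts {
    final int heavyNeighbors;    // explicit bonds to non-hydrogen atoms
    final int explicitHydrogens; // hydrogens stored as real atoms in the graph
    final int implicitHydrogens; // hydrogens implied by valence, not stored as atoms

    BondCounts(int heavyNeighbors, int explicitHydrogens, int implicitHydrogens) {
        this.heavyNeighbors = heavyNeighbors;
        this.explicitHydrogens = explicitHydrogens;
        this.implicitHydrogens = implicitHydrogens;
    }

    // Q<n>: explicit bonds to heavy atoms only.
    int q() {
        return heavyNeighbors;
    }

    // D<n>: all explicit bonds, including those to explicit H atoms.
    int d() {
        return heavyNeighbors + explicitHydrogens;
    }

    // X<n>: all bonds, including those to implicit and explicit H atoms.
    int x() {
        return heavyNeighbors + explicitHydrogens + implicitHydrogens;
    }

    public static void main(String[] args) {
        // Methyl carbon of ethane stored without hydrogen atoms:
        // one heavy neighbor, no explicit H, three implicit H.
        BondCounts methyl = new BondCounts(1, 0, 3);
        // prints "Q1 D1 X4"
        System.out.println("Q" + methyl.q() + " D" + methyl.d() + " X" + methyl.x());
    }
}
```

If the same carbon is stored with its three hydrogens as explicit atoms, Q stays 1, D becomes 4 and X remains 4.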

To understand the substructure search and the chirality search functionality, it can be useful to look at the atom type assignment process in the section "Assigning atom types, aromatic flags, hybridization and hydrogens".

For assigning atom types with a geometry-based algorithm, see the paper of Meng and Lewis [ml91].

Programmable Atom Typer (PATTY)

The Programmable Atom Typer (PATTY) is described in [bs93].

SMARTS based structure modification

Transformations of chemical structures (joelib.data.JOEChemTransformation) can be used for a pH value correction (joelib.data.JOEPhModel). The transformation (TRANSFORM) patterns are defined in joelib/data/plain/phmodel.txt. An important property of SMARTS atoms is the vector binding number. A vector binding number is a ':' followed by a number, appended to the SMARTS atom expression inside the enclosing square brackets []. Do not confuse it with the aromatic bond expression ':', which must stand between two SMARTS atom expressions. Here are some abstract examples to understand the transformation definitions:
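The distinction between a vector binding number and an aromatic bond can be checked mechanically: the first appears inside square brackets, the second between two atom expressions. The following standalone sketch does not use JOELib and is not its parser. It extracts binding numbers with a regular expression and ignores nested brackets such as recursive SMARTS. The class VectorBindings and the sample patterns are hypothetical and are not taken from phmodel.txt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VectorBindings {
    // A bracketed atom expression that ends with ':' and a number, e.g. [O:1].
    // Simplification: nested brackets (recursive SMARTS) are not handled.
    private static final Pattern BOUND_ATOM =
            Pattern.compile("\\[[^\\[\\]]*:(\\d+)\\]");

    // Returns the vector binding numbers in the order in which they appear.
    static List<Integer> bindingNumbers(String smarts) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = BOUND_ATOM.matcher(smarts);
        while (matcher.find()) {
            numbers.add(Integer.parseInt(matcher.group(1)));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Illustrative patterns only.
        System.out.println(bindingNumbers("O=C[O:1][H:2]")); // prints [1, 2]
        // The ':' between the two brackets is an aromatic bond, not a binding number.
        System.out.println(bindingNumbers("[c:1]:[c:2]"));   // prints [1, 2]
        // No brackets, so the ':' can only be an aromatic bond.
        System.out.println(bindingNumbers("c:c"));           // prints []
    }
}
```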

If you want to use another pH value model, you can simply change the joelib.data.JOEPhModel.resourceFile=joelib/data/plain/phmodel.txt entry in the joelib.properties file.
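The entry follows the usual Java properties syntax, so its effect can be tried out with java.util.Properties alone. The following standalone sketch does not use JOELib and does not show how JOELib itself reads the file. The class PhModelConfig, the custom path my/models/phmodel_custom.txt and the fallback to the default path are assumptions made for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PhModelConfig {
    static final String KEY = "joelib.data.JOEPhModel.resourceFile";
    static final String DEFAULT_RESOURCE = "joelib/data/plain/phmodel.txt";

    // Parses properties text and returns the configured pH model resource.
    // Assumption of this sketch: fall back to the default path if the key is absent.
    static String resourceFile(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(KEY, DEFAULT_RESOURCE);
    }

    public static void main(String[] args) {
        // Hypothetical custom model file.
        String custom = KEY + "=my/models/phmodel_custom.txt\n";
        System.out.println(resourceFile(custom)); // prints my/models/phmodel_custom.txt
        System.out.println(resourceFile(""));     // prints joelib/data/plain/phmodel.txt
    }
}
```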