JOELib Tutorial: A Java based cheminformatics/computational chemistry package
Chapter 3. Molecule operation methods and classes
Atom types can be assigned to the atoms of a molecule using only topological information and SMARTS substructure search. For more specialized atom types, such as special chirality and Z/E-isomerism descriptors, atom property descriptors are a better choice (see the Section called Atom properties in Chapter 5).
As already discussed in our three papers on feature selection and model building [wfz04a,wfz04b,fwz04], descriptor calculation is the last step after calling four different expert systems, so you should carefully check your descriptor results when predicting values with models that you did not build yourself.
In our opinion, a 'standard' (e.g., JOELib/OpenBabel, atomTyperVersion=1.0), formulated as a classification problem, should exist for every expert system, so that one can simply say: calculate the descriptors for the already mentioned standard. The formulation as a classification problem in a PUBLIC database is required to test your/our implemented atom typer against this standard. Let's see if we can ever find the time and men-/women-power to formulate and test such a standard ...
Table 3-4. Process of assigning atom types
Molecule | | | | |
---|---|---|---|---|
aromaticity | hybridization | implicit valence | atom types | descriptor calculation algorithm |
SMARTS without D<n> and ^<n> | | | | |
Aromatic flags can be assigned to atoms using SMARTS substructure patterns (see the Section called SMARTS definition) defined in the file joelib/data/plain/aromatic.txt. All SMARTS patterns except D<n> (explicit bonds) and ^<n> (hybridization) are allowed. Chiral atoms are also allowed; they are evaluated with the XYZVector.calcTorsionAngle(...) method.
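To illustrate why a torsion angle is useful for chiral atoms, the following standalone sketch computes a signed torsion angle from four 3D points. It does not call JOELib and does not reproduce the signature of XYZVector.calcTorsionAngle(...); the class name `TorsionSketch` and all method names are assumptions made for this example only. The point is that mirror-image arrangements yield torsion angles of opposite sign.

```java
/** Standalone sketch; the class and method names are hypothetical, not JOELib API. */
final class TorsionSketch {

    static double[] sub(double[] p, double[] q) {
        return new double[] {p[0] - q[0], p[1] - q[1], p[2] - q[2]};
    }

    static double[] cross(double[] u, double[] v) {
        return new double[] {
            u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]
        };
    }

    static double dot(double[] u, double[] v) {
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
    }

    /** Signed torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]. */
    static double torsionDegrees(double[] p1, double[] p2, double[] p3, double[] p4) {
        double[] b1 = sub(p2, p1);
        double[] b2 = sub(p3, p2);
        double[] b3 = sub(p4, p3);
        double[] n2 = cross(b2, b3);
        // atan2 keeps the sign, which is what separates mirror images.
        double y = Math.sqrt(dot(b2, b2)) * dot(b1, n2);
        double x = dot(cross(b1, b2), n2);
        return Math.toDegrees(Math.atan2(y, x));
    }

    public static void main(String[] args) {
        double[] p1 = {1, 0, 0};
        double[] p2 = {0, 0, 0};
        double[] p3 = {0, 0, 1};
        double[] p4 = {0, 1, 1};
        double[] mirrored = {0, -1, 1}; // p4 reflected through the xz-plane
        System.out.println(torsionDegrees(p1, p2, p3, p4));
        System.out.println(torsionDegrees(p1, p2, p3, mirrored));
    }
}
```

The two printed angles have the same magnitude and opposite signs, so comparing the sign against an expected value is enough to tell the two arrangements apart.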
To assign atom hybridizations, aromaticity flags must already have been assigned. All INTHYB definitions in the file joelib/data/plain/atomtype.txt are used to get the atom hybridizations.
To assign atom types, aromaticity flags and atom hybridizations must already have been assigned. All EXTTYP definitions in the file joelib/data/plain/atomtype.txt are used to get the atom types. These are mainly used for the file conversion process and for descriptor calculation algorithms.
To assign the implicit valence to atoms, aromaticity flags, atom hybridizations and atom types must already have been assigned. All IMPVAL definitions in the file joelib/data/plain/atomtype.txt are used to calculate the number of implicit hydrogens for each atom.
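The prerequisites stated in the paragraphs above can be summarized as a small dependency check. The sketch below is only an illustration of that ordering; `TypingPipelineSketch`, `Step` and their methods are hypothetical names and not part of the JOELib API, and no actual typing is performed.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the step ordering; none of these names are JOELib API. */
final class TypingPipelineSketch {

    /** The four assignment steps, listed in the order required by the text. */
    enum Step {
        AROMATICITY,
        HYBRIDIZATION,
        ATOM_TYPES,
        IMPLICIT_VALENCE;

        /** Steps that must already have run before this one. */
        Set<Step> prerequisites() {
            switch (this) {
                case HYBRIDIZATION:
                    return EnumSet.of(AROMATICITY);
                case ATOM_TYPES:
                    return EnumSet.of(AROMATICITY, HYBRIDIZATION);
                case IMPLICIT_VALENCE:
                    return EnumSet.of(AROMATICITY, HYBRIDIZATION, ATOM_TYPES);
                default:
                    return EnumSet.noneOf(Step.class);
            }
        }
    }

    private final Set<Step> done = EnumSet.noneOf(Step.class);

    /** Marks a step as done, or fails if one of its prerequisites is missing. */
    void run(Step step) {
        if (!done.containsAll(step.prerequisites())) {
            throw new IllegalStateException(step + " requires " + step.prerequisites());
        }
        done.add(step);
    }

    boolean isDone(Step step) {
        return done.contains(step);
    }

    public static void main(String[] args) {
        TypingPipelineSketch pipeline = new TypingPipelineSketch();
        for (Step step : Step.values()) {
            pipeline.run(step); // the declaration order satisfies every prerequisite
        }
        System.out.println("implicit valence assigned: " + pipeline.isDone(Step.IMPLICIT_VALENCE));
    }
}
```

Running the steps in declaration order succeeds, while requesting ATOM_TYPES on a fresh pipeline throws an IllegalStateException because aromaticity and hybridization are still missing.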
Descriptors can be simple topology descriptors that require no chemical information, or descriptors that require atom types and implicit hydrogens (see e.g. the Section called Fingerprints in Chapter 5). PATTY rules (see the Section called Programmable Atom Typer (PATTY)) can be used for simple atom type descriptors, and all kinds of other models or expert rules can be used for chirality or Z/E-isomerism descriptors.
Table 3-5. Possible special atom type assignments (not implemented)
Assignment | Reference |
---|---|
chirality descriptor | [gt03] |
E/Z descriptor | [gbt02] |
planar three-coordinate nitrogen | calculate vector product of the three neighbors |
nitrogen with aromatic ligand | PATTY SMARTS rule: [#7a] |
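The vector product entry for planar three-coordinate nitrogen in the table above can be read in more than one way. One possible interpretation, sketched here under that assumption, is to build the normal of the plane through the three neighbors with a vector product and measure how far the nitrogen lies from that plane. The class `PlanarNitrogenSketch`, its methods and the tolerance value used in `main` are hypothetical and not part of JOELib.

```java
/** Standalone sketch; names and the tolerance are assumptions, not JOELib API. */
final class PlanarNitrogenSketch {

    static double[] sub(double[] p, double[] q) {
        return new double[] {p[0] - q[0], p[1] - q[1], p[2] - q[2]};
    }

    static double[] cross(double[] u, double[] v) {
        return new double[] {
            u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]
        };
    }

    static double dot(double[] u, double[] v) {
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
    }

    /** Distance of the nitrogen from the plane spanned by its three neighbors a, b, c. */
    static double outOfPlaneDistance(double[] nitrogen, double[] a, double[] b, double[] c) {
        double[] normal = cross(sub(b, a), sub(c, a)); // vector product of two in-plane edges
        double length = Math.sqrt(dot(normal, normal));
        if (length == 0.0) {
            throw new IllegalArgumentException("neighbors are collinear, no plane defined");
        }
        return Math.abs(dot(normal, sub(nitrogen, a))) / length;
    }

    /** True if the nitrogen lies within the given tolerance of the neighbor plane. */
    static boolean isPlanar(double[] nitrogen, double[] a, double[] b, double[] c, double tolerance) {
        return outOfPlaneDistance(nitrogen, a, b, c) <= tolerance;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 0.0, 0.0};
        double[] b = {-0.5, 0.866, 0.0};
        double[] c = {-0.5, -0.866, 0.0};
        double tolerance = 0.1; // assumed cutoff in coordinate units, for illustration only
        System.out.println(isPlanar(new double[] {0, 0, 0}, a, b, c, tolerance));
        System.out.println(isPlanar(new double[] {0, 0, 0.5}, a, b, c, tolerance));
    }
}
```

A suitable tolerance depends on the quality of the 3D coordinates, so the cutoff should be treated as a parameter to tune rather than a fixed constant.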
Every atom property descriptor must have a descriptor documentation file (see the Section called Writing your own descriptor and result classes). Otherwise an error about the missing HTML documentation (generated using DocBook) will occur every time JOELib starts. The XML and RTF description files are optional.