Transformations

The autocorrelation, BCUT, and GTCI descriptors should be used carefully in QSAR models, because the number of usable descriptor values is limited by the smallest molecule in the data set or, more exactly, by the topological distances that occur in that molecule. In these special cases the missing values can be filled with zeros, because large distances do not occur in 'small' lead molecules. Nonetheless, this is not a good strategy: a data set with many zero values will not give really good models, which becomes obvious on a closer look at such data sets. In other words, these molecules cause a potential 'degeneracy problem'.

Moreau-Broto topological autocorrelation

The Moreau-Broto topological autocorrelation is described in [bmv84].

Equation 5-14. Moreau-Broto autocorrelation
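In its commonly cited form, which is assumed here rather than taken from the JOELib sources, the autocorrelation value for a topological distance d can be written as follows, with N the number of atoms and the Kronecker-style term equal to 1 when d_ij = d and 0 otherwise. Summing over unordered pairs i < j is likewise an assumption; some formulations sum over all ordered pairs.

```latex
\[
  AC_d \;=\; \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \delta(d_{ij}, d)\, w_i\, w_j ,
  \qquad
  \delta(d_{ij}, d) =
  \begin{cases}
    1 & \text{if } d_{ij} = d \\
    0 & \text{otherwise}
  \end{cases}
\]
```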

where d_ij is the topological distance between atoms i and j, and w_i and w_j are the atom properties of atoms i and j.
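The following minimal Python sketch illustrates the idea: for every topological distance d it sums the products w_i * w_j over all atom pairs that are exactly d bonds apart. It is not JOELib code. The function names, the restriction to pairs i < j, and the range d = 1..max_distance are assumptions made for illustration. The example also shows why small molecules produce zero values: a three-atom chain has no pairs at distances 3 or 4, so those entries stay at zero.

```python
from collections import deque


def topological_distances(n_atoms, bonds):
    """All-pairs shortest path lengths (counted in bonds) via breadth-first search."""
    neighbors = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dist = [[None] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def autocorrelation(dist, weights, max_distance):
    """Sum w_i * w_j over pairs i < j with d_ij == d, for d = 1..max_distance."""
    ac = [0.0] * max_distance
    n = len(weights)
    for i in range(n):
        for j in range(i + 1, n):
            d = dist[i][j]
            # Unreachable pairs (None) and distances beyond the cutoff are skipped.
            if d is not None and 1 <= d <= max_distance:
                ac[d - 1] += weights[i] * weights[j]
    return ac


# Hypothetical three-atom chain 0-1-2 with made-up atom properties.
chain_dist = topological_distances(3, [(0, 1), (1, 2)])
chain_ac = autocorrelation(chain_dist, [1.0, 2.0, 3.0], max_distance=4)
print(chain_ac)  # -> [8.0, 3.0, 0.0, 0.0]
```

The two trailing zeros are exactly the kind of padded values that the warning at the start of this section refers to.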

Note that the autocorrelation is a special case of the radial distribution function (see the section Radial distribution function (RDF)).

Burden modified eigenvalues

Burden modified eigenvalues (BCUT) are described in [tc00].

Global topological charge

The global topological charge index (GTCI) is described in [tc00].

Radial distribution function (RDF)

The radial distribution function (RDF) [msg99,wfz04b] can be interpreted as the probability distribution of finding an atom in a spherical volume of radius r.

Equation 5-15. Radial distribution function
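In the form usually given for the RDF code, which is assumed here rather than taken from the JOELib sources, the value at radius r can be written as follows, with N the number of atoms. The summation over unordered pairs i < j is part of that assumption.

```latex
\[
  g(r) \;=\; f \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} w_i\, w_j\, e^{-B\,(r - r_{ij})^{2}}
\]
```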

where r_ij is the geometrical distance between atoms i and j, and w_i and w_j are the atom properties of atoms i and j. B is the smoothing parameter for the interatomic distance (the fuzziness of the distance r_ij) and f is the scaling factor.

As B tends to infinity, the RDF code approaches the autocorrelation function (see the section Moreau-Broto topological autocorrelation) and the fuzziness of the distance r_ij vanishes. The RDF code can therefore be treated as a generalized autocorrelation function.
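The limit can be checked numerically with a small Python sketch. It assumes the usual Gaussian form of the RDF code, f times the sum over atom pairs i < j of w_i * w_j * exp(-B * (r - r_ij)^2). The function name rdf_code and the toy molecule are hypothetical and not part of JOELib. The three atoms are placed on a line with unit spacing so that geometrical and topological distances coincide, which is what makes the comparison with the autocorrelation meaningful.

```python
import math


def rdf_code(coords, weights, radii, smoothing=25.0, scale=1.0):
    """Evaluate f * sum_{i<j} w_i * w_j * exp(-B * (r - r_ij)**2) at each radius r."""
    n = len(coords)
    values = []
    for r in radii:
        total = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                r_ij = math.dist(coords[i], coords[j])  # Euclidean distance
                total += weights[i] * weights[j] * math.exp(-smoothing * (r - r_ij) ** 2)
        values.append(scale * total)
    return values


# Hypothetical linear three-atom arrangement with unit spacing and made-up properties.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
weights = [1.0, 2.0, 3.0]
radii = [1.0, 2.0, 3.0]

# A very large B removes the fuzziness: only pairs with r_ij == r contribute.
sharp = rdf_code(coords, weights, radii, smoothing=1e6)
print(sharp)  # -> [8.0, 3.0, 0.0]

# A small B smears each pair over neighbouring radii.
print(rdf_code(coords, weights, radii, smoothing=1.0))
```

With the large smoothing value, the result equals the sum of property products over the pairs at distance 1 (1*2 + 2*3 = 8) and at distance 2 (1*3 = 3), which is the autocorrelation of the same toy molecule.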

The RDF user parameters can be defined in the joelib.properties file; otherwise the default parameters are used:

joelib.desc.types.RadialDistributionFunction.minSphericalVolume = 0.2
joelib.desc.types.RadialDistributionFunction.maxSphericalVolume = 10.0
joelib.desc.types.RadialDistributionFunction.sphericalVolumeResolution = 0.2
joelib.desc.types.RadialDistributionFunction.smoothingFactor = 25
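To make the role of these parameters concrete, the following Python sketch parses such property lines and derives the radii at which the RDF would be evaluated. It is an illustration only. The helper names parse_properties and radius_grid are hypothetical, and reading the minimum, maximum, and resolution values as an inclusive grid of radii r is an assumption, not documented JOELib behavior. Only the last dotted component of each key is used, so the package prefix does not matter.

```python
SAMPLE = """
joelib.desc.types.RadialDistributionFunction.minSphericalVolume = 0.2
joelib.desc.types.RadialDistributionFunction.maxSphericalVolume = 10.0
joelib.desc.types.RadialDistributionFunction.sphericalVolumeResolution = 0.2
joelib.desc.types.RadialDistributionFunction.smoothingFactor = 25
"""


def parse_properties(text):
    """Parse simple 'key = value' lines; lines starting with '#' or '!' are comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        # Keep only the parameter name, e.g. 'smoothingFactor'.
        props[key.strip().rsplit(".", 1)[-1]] = float(value)
    return props


def radius_grid(props):
    """Radii from the minimum to the maximum (both included) in steps of the resolution."""
    r_min = props["minSphericalVolume"]
    r_max = props["maxSphericalVolume"]
    step = props["sphericalVolumeResolution"]
    n_points = int(round((r_max - r_min) / step)) + 1
    # Rounding hides floating-point noise such as 0.6000000000000001.
    return [round(r_min + k * step, 10) for k in range(n_points)]


props = parse_properties(SAMPLE)
grid = radius_grid(props)
print(len(grid), grid[0], grid[-1])  # -> 50 0.2 10.0
```

Under this reading, the listed values give a descriptor vector of 50 RDF values per atom property, with the smoothing parameter B set to 25.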

Optionally, the RDF can be calculated for protonated molecules, but you must make sure that all available atom properties are also calculated with hydrogens. Because this is not the standard behavior, this option should only be used by developers.