From: Egon W. <eg...@us...> - 2005-04-16 18:47:13
On Saturday 16 April 2005 15:15, Christoph Steinbeck wrote:
> There is an important article in the current issue of JCICS, dealing
> with the question on how to assign bond orders and hybridization states
> to chemical graphs based on the 3D geometry of a molecule.
> The article points out methods and potential pitfalls that we should be
> aware of.

Haven't read it yet, but independently we should start to make an overview of what CDK currently has to offer in this area, how these tools should be used, and what pitfalls there are. There is quite some functionality available, but not all of it is used properly, leading to confusing and invalid bug reports. Bad naming/JavaDoc is the main cause here.

So let's see what we have, and document the use of those appropriately.

*** Connectivity Perception

cdk.graph.rebond.RebondTool
perceives connectivity from 3D coordinates based on bond lengths (Miguel's algorithm for Jmol). Does not perceive bond orders!

*** Bond order Perception

cdk.tools.SaturationChecker.saturate()/-newSaturate()
perceives bond orders given a certain connectivity, but does not deal with hybridization states, charges, or atoms marked as aromatic. Most notable failure is nitrogen in aromatic ring systems.

cdk.tools.ValencyChecker.saturate()
perceives bond orders given a certain connectivity, and deals with formal charges, but not with hybridization states.

cdk.tools.ValencyHybridChecker.saturate()
perceives bond orders given a certain connectivity, and deals with hybridization states.

Part of the latter two algorithms is atom type perception. For this CDK has several methods, again not too well documented. This is partly integrated in the SaturationChecker and ValencyHybridChecker.

*** Atom type Perception

cdk.tools.SaturationChecker.couldMatchAtomType()
perceives the atom type against the "structgen" [1] dictionary of atom types by matching the element symbol, and the bond order sum + the number of implicit hydrogens against the atom types.
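To make the matching idea concrete, here is a minimal, self-contained Java sketch of comparing an atom against a dictionary of atom types by element symbol, optionally formal charge, and bond order sum + implicit hydrogens. It is not CDK code: the class and method names, the dictionary entries, and the "must not exceed the maximum" comparison are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are CDK classes or methods.
final class AtomTypeMatchSketch {

    /** One dictionary entry: element symbol, formal charge, maximum bond order sum. */
    static final class AtomType {
        final String symbol;
        final int formalCharge;
        final int maxBondOrderSum;

        AtomType(String symbol, int formalCharge, int maxBondOrderSum) {
            this.symbol = symbol;
            this.formalCharge = formalCharge;
            this.maxBondOrderSum = maxBondOrderSum;
        }
    }

    /** The atom being typed, reduced to the properties used for matching. */
    static final class AtomState {
        final String symbol;
        final int formalCharge;
        final int bondOrderSum;
        final int implicitHydrogens;

        AtomState(String symbol, int formalCharge, int bondOrderSum, int implicitHydrogens) {
            this.symbol = symbol;
            this.formalCharge = formalCharge;
            this.bondOrderSum = bondOrderSum;
            this.implicitHydrogens = implicitHydrogens;
        }
    }

    // Illustrative entries only; not the content of any CDK atom type list.
    static final List<AtomType> DICTIONARY = Arrays.asList(
            new AtomType("C", 0, 4),
            new AtomType("N", 0, 3),
            new AtomType("N", 1, 4),
            new AtomType("O", 0, 2));

    /**
     * Returns true if some dictionary entry has the same element symbol,
     * (optionally) the same formal charge, and a maximum bond order sum that
     * is not exceeded by bond order sum + implicit hydrogens (an assumption).
     */
    static boolean couldMatch(AtomState atom, boolean useFormalCharge) {
        int occupied = atom.bondOrderSum + atom.implicitHydrogens;
        for (AtomType type : DICTIONARY) {
            if (!type.symbol.equals(atom.symbol)) {
                continue; // different element
            }
            if (useFormalCharge && type.formalCharge != atom.formalCharge) {
                continue; // charge-aware matching rejects a charge mismatch
            }
            if (occupied <= type.maxBondOrderSum) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomState neutralN4 = new AtomState("N", 0, 4, 0);
        AtomState cationN4 = new AtomState("N", 1, 4, 0);
        System.out.println(couldMatch(neutralN4, true));  // false
        System.out.println(couldMatch(neutralN4, false)); // true
        System.out.println(couldMatch(cationN4, true));   // true
    }
}
```

In this sketch a neutral nitrogen with a bond order sum of 4 is rejected when the formal charge is compared, but accepted when it is ignored, because the charge-blind comparison also considers the N+ entry. That is the kind of difference that separates matching on element symbol and bond order sum alone from matching that also includes the formal charge.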
Note that SaturationChecker.couldMatchAtomType() does not generally work well with charged atoms; it was never developed for them, and was misused later.

cdk.tools.ValencyChecker.couldMatchAtomType()
perceives the atom type against the "valency" [1] dictionary of atom types by matching the element symbol, the formal charge, and the bond order sum + the number of implicit hydrogens against the atom types. Does not generally work well with hybridization states.

cdk.tools.ValencyHybridChecker.couldMatchAtomType()
perceives the atom type against the "valency/hybridization" [1] dictionary of atom types by matching the element symbol, the formal charge, the hybridization state, and the bond order sum + the number of implicit hydrogens against the atom types.

This functionality is being extracted and incorporated into the recently added cdk.atomtype package. Based on the last method, there is now:

cdk.atomtype.HybridizationMatcher
perceives the atom type against the "valency/hybridization" [1] dictionary of atom types by matching the element symbol, the formal charge, the hybridization state, and the bond order sum + the number of implicit hydrogens against the atom types.

*** Hybridization state Perception

cdk.atomtype.HybridizationStateATMatcher
perceives the hybridization state against the "valency/hybridization" [1] dictionary of atom types by matching the element symbol, the formal charge, and the bond order sum + the number of implicit hydrogens against the atom types. It requires all hydrogen atoms to be given *EXPLICITLY*, and bond orders too!

*** Summary

We might have more than the above. If someone remembers an interesting algorithm, please add it to this list.

So why do we need all this? Answer: because we are lazy chemists who prefer to leave out important information when we start working :)

Notable problem areas related to the above algorithms: XYZ files, SMILES, and aromaticity representation.

Egon

1. http://cdk.sourceforge.net/old_web/atlists.html

-- 
eg...@us...
GPG: 1024D/D6336BA6