From: Peter Murray-R. <pm...@ca...> - 2004-12-04 14:54:49
At 10:00 02/12/2004 -0800, rich apodaca wrote:
>--- Christoph Steinbeck <c.s...@un...> wrote:

I have only come in at the end of this discussion, but feel some points need making:

- atom parity has NO relation to CIP R/S. None. CIP is a poor means of specifying parity to machines, as the algorithms fail for certain classes of molecule.
- atom parity is based on a unique labelling of the ligands. IMO it is extremely important that this is explicit. Implicit semantics (such as "label the first in some sequence 1, then the next 2", etc.) are fragile because of editing, sorting, etc. It is my belief that MDL dropped the use of atom parity - you will see it is a write-only field - i.e. you are not meant to use it.
- SMILES has an implicit ordering for atom parity in which the H may or may not be present.
- CML has explicit labelling of the atoms and a simple algorithm based on the MIF specification (chiral determinant). IMO this is the only reliable, robust approach for atom parity.

HTH

P.

> > In CDK, we intended to use atom parity as given in the MDL CT file definition.
>
>Thanks, Christoph. I thought CDK might be working from that spec.
>
> > For such a use, all you need to code with the parity is the order in which the ligands are *printed* around a stereo center. Consequently, the parity must be used in conjunction with stereo bond symbols.
>
>I'm still a little confused, though. I'm looking at the diagram with the eye on page 42 of the "1997 MDL CT MDL File Formats" document. It seems like with CDK it should be possible to completely specify configuration using AtomParity only (without bond stereo properties).
>
>CT apparently doesn't allow the ordering of atoms around a stereocenter to be separately defined - this is set by the ordering of atoms in the connection table. Consequently, a bond needs to be marked up/down to start with.
>Then the parity value determines the clockwise/counterclockwise rotation of the remaining three ligands when sighted as shown in the diagram.
>
>CDK does allow the ordering around a stereocenter independent of the ordering of Atoms in an AtomContainer. So it should in theory be possible to do away with the bond stereo flag and just use AtomParity.
>
>I think there is a simpler way, and it has nothing to do with the MDL CT File format (well, almost nothing):
>
>Define two enantiomeric tetrahedral templates with vertices labeled "0", "1", "2", and "3". AtomParity = 1 means one template. AtomParity = 2 means the other template. Higher values could refer to racemate, unknown config, etc.
>
>Now use AtomParity.getSurroundingAtoms() to obtain an array atomArray. If this array contains a null element, it is understood to be an implicit hydrogen atom. So the elements of atomArray may look like: [atom, null, atom, atom].
>
>Place each Atom in atomArray into the position on the appropriate tetrahedral template matching its index in atomArray. This will unambiguously regenerate the tetrahedral configuration of the central atom in the AtomParity. If the central Atom is null, the configuration would be interpreted as axial chirality, I imagine.
>
>The main requirement for this system is the accurate specification of the tetrahedral templates. One way to do this would be to say the (R) tetrahedral template would be the one whose 1,2,3 vertices are arranged clockwise when the 0 vertex is behind the viewer. The (S) tetrahedron would be the enantiomer. By specifying both the parity value and the ordering of ligands in AtomParity, the configuration of the stereocenter is fully defined.
>
>So, with this approach it would be possible to accurately specify tetrahedral chirality using only AtomParity. No need for bond stereo flags, 3-D coordinates, or the determination of CIP ligand priorities. Unless I'm missing something...
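
A minimal sketch of the template scheme quoted above, in Python rather than CDK's Java. It is not CDK code: the names `permutation_is_even` and `same_configuration` are hypothetical, ligands are represented by plain labels, and `None` stands in for the implicit hydrogen. It relies only on the geometric fact that re-seating the four ligands on a tetrahedron by an even permutation is a rotation, while an odd permutation gives the enantiomer.

```python
from typing import Hashable, Optional, Sequence

Ligand = Optional[Hashable]  # None stands for the implicit hydrogen


def permutation_is_even(reference: Sequence[Ligand], other: Sequence[Ligand]) -> bool:
    """Return True if `other` is an even permutation of `reference`."""
    if len(reference) != 4 or len(other) != 4 or len(set(reference)) != 4:
        raise ValueError("expected two orderings of four distinct ligands")
    if set(reference) != set(other):
        raise ValueError("both orderings must list the same four ligands")
    # Map each ligand in `other` to its index in `reference`.
    perm = [reference.index(ligand) for ligand in other]
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def same_configuration(
    parity_a: int,
    ligands_a: Sequence[Ligand],
    parity_b: int,
    ligands_b: Sequence[Ligand],
) -> bool:
    """Compare two (parity, ordered ligands) descriptions of one stereocenter.

    Parity 1 and 2 select the two enantiomeric templates; other values
    (racemate, unknown) are outside the scope of this sketch.
    """
    if parity_a not in (1, 2) or parity_b not in (1, 2):
        raise ValueError("only template parities 1 and 2 are handled")
    # Same template + even reordering, or opposite template + odd reordering,
    # both describe the same arrangement in space.
    return (parity_a == parity_b) == permutation_is_even(ligands_a, ligands_b)
```

For example, `same_configuration(1, ["F", "Cl", "Br", None], 2, ["Cl", "F", "Br", None])` is true: swapping two ligands and switching templates cancel out. This is also why the explicit ordering matters: the parity value alone means nothing once the ligand list has been sorted or edited.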
>
>If this makes sense, then it should also make sense that the bond stereo descriptor should be eliminated. I would go even further to say that "AtomParity" should be renamed to "Configuration" to make the intent even clearer.
>
>Of course, any template system has the limitation that features deviating from the template will be undefinable. This is why I favor the approach taken in Octet (http://octet.sf.net) in which no template is used at all and any Configuration and/or Conformation can be consistently specified and queried. The disadvantage is that the generality leads to higher complexity that needs to be managed. Maybe the template approach represents a nice tradeoff for CDK?
>
>best,
>rich
>
>_______________________________________________
>Cdk-devel mailing list
>Cdk...@li...
>https://lists.sourceforge.net/lists/listinfo/cdk-devel

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
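
The "chiral determinant" mentioned in the reply at the top of this message can be sketched as follows. This is an illustration under stated assumptions, not the CML reference implementation: it assumes the determinant is taken over the matrix with rows (1, 1, 1, 1), (x1..x4), (y1..y4), (z1..z4) for four explicitly labelled ligands, which reduces to a scalar triple product. The function names, the `tolerance` parameter, and the mapping of sign to a parity value are hypothetical and should be checked against the CML schema.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def chiral_determinant(points: Sequence[Point]) -> float:
    """Signed volume spanned by four explicitly ordered ligand positions.

    Equal to det([[1, 1, 1, 1], xs, ys, zs]); subtracting the first column
    from the others reduces it to (p2 - p1) . ((p3 - p1) x (p4 - p1)).
    """
    if len(points) != 4:
        raise ValueError("expected exactly four ligand positions")
    p1, p2, p3, p4 = points
    a = tuple(q - p for p, q in zip(p1, p2))
    b = tuple(q - p for p, q in zip(p1, p3))
    c = tuple(q - p for p, q in zip(p1, p4))
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]


def parity_sign(points: Sequence[Point], tolerance: float = 1e-9) -> int:
    """Return +1 or -1 for the two handednesses, 0 if the ligands are coplanar."""
    det = chiral_determinant(points)
    if abs(det) <= tolerance:
        return 0
    return 1 if det > 0 else -1
```

Exchanging any two labelled ligands flips the sign, so the result is only meaningful together with the explicit ligand order, and no CIP priorities are involved at any point.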