From: Peter Murray-R. <pm...@ca...> - 2005-02-26 14:28:46
At 15:21 25/02/2005 +0100, Andreas Maunz wrote:
>Hi all,
>
>is there yet a possibility to create a canonified version of a molecule
>with OB? I did a SMILES canonifier with JOELib, using morgan renumbering
>and this functionality seems to be missing in OB, although there is
>RenumberAtoms (std::vector< OBNodeBase * > &) as a member function in
>the OBMol class. What does it do?

Though this doesn't directly answer your question, we hope that the global approach to canonical numbering will be through the new IUPAC InChI (V 1.0RC), which is being released this week. InChI defines a canonical numbering based on the Morgan algorithm and refinements (wherever possible taken from the published literature).

Some of us in the InChI group met last week in London, and our discussions included the deployment of InChI. The InChI source is freely available, but because it has to act as the reference for the spec it cannot be technically Open (i.e. it can be freely used and integrated into code, but cannot be modified). OpenBabel is in an ideal position to use InChI, as Dmitrii has written it in C++ and we (Billy) have interfaced it to CML.CPP (as included in OB). We would therefore see it as the natural way of canonicalising molecules.

I believe that the Open chemistry community will wish to move towards InChI as the definitive approach for all canonicalisation in their codes. We have found that "unique SMILES" is not precisely defined and that there is no accepted reference implementation that is freely available. For example, a given molecule (e.g. caffeine) has at least 9 representations on the public Web.

There is a slight technical limitation in that InChI is only available in C(++), so Java-based systems such as JUMBO, JOELib and CDK can only access it by wrapping it, either as an external process (Runtime.exec) or through JNI. My current approach, therefore, is to regard CML/InChI/OpenBabel as the most useful way of managing canonicalisation.

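For readers unfamiliar with the Morgan algorithm mentioned above, the following is a minimal Python sketch of its core step, the iterative "extended connectivity" refinement. It is an illustration only: it is not the InChI procedure (which adds further refinements), and it is not OpenBabel or JOELib code. The function name `morgan_classes` and the adjacency-dict input format are assumptions made for this example.

```python
def morgan_classes(adjacency):
    """Return Morgan extended-connectivity values for each atom.

    adjacency: dict mapping an atom index to a list of neighbour indices
    (hypothetical input format chosen for this sketch).
    """
    # Start from the plain connectivity (number of neighbours) of each atom.
    values = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    n_classes = len(set(values.values()))
    while True:
        # Each atom's new value is the sum of its neighbours' current values.
        new_values = {atom: sum(values[n] for n in nbrs)
                      for atom, nbrs in adjacency.items()}
        new_n = len(set(new_values.values()))
        # Stop as soon as another round no longer separates more atoms,
        # and keep the values from the last round that did.
        if new_n <= n_classes:
            return values
        values, n_classes = new_values, new_n


# Heavy-atom skeleton of n-pentane: C0-C1-C2-C3-C4
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pentane_values = morgan_classes(pentane)
```

Atoms that end up with equal values (here the two terminal carbons, and the two carbons next to them) are candidates for symmetry equivalence. A full canonical numbering still needs tie-breaking rules on top of this, which is one place where implementations can differ.
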
We are continuing to increase our range of web services (http://wwmm.ch.cam.ac.uk/gridsphere/gridsphere) and hope to provide canonical numbering very shortly. (The problem is not doing it, but representing the result!)

P.

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069