From: Joerg K. W. <we...@in...> - 2003-02-18 12:53:50
Hello,

> IMO it is essential that the community all uses the same approach to
> atom type definition. (This doesn't necessarily mean a single monolithic
> approach, but an agreement on how to make sure that all systems behave
> in the same way. There should be operational definitions of atoms types
> where possible based on chemical and/or geometrical environment whose
> interpretation should be reproducible on all of the systems that adopt
> this approach. To ensure uniformity I believe that all algorithms should
> - as far as possible - come from external files and be independently of
> programming language.

I've added the workflow of the atom type assignment process to the
XML-DocBook tutorial:
http://www-ra.informatik.uni-tuebingen.de/software/joelib/tutorial/atomtyper.html
It may also be useful for OpenBabel users; only the class names are
different. I hope this will be interesting for all users and developers.

> SMARTS is a useful approach here, and I would
> also suggest XML and a representational language. (If readers are keen,
> I'd be happy to see how CML can be extended to support this).

As already discussed, I would recommend SMARTS expressions, because they
are short and established. The generic CML query language could also be
used, provided that it can express all the basic substructure
functionality used in the text files linked above (aromatic.txt,
atomtype.txt). The query language should have a converter, like:

    SMARTS <-> CML query

Otherwise everybody must rewrite the SMARTS patterns found in the
literature as CML queries.

> There are many fuzzy concepts such as "trigonal N" and "aromatic". These
> can be determined by different criteria (neighbouring atoms, geometry,
> MO calculation, etc.). Where possible the name should suggest the
> method, e.g. "planar three-coordinate nitrogen", "nitrogen with aromatic
> ligand", etc.
> Unless there is clear nomenclature and algorithmic
> definition of concepts, the program systems won't be implemented in the
> same way and we are limited to "CDKaromatic", "BabelHydrogen", etc.

1. I think the basic atom type assignments are more or less the same.
   By the way: thanks, Geoff, for collecting and checking in the test
   data sets. Transparency is provided by using SMARTS with comments.

2. Specialized chirality or Z/E descriptors must always be implemented,
   but that is no transparency problem if these descriptors are well
   documented (e.g. using XML-DocBook) or known in the literature.
   And although I'm no CML expert, I think using something like

       chirality.descriptor.definition="Golbraikh_Tropsha_JCICS_2003"

   should work.

The descriptor calculation facility is not directly available in
OpenBabel. Possibly some dynamic factory design patterns could be
implemented to provide such things, but I'm not totally convinced that
a mechanism analogous to Java reflection can be fully supported in a
C++ environment.

> The first step could be to coordinate the current approaches
> independently of program system. This is a sizeable task, but it's the
> sort of thing that IUPAC is doing for graphic representation of chemical
> structures, See http://www.angelfire.com/sc3/iupacstructure for very
> impressive discussion coordinated by Jonathan Brecher. I think that atom
> types is outwith the scope of this particular activity (I am sure there
> is cross-readership) but ultimately it would be valuable for IUPAC to
> have a view on atom types.

The link doesn't work for me. Who will organize such things?

> P.

Regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Univ. Tuebingen, Computer Architecture, Sand 1, D-72076 Tuebingen, Germany
Tel. (+49/0) 7071 29 78970, Fax (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de