From: Joerg W. <we...@in...> - 2003-02-10 23:08:03
|
Hi all,

I was recently asked the following question: Is there any standard for atom types and a protonation model? Or, from a more practical point of view: is there any multi-molecule file available with a standard definition of atom types AND protonation models for one (or several) pH value(s)?

I didn't know of a really good forum for this question, so I want to address it to your community of developers. Because this is posted to several forums, I would recommend the Openbabel discussion list for responses (if this is o.k.!); otherwise you can send responses to the JOELib or CDK developer mailing list (if this is o.k.!).

If you aren't interested in this problem, you can stop reading here, because I will now explain it in more detail from my point of view.

------------------------------------------------------

Atom types are required for the following tasks:
- file conversion, e.g. SDF -> Sybyl mol2
- protonation models, which are BASED on these atom type assignments
- descriptor calculation methods, which are BASED on these atom type assignments

There are a lot of descriptor calculation programs available, and I think that every program uses its own atom typer facility. As a consequence, different programs produce different results! From a computer scientist's point of view that's absolutely awful!!!

1. An algorithm always produces the same result! That's o.k.
2. Models, like atom typer models and protonation models, are based on expert knowledge, BUT it would be more efficient AND transparent if a set of molecules WITH assigned atom types were available. Any hypothetical 'new' atom typer could be cross-checked against this multi-molecule definition file. If it fails on single atoms, the atom type assignment process (e.g. based on SMARTS, as in Openbabel/JOELib) can be corrected, and any side effects of changing the definition can be double-checked again, and so on.

I used the Openbabel definitions and corrected one or two entries, but I didn't check all definitions. That will be hard work, and all previous developers of Babel, OELib, and Openbabel have invested a lot of time in developing these atom types and definitions. I'm busy with other main tasks, but nonetheless it would be really useful:

1. to have a standard, or some standard files.
2. if any wrongly assigned atom type were sent to a mailing list, where these files would be stored in a transparent and useful way.

Now the problem: I think this problem is too big for a handful of developers:
- without knowing the standard
- without knowing how to represent the standard
- without knowing who will be able to collect the definition files
- without having a transparent method to debug the atom type assignment process

Solution suggestions:

1. I would suggest a special mailing list in the OpenBabel or JOELib project that addresses only atom type and protonation model assignments.
2. The assignments should be defined in a transparent and reproducible way. Any suggestions? Which are practical? Technical documentation?
3. I think it will definitely be possible to extend Openbabel and JOELib with a transparent debugging mode for the atom type assignments.

That's an open discussion: feel free to send constructive comments and fruitful suggestions.

Regards, Joerg

Dipl. Chem. Joerg K. Wegner
Univ. Tuebingen, Computer Architecture, Sand 1, D-72076 Tuebingen, Germany
Tel. (+49/0) 7071 29 78970, Fax (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
|
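The cross-check and the "transparent debugging mode" proposed in the message above could look roughly like the following minimal Python sketch. Everything in it is hypothetical: the `Atom` and `Rule` types, the rule list, the type names, and the reference data are made up for illustration, and the lambda predicates merely stand in for SMARTS matches. It is not how Openbabel or JOELib assign types.

```python
# Hypothetical sketch: names, rules, and reference data are illustrative only.
from typing import Callable, NamedTuple


class Atom(NamedTuple):
    element: str          # element symbol, e.g. "C"
    neighbors: int        # number of bonded atoms, hydrogens included
    aromatic: bool = False


class Rule(NamedTuple):
    name: str                          # label reported in the debug output
    matches: Callable[[Atom], bool]    # stand-in for a SMARTS match
    atom_type: str


# Ordered rule list; in this sketch a later matching rule overrides an earlier one.
RULES = [
    Rule("any carbon", lambda a: a.element == "C", "C3"),
    Rule("carbon, 3 neighbors", lambda a: a.element == "C" and a.neighbors == 3, "C2"),
    Rule("aromatic carbon", lambda a: a.element == "C" and a.aromatic, "Car"),
    Rule("any oxygen", lambda a: a.element == "O", "O3"),
]


def assign_type(atom):
    """Return (atom_type, rule_name) so that every assignment can be traced."""
    result = ("?", "no rule matched")
    for rule in RULES:
        if rule.matches(atom):
            result = (rule.atom_type, rule.name)
    return result


def cross_check(reference):
    """Compare assigned types with the reference set; return the mismatches."""
    mismatches = []
    for mol_name, atoms in reference.items():
        for index, (atom, expected) in enumerate(atoms):
            got, rule_name = assign_type(atom)
            if got != expected:
                mismatches.append((mol_name, index, expected, got, rule_name))
    return mismatches


# Tiny made-up reference set: molecule name -> [(atom, expected type), ...]
REFERENCE = {
    "ethane": [(Atom("C", 4), "C3"), (Atom("C", 4), "C3")],
    "formaldehyde": [(Atom("C", 3), "C2"), (Atom("O", 1), "O2")],
}

for mol, idx, expected, got, rule in cross_check(REFERENCE):
    print(f"{mol} atom {idx}: expected {expected}, got {got} (rule: {rule})")
```

Because each mismatch names the rule that produced the wrong type, a corrected rule can be re-run against the whole reference set to catch side effects on other molecules.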
From: Geoff H. <hut...@ch...> - 2003-02-17 05:21:44
|
Apologies that I haven't responded to you sooner, Joerg! I have an oral exam on Thursday and until a few days ago, I was working on the manuscript portion of it.

> different forums, i would recommend the Openbabel
> discussion list for a response (if this is o.k. !),

I agree that we probably need a list for discussing such inter-program "standards," but I certainly don't mind the Open Babel list being used if other people follow it. Or, if you like, I'd be glad to start a new list.

It would probably be worthwhile to talk about pseudo-code and algorithm development as well. Java and C++ are not so different and it would be silly to duplicate efforts!

There's also the FSAtom project: http://www.fsatom.org/ which looks similar to what we're talking about.

There's really a lot more than just atom typing that should be "hammered out," as there are some truly basic chemical concepts that are duplicated!

* periodic table
* isotopes (and masses--or else these are ignored completely)
* atom types
  - type translations from one program's syntax to another's
  - type rules
* test molecules (or other test sets?)
* residue info
* chemical MIME types, extensions, etc.
...

Open Babel loads most of these from text files, which has enabled workarounds for some bugs. They're all under the GPL and I'd be glad to set aside a separate CVS repository for just these data files.

> 2. The assignments should be defined in a transparent
> and reproducible way. Any suggestion? Which
> are practical ? Technical docu ?

At least in Open Babel, the typing is either done by the program that generated the file, or by a SMARTS pattern. Certainly the latter seems pretty transparent and reproducible to me, assuming that everyone has bug-free SMARTS code, of course. :-)

Now I can't say that external type assignments are necessarily reproducible or transparent, but we can't easily solve that. (Until everyone uses open file formats...) OTOH, Open Babel has a pretty complete type translation table. It's not bug free, but it's getting much better with continued testing (and roundtripping).

--
-Geoff Hutchison <hut...@ch...>
Marks/Ratner Groups (847) 491-3295
Northwestern Chemistry <http://www.chem.northwestern.edu>
|
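The kind of roundtrip test mentioned in the message above can be illustrated with a small Python sketch. The table below is invented for the example and is not Open Babel's translation data; the `Ox` entry in particular is a made-up type, added only to show how a lossy mapping surfaces in a roundtrip.

```python
# Hypothetical sketch: the table entries are illustrative, not Open Babel's data.
INTERNAL_TO_SYBYL = {
    "C3": "C.3",
    "C2": "C.2",
    "Car": "C.ar",
    "O3": "O.3",
    "Ox": "O.3",   # deliberately lossy: two internal types share one target type
}


def invert(table):
    """Build the reverse table; the first entry seen for a target type is kept."""
    reverse = {}
    for source, target in table.items():
        reverse.setdefault(target, source)
    return reverse


def roundtrip_failures(table):
    """Return the source types that do not survive source -> target -> source."""
    reverse = invert(table)
    return [src for src, tgt in table.items() if reverse[tgt] != src]


print(roundtrip_failures(INTERNAL_TO_SYBYL))
```

Any type reported here either needs a more specific target type or has to be accepted as a known, documented loss in the conversion.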
From: Joerg K. W. <we...@in...> - 2003-02-17 07:37:01
|
Hello Geoff,

thanks! Have a nice exam! :-)

> There's also the FSAtom project: http://www.fsatom.org/ which looks
> similar to what we're talking about.

I will have a look ... I know that Ghemical uses OELib (OpenBabel), too.

> Open Babel loads most of these from text files, which has enabled
> workarounds for some bugs. They're all under the GPL and I'd be glad to
> set aside a separate CVS repository for just these data files.

Perfect, that's what I thought!!!

>> 2. The assignments should be defined in a transparent
>> and reproducible way. Any suggestion? Which
>> are practical ? Technical docu ?
>
> At least in Open Babel, the typing is either done by the program that
> generated the file, or by a SMARTS pattern. Certainly the latter seems
> pretty transparent and reproducible to me, assuming that everyone has
> bug-free SMARTS code, of course. :-)
>
> Now I can't say that external type assignments are necessarily
> reproducible or transparent, but we can't easily solve that. (Until
> everyone uses open file formats...) OTOH, Open Babel has a pretty
> complete type translation table. It's not bug free, but it's getting
> much better with continued testing (and roundtripping).

I agree, I think SMARTS will be transparent enough!!! Thanks for your efforts!

Regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Univ. Tuebingen, Computer Architecture, Sand 1, D-72076 Tuebingen, Germany
Tel. (+49/0) 7071 29 78970, Fax (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
|
From: - 2003-02-17 11:09:39
|
On Mon, Feb 17, 2003 at 08:38:55AM +0100, Joerg K. Wegner wrote:
> Have a nice exam !:-)

Yeah!

> >There's also the FSAtom project: http://www.fsatom.org/ which looks
> >similar to what we're talking about.
> I will have a look ... i know that Ghemical uses OElib(OpenBabel) also.

xdrawchem/chemtool do, too. While chemtool calls the obabel executable via a pipe or something, xdrawchem includes the whole source tree. It sure would be nice to have a stable libopenbabel for 3rd parties to use, but with the current rate of development, we're still far from that, *kickmyself*

> >Open Babel loads most of these from text files, which has enabled
> >workarounds for some bugs. They're all under the GPL and I'd be glad to
> >set aside a separate CVS repository for just these data files.
> Perfect, that's what i thought !!!

I am not totally convinced that those text files are appropriate for a library. You are expected to be able to install different versions of a library next to each other. Two thoughts on this:

1. Install the text files in a per-version directory like ${sharedir}/openbabel/${version}, expanding to /usr/share/openbabel/2.00 on GNU systems or something.

2. Have a library function that lets the calling program override all or specific files with locally provided ones, e.g. /usr/share/xdrawchem/openbabel/extable.txt. Most programs don't need all the extensions babel provides, and their users might get confused by all the options.

I didn't check the code, perhaps 2. is already there...

Michael
|
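The two suggestions in the message above amount to a small path-resolution policy, sketched here in Python. The function names (`data_dir`, `override_file`, `resolve`) are assumptions made for illustration and do not correspond to any existing Open Babel API; `types.txt` is likewise only an example file name.

```python
# Hypothetical sketch: function names and file names are assumptions.
from pathlib import PurePosixPath

_overrides = {}  # file name -> path supplied by the calling program


def data_dir(share_dir, version):
    """Suggestion 1: a per-version directory, e.g. /usr/share/openbabel/2.00."""
    return PurePosixPath(share_dir) / "openbabel" / version


def override_file(name, path):
    """Suggestion 2: let the calling program replace one data file."""
    _overrides[name] = PurePosixPath(path)


def resolve(name, share_dir, version):
    """An explicit override wins; otherwise use the versioned default."""
    if name in _overrides:
        return _overrides[name]
    return data_dir(share_dir, version) / name


# A program such as xdrawchem ships its own, shorter extension table.
override_file("extable.txt", "/usr/share/xdrawchem/openbabel/extable.txt")
print(resolve("extable.txt", "/usr/share", "2.00"))
print(resolve("types.txt", "/usr/share", "2.00"))
```

Keeping the version in the directory name lets two library versions coexist, while the override registry stays per process and never touches the shared files.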
From: Joerg K. W. <we...@in...> - 2003-02-17 11:52:04
|
Hello,

>>>Open Babel loads most of these from text files, which has enabled
>>>workarounds for some bugs. They're all under the GPL and I'd be glad to
>>>set aside a separate CVS repository for just these data files.
>>
>>Perfect, that's what i thought !!!
>
> I am not totally convinced that those text-files are appropriate for a
> library. You are expected to be able to install different versions of a
> library next to the other. Two thoughts on this:
>
> 1. Install the text-files in a per-version directory like
> ${sharedir}/openbabel/${version}, expanding to
> /usr/share/openbabel/2.00 on GNU systems or something.
>
> 2. Have a library function to override all or specific files with a
> locally provided one by the calling program, e.g.
> /usr/share/xdrawchem/openbabel/extable.txt. Most programs don't need all
> the extensions babel provides and their users might get confused at all
> the options.
>
> I didn't check the code, perhaps 2. is already there...

C++: I would agree. If I remember correctly, OpenBabel (OELib) has a default binary representation of the definitions and loads the text file definitions instead if they are available.

JAVA: I would agree. Loading scheme in JOELib:

default:
1. load the text definition file from a resource (jar, zip) library
2. load the text definition from a file in the classpath.

user specific (for experts, who will not be confused):
define other text definition files in joelib.properties

For both tasks there is the class wsi/ra/tool/ResourceLoader.

If you want to change your definitions, you simply have to update the text file definitions and make sure that they are available in the current classpath.

Regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Univ. Tuebingen, Computer Architecture, Sand 1, D-72076 Tuebingen, Germany
Tel. (+49/0) 7071 29 78970, Fax (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
|
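A loading scheme of the kind described in the message above can be modelled as an ordered chain of sources, where the first source that has the file wins. The Python sketch below is only an analogy: it is not JOELib's `ResourceLoader`, all function names are hypothetical, and the precedence between the bundled resource and a file on the search path is left to the caller because the message does not spell it out.

```python
# Hypothetical sketch: not JOELib's ResourceLoader; all names are assumptions.
from pathlib import Path


def from_mapping(bundle):
    """Loader for in-memory defaults (stand-in for a jar/zip resource)."""
    return lambda name: bundle.get(name)


def from_directory(directory):
    """Loader for a plain file in one directory (stand-in for the classpath)."""
    def load(name):
        candidate = Path(directory) / name
        return candidate.read_text() if candidate.is_file() else None
    return load


def load_definition(name, sources):
    """Try each (label, loader) pair in order; return (text, label) of the first hit."""
    for label, loader in sources:
        text = loader(name)
        if text is not None:
            return text, label
    raise FileNotFoundError(name)


# Assumed precedence for this demo: bundled resource first, then the search path.
bundled = {"atomtypes.txt": "# bundled default definitions\n"}
sources = [
    ("bundled resource", from_mapping(bundled)),
    ("file on search path", from_directory(".")),
]
text, origin = load_definition("atomtypes.txt", sources)
print(origin)
```

A user-specific definition file, as configured through a properties file, would simply be one more loader placed at the front of the list.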