[octet-devel] PropertyAcceptor
From: rich a. <che...@ya...> - 2004-05-23 21:33:43
Hello All,

Joerg, many thanks for putting the "octetInterface.zip" package together. I had a chance to look through it and have some comments and questions.

I very much agree that QSAR will need some mechanism for assigning properties to (and retrieving them from) atoms, molecules, bonding systems, and other model-level objects. However, I do not agree that the place to put this functionality is in the model-level objects themselves. Let me explain:

To me, the model-level classes Molecule, Atom, AtomPair, and BondingSystem are intended to model, in an approximate way, the physical entities they represent. Of course, the model is not intended to represent "reality", but more something along the lines of a cartoon. The bottom line is that I want developers who know about chemistry and molecular structure to feel at home as much as possible with the Octet API. The inclusion of methods in model-level objects that allow for the storage/retrieval of arbitrary data obscures this intent to some extent because physical molecules don't store generic data - people store generic data about molecules, and that data has different meanings depending on context.

Aside from design aesthetics (which, of course, are highly subjective :-) ), there are practical issues as well. One of the considerations that went into designing Octet is whether model-level objects should be mutable or not. Octet provides read-only interface definitions because this enables clients to make some very useful simplifying assumptions about Molecule/Atom/AtomPair/BondingSystem behavior. If model-level classes implement the PropertyAcceptor interface, then client A is free to change the properties of Molecule X without client B knowing about the change. To build a robust system, client B has to assume Molecule X can change at any time, and so has to make a defensive copy of Molecule X, which can be a substantial performance penalty (and an easy thing to forget). This needn't be limited to using threads, either.
Any time a client needs to hang onto an instance of Molecule (for example, in a Hashtable), this problem will rear its ugly head.

Additionally, inheriting the PropertyAcceptor interface would place an additional burden on implementors of the QSAR interfaces. I'm willing to do this, but only as a last resort.

What do you think of an alternative approach? Let's say I'm writing a TPSA implementation (TPSACalculator) for the QSAR project. I need to store a property, for example, indicating whether an atom is an sp2 nitrogen. The approach in octetInterface.zip, if I understand correctly, would be to store that property in the Atom itself. What if, instead, my TPSACalculator simply maintained a boolean array called sp2nitrogen with a size equal to the number of atoms in a given molecule? My calculator would simply iterate over all Atoms in the Molecule, checking for sp2 nitrogen. When one is found, let's say at index n, sp2nitrogen[n] is set to true. At the end of this process I have a boolean array whose elements tell me the indices of the sp2-nitrogen atoms in the Molecule. Is this not the same result as Atom inheriting the PropertyAcceptor interface and using atom.setProperty()?

This approach is flexible in that any data structure supported by Java (collections, arrays, 3rd party libraries) can be used. And it does this without changing the Atom interface or requiring a defensive copy by TPSACalculator.

So, the approach I'm suggesting is to rigorously separate data storage from model-level interface definition. The downside of this approach is that it requires more work by clients to maintain their data. There are many ways to address this issue. One solution would be to define a special data storage class whose API is designed to associate generic data with model-level objects. Another, more integrated, approach would be to use the Decorator Pattern (see: http://c2.com/cgi/wiki?DecoratorPattern).
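
To make the sp2nitrogen side-array idea above concrete, here is a rough sketch. It is not Octet code: the Atom and Molecule interfaces below are cut-down stand-ins with guessed method names, flagSp2Nitrogens is a made-up method name, and the actual sp2-nitrogen test is left to a caller-supplied predicate because it depends on bonding information not modeled here.

```java
import java.util.function.Predicate;

// Hypothetical, cut-down stand-ins for Octet's read-only model interfaces.
// The real Atom and Molecule interfaces may look different.
interface Atom {
    String getSymbol();
}

interface Molecule {
    int countAtoms();
    Atom getAtom(int index);
}

// The calculator owns its per-atom data; Atom and Molecule are never modified.
final class TPSACalculator {
    // Chemistry test supplied by the caller (assumption: a real implementation
    // would inspect bonding, which this sketch does not model).
    private final Predicate<Atom> isSp2Nitrogen;

    TPSACalculator(Predicate<Atom> isSp2Nitrogen) {
        this.isSp2Nitrogen = isSp2Nitrogen;
    }

    // Returns an array parallel to the molecule's atom indices:
    // result[n] is true when atom n satisfies the predicate.
    boolean[] flagSp2Nitrogens(Molecule mol) {
        boolean[] sp2nitrogen = new boolean[mol.countAtoms()];
        for (int n = 0; n < sp2nitrogen.length; n++) {
            sp2nitrogen[n] = isSp2Nitrogen.test(mol.getAtom(n));
        }
        return sp2nitrogen;
    }
}
```

The array lives and dies with the calculation, so no other client can observe or change it, and the Molecule can be shared freely without a defensive copy.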
Structure (http://structure.sourceforge.net) uses the Decorator approach to associate 2-D atom coordinates with Atoms. In a nutshell,

- net.sourceforge.octet.extension.MoleculeDecorator is a concrete convenience class that implements the Molecule interface by passing all method calls through to a private instance of the Molecule it decorates.
- net.sourceforge.structure.molecule.BasicMolecule2D extends MoleculeDecorator, adding methods for 2-D coordinate manipulation.

Clients then go:

    Molecule mol = ...; // get the molecule from somewhere
    Molecule2D mol2d = new Molecule2D(mol);
    mol2d.move(mol2d.getAtom(0), 0.0, 0.0);
    // etc...

Getting back to octetInterface... If we absolutely need to have a Molecule that can store properties, how about defining a class like:

    public class PropertyMolecule
            extends net.sourceforge.octet.extensions.MoleculeDecorator
            implements PropertyAcceptor
    {
        // implement the PropertyAcceptor interface
    }

(Aside: Maybe a set of methods like PropertyAcceptor.setProperty(Atom atom, PropertyKey key) would be useful here. Also, to avoid the defensive copy situation above, PropertyAcceptor could then define PropertyAcceptor.addPropertiesListener().)

Now this frees implementors of the Molecule interface from having to support the PropertyAcceptor interface. It also fosters a "pay-as-you-go" approach to interface complexity and overhead. If a client needs this functionality, then they have to pay for it, but we don't compel clients to pay for what they don't need. We can get ease of use, simplicity, and high performance at the same time.

cheers,
rich

"Joerg K. Wegner" <we...@in...> wrote:

Hi all,

so here is the first version of the 'combined' octet interface: (oops, our server seems to be down, please try again in a few hours)
http://www-ra.informatik.uni-tuebingen.de/mitarb/wegner/tmp/octet/octetInterface.zip

TECHNICAL:
It has the following structure:

    cdk
    joelib
    octet
    octet4CDK
    octet4JOELib
    octetImplementations

CDK, JOELib, and Octet should work on their own via 'ant compile'. octet4CDK and octet4JOELib require Octet/CDK or Octet/JOELib, which they pick up on their own. octetImplementations requires Octet/CDK/JOELib and contains new implementations for Octet. So this part of the project combines all efforts !!!

DESIGN:
I've added to octet/properties:

    AssignmentFactory.java
    DataFactory.java
    Property.java
    PropertyAcceptor.java
    PropertyKey.java
    PropertyKeyFactory.java
    PropertyKeySet.java
    PropertySet.java
    PropertyVersion.java

So all objects which can accept properties can extend PropertyAcceptor. I've added it to Atom and Molecule, so the CDK interface will not work any more, and neither will the basic implementations, which are now part of octetImplementations (I've moved them from octet). The octet4JOELib part contains no functionality so far. I will also add/change the substructure parts to Octet, but not this week :-)

So, Rich AND Egon, if you agree to this structure, then we should add this first version to Octet-CVS; otherwise change all the things you like and send me a link to download the changed version. (I hate CVS and its directory problem.)

Please read the API documentation, so I need not explain things twice in this e-mail, and please complain about anything you do not like in the current design. I've used a really widely usable Object storing mechanism, and am requiring property-version methods.

Also we should remove things like these in Atom:

    countElectrons
    getLabel

because these are simply two Properties for an Atom, so we are working on attributed molecular graphs.
To get the values, these definitions should be part of the PropertyKeyFactory, adding methods like:

    public PropertyKey getElectronsKey();
    public PropertyKey getAtomLabelKey(); // what's meant by label, isn't
                                          // this too general here?

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
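
For reference, the PropertyMolecule decorator suggested in the reply at the top of this thread might look roughly like the sketch below. Everything here is an assumption made for illustration: the Atom, Molecule, and PropertyAcceptor interfaces are cut-down stand-ins (the real PropertyAcceptor in octetInterface.zip uses PropertyKey objects rather than plain String keys), and the method names are guesses. The only point is to show the wrapped Molecule staying untouched while properties live in the decorator.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical, cut-down stand-ins; the real Octet interfaces may differ.
interface Atom {
    String getSymbol();
}

interface Molecule {
    int countAtoms();
    Atom getAtom(int index);
}

// Hypothetical simplification of the PropertyAcceptor idea:
// String keys stand in for PropertyKey.
interface PropertyAcceptor {
    void setProperty(Atom atom, String key, Object value);
    Object getProperty(Atom atom, String key);
}

// Passes every Molecule call through to the wrapped instance.
class MoleculeDecorator implements Molecule {
    private final Molecule delegate;

    MoleculeDecorator(Molecule delegate) {
        this.delegate = delegate;
    }

    @Override public int countAtoms() { return delegate.countAtoms(); }
    @Override public Atom getAtom(int index) { return delegate.getAtom(index); }
}

// Adds property storage on top of the decorator; only clients that need
// properties pay for them.
class PropertyMolecule extends MoleculeDecorator implements PropertyAcceptor {
    // Keyed by atom identity, since this sketch assumes nothing about Atom.equals().
    private final Map<Atom, Map<String, Object>> properties = new IdentityHashMap<>();

    PropertyMolecule(Molecule delegate) {
        super(delegate);
    }

    @Override
    public void setProperty(Atom atom, String key, Object value) {
        properties.computeIfAbsent(atom, a -> new HashMap<>()).put(key, value);
    }

    @Override
    public Object getProperty(Atom atom, String key) {
        Map<String, Object> forAtom = properties.get(atom);
        return forAtom == null ? null : forAtom.get(key); // null when nothing is stored
    }
}
```

Because the properties sit in the PropertyMolecule and not in the Molecule it wraps, implementors of Molecule do not have to support property storage at all, which is the "pay-as-you-go" trade-off described in the reply.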