octet-devel Mailing List for Octet (Page 3)
Status: Alpha
Brought to you by:
r_apodaca
From: Joerg K. W. <we...@in...> - 2004-07-26 12:22:50
|
Hi Rich,

> * Molecule implements AtomGraph. In the near future, BondingSystem should also implement AtomGraph to enable traversal/query with the same tools used for Molecules (any objections to this?)

Good.

> * Traversers traverse the graph structure of any AtomGraph. Traversers are low-level components that are helpful for building higher-level functionality. Currently two types of Traverser are available: DepthFirstTraverser and CycleTraverser. Both use a system of Handlers and Controllers - Handlers for handling events generated at various stages of a traversal algorithm and Controllers for exercising limited control over the algorithm itself. This system borrows from SAX's ContentHandler idea. HanserCycleTraverser is an implementation of CycleTraverser that uses Hanser's algorithm for finding the set of all cycles of an AtomGraph using collapsing Path-Graphs.

CycleTraverser should be used through an interface, so that we can switch the implementation. If nothing is specified, a default traverser should be used. The traverser should also have an ID and a version number, analogous to descriptors.

> * MoleculeComparator compares two AtomGraphs for isomorphism, but without comparing atom/bonding properties. UllmanComparator implements MoleculeComparator by using Ullman's subgraph isomorphism algorithm. Like Traverser, MoleculeComparator uses a system of Handlers and Controllers for fine-grained control. It should be possible to use this system to create additional isomorphism algorithms implementing MoleculeComparator.

Isn't this only a formulation problem? Can't we use a boolean method compareNode(LabelSet) which uses a set of labels to check isomorphism?

> * QueryBuilder enables clients to build a molecular query using the same process that is used for building a Molecule with MoleculeBuilder. In fact, QueryBuilder extends MoleculeBuilder and can be used in many contexts calling for a MoleculeBuilder. QueryBuilder is designed for building queries that are based on a template molecule with constraints placed on individual Atoms with AtomQuery.

Can 'pharmacophores' also be treated with this approach? That is, are combined features allowed, e.g. a carboxylic acid group combined into a single feature, with a distance to all other features?

> * SmartsQueryFactory is in the early stages, but is intended to simplify the process of using QueryBuilder by enabling clients to use SMARTS Atomic Primitive strings as keys to obtain a fully functional AtomQuery. Although this isn't exactly a SMARTS parser, it isn't that far from being one given Octet's SmilesReader. Currently only the wildcard Atomic Primitive ("*") is supported, but others should be appearing soon. The approach here has some elements in common with that of CDK's growing SMARTS support, but there are also some interesting differences.

Same as above: an atom-based (not feature-based) compareNode(LabelSet) method, where the LabelSet is what I would call the chemical kernel atom labelling set.

> Looking a little further down the road for QSAR, what are people's thoughts on a framework for molecular descriptors? Of course, there are hundreds of descriptors, and of course we all have our ideas on what a particular descriptor means or doesn't mean. What I'm actually wondering about is what a descriptor facility in QSAR would look and feel like. I've been looking at JOELib's descriptor framework, which has some reasonable concepts. From what I can tell, there are two basic kinds of descriptor: a "holistic" descriptor that is a single value (e.g. TPSA) and which is primitive-like, and everything else, which tends to be higher-resolution in nature (e.g. Topological Torsion) and more object-like. Are there any other ideas?

With respect to queries I would prefer the object approach, so we can use:

    result=molecule.calculate("XYZ")

or, as in JOELib:

    result1=calculator.calculate(mol1,"XYZ", Properties)
    result2=calculator.calculate(mol2,"XYZ", Properties)

For matching or similarity we can then use:

    // inherited from Comparator in Java API
    // applicable for Euclidean, Tanimoto, atom-pairs
    similarity=metricThatILike(result1,result2, Properties);

For simple single-value descriptors it would also be interesting to have:

    similarity=metricThatILike(ResultSet1,ResultSet2, Properties);

This also keeps pharmacophores and multiple graph isomorphism in view, not only pair-wise matching. So a query is, from my standpoint, a kind of similarity metric which can only return 0 and 1. Sometimes, as in SMARTS matching, we are only interested in subgraph isomorphism:

    result1=calculator.calculate(mol1,"XYZ", LabelSet)
    result2=calculator.calculate(mol2,"XYZ", LabelSet)
    // only applicable for this specific calculator
    // can be used for maximum common substructure search (MCS)
    matchings=matchingsThatILike(result1,result2, Properties);

So, for SMARTS matching we also need:

    matchings=matchingsThatILike(query1,result2, Properties);

For pharmacophores (2D/3D/shape) we can also use this approach, because the representation used for the similarity/matching is the relevant point:

    matchings=matchingsThatILike(query1,result2, Properties);

or

    similarity=metricThatILike(result1,result2, Properties);

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
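The "calculate, then compare" idea in the message above can be sketched in Java. This is only an illustration under stated assumptions: the descriptor result is taken to be a bit fingerprint held in a java.util.BitSet, and the names SimilaritySketch, tanimoto and screen are hypothetical, not part of JOELib or Octet. The screen method is a bit-subset check that returns only 0 or 1; it is a cheap pre-filter, not a subgraph isomorphism test.

```java
import java.util.BitSet;

// Hypothetical helper; neither JOELib nor Octet is known to define this class.
final class SimilaritySketch {

    // Tanimoto coefficient: |A and B| / |A or B|, defined here as 1.0 for two empty sets.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        return union == 0 ? 1.0 : (double) both.cardinality() / union;
    }

    // A "query" viewed as a metric that can only return 0 or 1:
    // 1 if every bit set in the query is also set in the target.
    static int screen(BitSet query, BitSet target) {
        BitSet both = (BitSet) query.clone();
        both.and(target);
        return both.equals(query) ? 1 : 0;
    }
}
```

Keeping both functions over the same result type mirrors the point made above: similarity and matching differ only in the metric applied to an already calculated representation.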
From: rich a. <che...@ya...> - 2004-07-26 09:00:32
|
Hello All,

With some interesting news about CDK and JOELib recently reported, I thought an update on Octet might be useful, especially with respect to moving forward on QSAR.

Developing a feature-rich and extensible molecular query/traversal facility is the main short-term goal for Octet. I expect this process to take another month or so of heavy development (so what follows may change ;-)).

The last few releases, and especially the most recent one (0.3.0), have focused on laying the foundation for developing scalable molecular query capability. The CVS contains additional work that will be released as v0.3.1 shortly. Major enhancements include:

* Molecule implements AtomGraph. In the near future, BondingSystem should also implement AtomGraph to enable traversal/query with the same tools used for Molecules (any objections to this?)

* Traversers traverse the graph structure of any AtomGraph. Traversers are low-level components that are helpful for building higher-level functionality. Currently two types of Traverser are available: DepthFirstTraverser and CycleTraverser. Both use a system of Handlers and Controllers - Handlers for handling events generated at various stages of a traversal algorithm and Controllers for exercising limited control over the algorithm itself. This system borrows from SAX's ContentHandler idea. HanserCycleTraverser is an implementation of CycleTraverser that uses Hanser's algorithm for finding the set of all cycles of an AtomGraph using collapsing Path-Graphs.

* MoleculeComparator compares two AtomGraphs for isomorphism, but without comparing atom/bonding properties. UllmanComparator implements MoleculeComparator by using Ullman's subgraph isomorphism algorithm. Like Traverser, MoleculeComparator uses a system of Handlers and Controllers for fine-grained control. It should be possible to use this system to create additional isomorphism algorithms implementing MoleculeComparator.

* QueryBuilder enables clients to build a molecular query using the same process that is used for building a Molecule with MoleculeBuilder. In fact, QueryBuilder extends MoleculeBuilder and can be used in many contexts calling for a MoleculeBuilder. QueryBuilder is designed for building queries that are based on a template molecule with constraints placed on individual Atoms with AtomQuery.

* SmartsQueryFactory is in the early stages, but is intended to simplify the process of using QueryBuilder by enabling clients to use SMARTS Atomic Primitive strings as keys to obtain a fully functional AtomQuery. Although this isn't exactly a SMARTS parser, it isn't that far from being one given Octet's SmilesReader. Currently only the wildcard Atomic Primitive ("*") is supported, but others should be appearing soon. The approach here has some elements in common with that of CDK's growing SMARTS support, but there are also some interesting differences.

Additional enhancements will almost certainly include more types of molecular queries: queries that match only; queries that count matches; queries that highlight matching Atoms, AtomPairs, or BondingSystems of an input molecule.

Another near-term goal is the detection of aromaticity, although I doubt this will be straightforward. I'd also like to apply the above query/traversal functionality to the development of a Topological Polar Surface Area calculator as a proof of concept.

Looking a little further down the road for QSAR, what are people's thoughts on a framework for molecular descriptors? Of course, there are hundreds of descriptors, and of course we all have our ideas on what a particular descriptor means or doesn't mean. What I'm actually wondering about is what a descriptor facility in QSAR would look and feel like. I've been looking at JOELib's descriptor framework, which has some reasonable concepts. From what I can tell, there are two basic kinds of descriptor: a "holistic" descriptor that is a single value (e.g. TPSA) and which is primitive-like, and everything else, which tends to be higher-resolution in nature (e.g. Topological Torsion) and more object-like. Are there any other ideas?

rich
|
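The Handler/Controller arrangement described in the update above resembles SAX event callbacks, and a rough Java sketch may make it concrete. Everything here is an assumption made for illustration: atoms are plain integer indices, the graph is an adjacency list, and the names TraversalHandler, TraversalController and DepthFirstSketch are hypothetical rather than Octet's actual API.

```java
import java.util.List;

// Hypothetical: gives a handler limited control over the running algorithm.
interface TraversalController {
    void stop();
}

// Hypothetical: receives an event each time the traversal reaches a new atom.
interface TraversalHandler {
    void atomReached(int atom, TraversalController controller);
}

final class DepthFirstSketch {
    private boolean stopped;

    // Depth-first walk from 'start'; adjacency.get(i) lists the neighbors of atom i.
    void traverse(List<List<Integer>> adjacency, int start, TraversalHandler handler) {
        stopped = false;
        TraversalController controller = () -> stopped = true;
        visit(adjacency, start, new boolean[adjacency.size()], handler, controller);
    }

    private void visit(List<List<Integer>> adjacency, int atom, boolean[] seen,
                       TraversalHandler handler, TraversalController controller) {
        if (stopped || seen[atom]) {
            return;
        }
        seen[atom] = true;
        handler.atomReached(atom, controller);
        for (int next : adjacency.get(atom)) {
            visit(adjacency, next, seen, handler, controller);
        }
    }
}
```

A handler that only collects atoms ignores the controller; one that searches for a particular atom can call stop() as soon as it is found, so the algorithm itself never needs to know what the client is looking for.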
From: Joerg K. W. <we...@in...> - 2004-07-26 07:45:24
|
Hi Egon,

sorry for cross-posting, but this is interesting for all Java project admins.

Good old ANT. I've attached the build file; no, it's not in the CVS. I simply used all the old SF source-code releases to get an objective PMD measure. Unfortunately, setting properties seems to work only sometimes, and I don't understand why. So I started the script once for each PMD metric.

You can download the required libraries at:
http://sourceforge.net/project/showfiles.php?group_id=39708&package_id=108845

Kind regards, Joerg

> On Monday 26 July 2004 00:05, Joerg K. Wegner wrote:
>
>> - Extensive PMD, JavaNCSS offline statistics, so on long term runs there
>> are some non-functionality things to do ... just for improving the
>> design and becoming more user friendly :-)
>> http://www-ra.informatik.uni-tuebingen.de/software/joelib/statistics.html
>
> That's a nice page... how did you create those in-time statistics?
> Are the scripts in JOELib CVS?
>
> Egon

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: Egon W. <eg...@us...> - 2004-07-26 06:49:47
|
On Monday 26 July 2004 00:05, Joerg K. Wegner wrote:
> - Extensive PMD, JavaNCSS offline statistics, so on long term runs there
> are some non-functionality things to do ... just for improving the
> design and becoming more user friendly :-)
> http://www-ra.informatik.uni-tuebingen.de/software/joelib/statistics.html

That's a nice page... how did you create those in-time statistics?
Are the scripts in JOELib CVS?

Egon

--
eg...@sc...
GPG: 1024D/D6336BA6
|
From: Joerg K. W. <we...@in...> - 2004-07-25 22:02:53
|
Hi Egon, Hi all,

I've released a new JOELib version
http://sourceforge.net/projects/joelib/
with the following changes:

- Extended tutorial formally introducing molecular graphs. This is especially interesting for the Octet interface and the QSAR project.
  http://www-ra.informatik.uni-tuebingen.de/software/joelib/tutorial/JOELibPrimer.html
- E/Z SMARTS matching and improved autodetection for 2D/3D structures when the input is not SMILES.
- Extensive PMD, JavaNCSS offline statistics, so on long term runs there are some non-functionality things to do ... just for improving the design and becoming more user friendly :-)
  http://www-ra.informatik.uni-tuebingen.de/software/joelib/statistics.html

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: E.L. W. <eg...@sc...> - 2004-07-06 07:39:07
|
On Monday 05 July 2004 18:17, rich apodaca wrote:
> I would like to propose a change to the Molecule interface:
>
> (1) AtomGraph extends AtomCollection, AtomPairCollection
>     -AtomGraph.countNeighbors(Atom)
>     -AtomGraph.iterateNeighbors(Atom)
>
> (2) Molecule extends AtomGraph
>
> This would mean that Molecule and all implementors would define the two new
> methods shown above. This change will enable Traversers to work with
> Molecules and substructures through the unified AtomGraph interface without
> having to treat each as a special case.
>
> Any objections?

Sounds good.

Egon

--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
"Again a chemist did something useful with a computer"
|
From: rich a. <che...@ya...> - 2004-07-05 16:17:11
|
Hi All,

I would like to propose a change to the Molecule interface:

(1) AtomGraph extends AtomCollection, AtomPairCollection
    -AtomGraph.countNeighbors(Atom)
    -AtomGraph.iterateNeighbors(Atom)

(2) Molecule extends AtomGraph

This would mean that Molecule and all implementors would define the two new methods shown above. This change will enable Traversers to work with Molecules and substructures through the unified AtomGraph interface without having to treat each as a special case.

Any objections?

rich
|
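The proposed hierarchy could look roughly like the following in Java. This is a sketch under assumptions, not Octet's source: the members shown on Atom, AtomCollection and AtomPairCollection are guesses kept to a minimum, and SimpleAtom, AdjacencyMolecule and GraphTools are hypothetical classes added only so the example runs. Only the two neighbor methods and the extends relationships come from the proposal itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; Octet's real interfaces carry more methods.
interface Atom { String getSymbol(); }
interface AtomCollection { int countAtoms(); Iterator<Atom> iterateAtoms(); }
interface AtomPairCollection { int countAtomPairs(); }

// (1) The unified graph view with the two proposed methods.
interface AtomGraph extends AtomCollection, AtomPairCollection {
    int countNeighbors(Atom atom);
    Iterator<Atom> iterateNeighbors(Atom atom);
}

// (2) Every Molecule is an AtomGraph.
interface Molecule extends AtomGraph { }

final class SimpleAtom implements Atom {
    private final String symbol;
    SimpleAtom(String symbol) { this.symbol = symbol; }
    public String getSymbol() { return symbol; }
}

// Toy adjacency-list implementation, for illustration only.
final class AdjacencyMolecule implements Molecule {
    private final Map<Atom, List<Atom>> neighbors = new LinkedHashMap<>();
    private int pairs = 0;

    Atom addAtom(String symbol) {
        Atom atom = new SimpleAtom(symbol);
        neighbors.put(atom, new ArrayList<>());
        return atom;
    }

    void connect(Atom a, Atom b) {
        neighbors.get(a).add(b);
        neighbors.get(b).add(a);
        pairs++;
    }

    public int countAtoms() { return neighbors.size(); }
    public Iterator<Atom> iterateAtoms() { return neighbors.keySet().iterator(); }
    public int countAtomPairs() { return pairs; }
    public int countNeighbors(Atom atom) { return neighbors.get(atom).size(); }
    public Iterator<Atom> iterateNeighbors(Atom atom) { return neighbors.get(atom).iterator(); }
}

// A client written once against AtomGraph works for any implementor.
final class GraphTools {
    static int maxDegree(AtomGraph graph) {
        int max = 0;
        for (Iterator<Atom> it = graph.iterateAtoms(); it.hasNext();) {
            max = Math.max(max, graph.countNeighbors(it.next()));
        }
        return max;
    }
}
```

GraphTools.maxDegree never mentions Molecule, which is the benefit the proposal is after: a substructure type that implements AtomGraph could be passed to the same method unchanged.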
From: Joerg K. W. <we...@in...> - 2004-06-30 08:53:12
|
Dear developers, dear community,

As already mentioned, I have less time at the moment and can only guarantee to maintain JOELib as it should be maintained. I will also submit new functionality, such as maximum common substructure search, in 2-3 months; then we will have an extensively tested release.

Now my request for help to all developers and everyone who can advertise our projects. Please bear in mind that I plan to finish my PhD thesis by the middle of 2005, so we also need other people who are able to maintain JOELib, because my spare time is limited. I have no time for refactoring and maintenance, which is recommended over the long term.

I'm really interested in a QSAR data mining tool based on Java.
http://joelib.sf.net
http://qsar.sf.net
http://octet.sourceforge.net/
http://cdk.sf.net

And as you know, JOELib already has some
- descriptors and
- data mining tools (Weka interface and class structure containing molecules)
- basic algorithms, like sub-optimal distance matrix with O(N^3)
- Labeling functions for molecular graphs as atom labels (atom properties)
- Labeling functions for molecular graphs as bond labels (bond properties)

I'm not sure if Rich and Egon have the same opinion, but from my point of view it is crucially important that this community helps us to implement things. We, the project administrators, can assign tasks and help you understand the current packages, but we cannot implement everything we need on our own, as we do at the moment.

So the JOELib project especially needs more developers, and anybody who is willing to implement the basic Octet-JOELib interface as a starting point.

An example was already given at:
http://www-ra.informatik.uni-tuebingen.de/mitarb/wegner/tmp/octet/

The changes are found in CDKTools 0.2.1, which can be downloaded at:
http://sourceforge.net/project/showfiles.php?group_id=96108
Javadoc is at:
http://octet.sourceforge.net/api.html

What I would like to see in the interface is a structure analogous to the one shown here, especially with respect to the chemical expert systems (chemical kernel) in JOELib:

The following ASAP articles were posted to the ACS Web Edition of Journal of Chemical Information and Computer Sciences on June 17, 2004.
Title: Some Basic Data Structures and Algorithms for Chemical Generic Programming
Authors: Wei Zhang, Tingjun Hou, Xuebin Qiao, and Xiaojie Xu
Link: http://dx.doi.org/10.1021/ci049938s

Kind regards, Joerg Kurt Wegner

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: E.L. W. <eg...@sc...> - 2004-06-30 08:04:16
|
On Wednesday 30 June 2004 04:55, rich apodaca wrote:
> Egon, I realized that CDKMolecule would not work with MDLReader because I
> failed to override addBond(Bond). So I fixed that and now it works.
>
> The changes are found in CDKTools 0.2.1, which can be downloaded at:
> http://sourceforge.net/project/showfiles.php?group_id=96108
> Javadoc is at:
> http://octet.sourceforge.net/api.html

Ok, I'll have a look at it tomorrow...

> In addition, I added a unit test (MolfileTest.java) that loads a series of
> molfiles using one of three approaches: (1) a CDKMolecule using MDLReader;
> (2) a org.openscience.cdk.Molecule using MDLReader; and (3) a CDKMolecule
> using Octet's MolfileReader and CDKMoleculeBuilder.
>
> The isomorphism of molecule pairs (1)-(3) and (2)-(3) are verified with
> CDK's UniversalIsomorphismTester. Everything passes. I took the molfiles
> from CDK's "data" directory.

Nice.

> I think this little unit test is also a good demonstration of how to
> actually get Octet and CDK to work together. I like your idea of either
> posting this code or code like it on the QSAR site.

Yes, I'll see how that can be most easily accomplished...

> BTW, did you update all CDK io classes analogously to MDLReader?

No, not yet...

> If not,
> would it help to put this new behavior into DefaultChemObjectReader?

The read() methods are often too specific... so I would need to clean up the code much anyway...

> Or maybe add an explicit method like read(Molecule molecule)?

Those are mostly private... to simplify the API...

> Regarding where we go from here....
>
> I was interested in your thoughts on the dict idea. But I must confess, I
> still don't know what a dict is. It looks like it has something to do with
> defining molecular descriptors. But I'm struggling to understand how one
> would use it in QSAR.

A dictionary is a look-up table that allows a program or a user to be exactly sure what descriptorX is. It serves as a major documentation tool, but also provides unique identifiers that point to only one description of that descriptor. This should ensure that there is little ambiguity on what the descriptor is, and how it is calculated...

An example. Say 'partialAtomicCharge'... It is roughly possible to understand what this (atomic) descriptor is, but it does not say how it is calculated (Gasteiger charges or Gaussian03 charges?) and possibly what parameters are used to calculate it... (which electronegativity table or which basis set?)... A dictionary should clarify these things.

So given a certain descriptor name, each program implementing this descriptor from that dictionary should always give the exact same outcome. This ensures reproducibility of model building.

> Nevertheless, I think going in the direction of specifying the components
> and behaviors that go into descriptors is the next logical direction. I'm
> especially interested in drafting some kind of specification outlining the
> requirements for such a system - maybe using RFE. And of course, I'm keenly
> interested in reducing this spec. into a set of Java interfaces defined in
> terms of QSAR model-level objects. Octet functionality will probably need
> to be developed to fill in the gaps, which I'm happy to do.

I'm not sure I understand what you're trying to say here...

> If it is at all possible to take a composite approach where descriptors or
> descriptor components could easily be combined to make new descriptors, I
> think this could be useful.
>
> But my expertise in the area of molecular descriptors is about the same as
> Matt Foley's expertise as a motivational speaker. So - I'd like to get some
> perspectives on this from developers, potential developers, or amused
> onlookers.

:)

Egon

--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
"Again a chemist did something useful with a computer"
|
From: rich a. <che...@ya...> - 2004-06-30 02:55:45
|
Hello All,

Egon, I realized that CDKMolecule would not work with MDLReader because I failed to override addBond(Bond). So I fixed that and now it works.

The changes are found in CDKTools 0.2.1, which can be downloaded at:
http://sourceforge.net/project/showfiles.php?group_id=96108
Javadoc is at:
http://octet.sourceforge.net/api.html

In addition, I added a unit test (MolfileTest.java) that loads a series of molfiles using one of three approaches: (1) a CDKMolecule using MDLReader; (2) a org.openscience.cdk.Molecule using MDLReader; and (3) a CDKMolecule using Octet's MolfileReader and CDKMoleculeBuilder.

The isomorphism of molecule pairs (1)-(3) and (2)-(3) are verified with CDK's UniversalIsomorphismTester. Everything passes. I took the molfiles from CDK's "data" directory.

I think this little unit test is also a good demonstration of how to actually get Octet and CDK to work together. I like your idea of either posting this code or code like it on the QSAR site.

BTW, did you update all CDK io classes analogously to MDLReader? If not, would it help to put this new behavior into DefaultChemObjectReader? Or maybe add an explicit method like read(Molecule molecule)?

Regarding where we go from here....

I was interested in your thoughts on the dict idea. But I must confess, I still don't know what a dict is. It looks like it has something to do with defining molecular descriptors. But I'm struggling to understand how one would use it in QSAR.

Nevertheless, I think going in the direction of specifying the components and behaviors that go into descriptors is the next logical direction. I'm especially interested in drafting some kind of specification outlining the requirements for such a system - maybe using RFE. And of course, I'm keenly interested in reducing this spec. into a set of Java interfaces defined in terms of QSAR model-level objects. Octet functionality will probably need to be developed to fill in the gaps, which I'm happy to do.

If it is at all possible to take a composite approach where descriptors or descriptor components could easily be combined to make new descriptors, I think this could be useful.

But my expertise in the area of molecular descriptors is about the same as Matt Foley's expertise as a motivational speaker. So - I'd like to get some perspectives on this from developers, potential developers, or amused onlookers.

cheers,
rich

"E.L. Willighagen" <eg...@sc...> wrote:

> On Tuesday 29 June 2004 17:32, E.L. Willighagen wrote:
> > We've spoken about setting up a descriptor dictionary, and an example of
> > such a dictionary is on our website:
> >
> > http://qsar.sourceforge.net/dicts.html
> >
> > Let's consider the association in that dict as an descriptor, the config
> > file could look like:
> >
> > <calculateDescriptors>
> >   <descriptor dictRef="compendium:association"/>
> > </calculateDescriptors>
>
> To be a bit more explicit, I'm thinking things like
>
> <calculateDescriptors>
>   <descriptor dictRef="constitution:carbonCount"/>
> </calculateDescriptors>
>
> from a dict like
>
> <dictionary
>   xmlns="http://www.xml-cml.org/schema/cml2/core"
>   id="constitution" title="Constitution based Descriptors">
>   <description>
>     List of descriptors derived from the constitution of molecules.
>     Or whatever.
>   </description>
>   <entry id="carbonCount" term="Carbon Count">
>     <annotation>
>       <documentation>
>         <metadata name="dc:creator" content="QSAR Project"/>
>         <metadata name="dc:identifier" content="constitution:000001"/>
>         <metadata name="dc:contributor" content="Egon Willighagen"/>
>         <metadata name="dc:date" content="2004-06-29"/>
>       </documentation>
>     </annotation>
>     <definition>
>       The number of carbon atoms in the molecule.
>     </definition>
>     <description>
>       This descriptor was originally proposed by J.Doe in Some.J., 1896.
>       He showed that it had good correlation with the boiling point
>       of n-alkanes.
>     </description>
>   </entry>
> </dictionary>
>
> --
> eg...@sc...
> PhD on Molecular Representation in Chemometrics
> Nijmegen University
> http://www.cac.sci.kun.nl/people/egonw/
> GPG: 1024D/D6336BA6
> "Again a chemist did something useful with a computer"
|
From: Joerg K. W. <we...@in...> - 2004-06-29 16:16:15
|
Hi all,

> PS, Joerg, did you have time for JOELib bindings? I've forgotten the status of
> that... I could then directly convert the above idea into a working
> program...

Sorry, no. I have six students here working with JOELib, plus the project work. I also have a seminar on 'algorithms in drug design' starting this week, and such boring things as my PhD thesis. My plate is more than full at the moment, and the JOELib developer crew is not really growing the way I would prefer.

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi,2004)
|
From: E.L. W. <eg...@sc...> - 2004-06-29 15:45:48
|
On Tuesday 29 June 2004 17:32, E.L. Willighagen wrote:
> We've spoken about setting up a descriptor dictionary, and an example of
> such a dictionary is on our website:
>
> http://qsar.sourceforge.net/dicts.html
>
> Let's consider the association in that dict as an descriptor, the config
> file could look like:
>
> <calculateDescriptors>
>   <descriptor dictRef="compendium:association"/>
> </calculateDescriptors>

To be a bit more explicit, I'm thinking things like

<calculateDescriptors>
  <descriptor dictRef="constitution:carbonCount"/>
</calculateDescriptors>

from a dict like

<dictionary
  xmlns="http://www.xml-cml.org/schema/cml2/core"
  id="constitution" title="Constitution based Descriptors">
  <description>
    List of descriptors derived from the constitution of molecules.
    Or whatever.
  </description>
  <entry id="carbonCount" term="Carbon Count">
    <annotation>
      <documentation>
        <metadata name="dc:creator" content="QSAR Project"/>
        <metadata name="dc:identifier" content="constitution:000001"/>
        <metadata name="dc:contributor" content="Egon Willighagen"/>
        <metadata name="dc:date" content="2004-06-29"/>
      </documentation>
    </annotation>
    <definition>
      The number of carbon atoms in the molecule.
    </definition>
    <description>
      This descriptor was originally proposed by J.Doe in Some.J., 1896.
      He showed that it had good correlation with the boiling point
      of n-alkanes.
    </description>
  </entry>
</dictionary>

--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
"Again a chemist did something useful with a computer"
|
From: E.L. W. <eg...@sc...> - 2004-06-29 15:32:55
|
On Tuesday 29 June 2004 17:13, rich apodaca wrote:
> A new package is available that addresses CDK-Octet interoperability -
> CDKTools 0.2.0. It is a rewrite of version 0.0.1 and uses some new features
> of Octet. It can be downloaded here:
>
> http://sourceforge.net/project/showfiles.php?group_id=96108&package_id=116378
>
> Javadoc with detailed comments is available online:
>
> http://octet.sourceforge.net/api.html

Excellent! Thanx.

> Highlights:
>
> * A two-way adapter, CDKMolecule, inherits the CDK Molecule implementation
> and the Octet Molecule interface. This adapter molecule can be used
> simultaneously, with some limitations, by Octet and CDK clients. It can be
> directly built using the CDK API. Limited boundary checking is implemented.
>
> * A CDKMoleculeBuilder inherits the Octet MoleculeBuilder interface. It can
> be used, with some limitations, by any client expecting an Octet
> MoleculeBuilder. This allows clients to build a CDKMolecule using the Octet
> API.

Very good. What are the limitations you mention?

I've modified CDK's MDLReader such that it will use the Molecule that you pass it... so it can be used to parse into a CDKMolecule class... Have you tried that yet?

> * A bare-bones test suite. One method compares biphenyl created with CDK
> (via MoleculeFactory) and biphenyl created with Octet (via TestMolecules).
> CDK's UniversalIsomorphismTester compares the two molecules and finds them
> isomorphic.

Good. We should put such code examples on the website...

> Although this approach has its limitations, it appears quite viable. It's
> especially noteworthy that the differences between Octet and CDK models for
> molecular structure are encapsulated in the molecule creation process. This
> means that attempts to build CDKMolecules with a state that will be
> uninterpretable or inconsistent when viewed by either framework are caught
> at creation-time with Exceptions being thrown rather than at use-time when
> the error will be harder to trace.

Sounds like a good approach...

> I think the results of this approach bode well for QSAR, as it was proposed
> to use Octet as a starting point for the QSAR molecule API.

Ok, so it's now possible to do a few basic things:

- IO
- processing

What do you think should be the next step? A code example that takes an SDF file, and calculates descriptors for it?

We've spoken about setting up a descriptor dictionary, and an example of such a dictionary is on our website:

http://qsar.sourceforge.net/dicts.html

Let's consider the association in that dict as a descriptor; the config file could look like:

<calculateDescriptors>
  <descriptor dictRef="compendium:association"/>
</calculateDescriptors>

Any QSAR program would then know exactly what it should do, and whether it can do it or not, depending on whether it implements this specific descriptor...

How does that sound? If the XML structure of the above example config sounds reasonable, I'll write some code to parse such a file, and convert that into a simple Java structure, List<Descriptor> or so...

Egon

PS, Joerg, did you have time for JOELib bindings? I've forgotten the status of that... I could then directly convert the above idea into a working program...

--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
"Again a chemist did something useful with a computer"
|
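Parsing the proposed config file into a simple Java list could be sketched as below. The sketch uses the standard JAXP DOM parser that ships with the JDK; the class name DescriptorConfigSketch and the method readDictRefs are hypothetical, and the result is a list of dictRef strings rather than a real Descriptor type, since no such type is defined in the thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: collects the dictRef of every <descriptor> element.
final class DescriptorConfigSketch {

    static List<String> readDictRefs(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("descriptor");
            List<String> refs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // Each reference has the form "dictionaryId:entryId".
                refs.add(((Element) nodes.item(i)).getAttribute("dictRef"));
            }
            return refs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable descriptor config", e);
        }
    }
}
```

A program could then look up each returned reference, for example "constitution:carbonCount", in its own table of implemented descriptors and report the ones it does not support.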
From: rich a. <che...@ya...> - 2004-06-29 15:13:17
|
Hi All,

A new package is available that addresses CDK-Octet interoperability - CDKTools 0.2.0. It is a rewrite of version 0.0.1 and uses some new features of Octet. It can be downloaded here:

http://sourceforge.net/project/showfiles.php?group_id=96108&package_id=116378

Javadoc with detailed comments is available online:

http://octet.sourceforge.net/api.html

Highlights:

* A two-way adapter, CDKMolecule, inherits the CDK Molecule implementation and the Octet Molecule interface. This adapter molecule can be used simultaneously, with some limitations, by Octet and CDK clients. It can be directly built using the CDK API. Limited boundary checking is implemented.

* A CDKMoleculeBuilder inherits the Octet MoleculeBuilder interface. It can be used, with some limitations, by any client expecting an Octet MoleculeBuilder. This allows clients to build a CDKMolecule using the Octet API.

* A bare-bones test suite. One method compares biphenyl created with CDK (via MoleculeFactory) and biphenyl created with Octet (via TestMolecules). CDK's UniversalIsomorphismChecker compares the two molecules and finds them isomorphic.

Although this approach has its limitations, it appears quite viable. It's especially noteworthy that the differences between Octet and CDK models for molecular structure are encapsulated in the molecule creation process. This means that attempts to build CDKMolecules with a state that will be uninterpretable or inconsistent when viewed by either framework are caught at creation-time with Exceptions being thrown rather than at use-time when the error will be harder to trace.

I think the results of this approach bode well for QSAR, as it was proposed to use Octet as a starting point for the QSAR molecule API.

cheers,
rich
|
From: Peter Murray-R. <pm...@ca...> - 2004-05-27 09:48:14
|
At 22:06 26/05/2004 -0700, rich apodaca wrote:
>Hello All,
>
>Egon, I like your idea about an Octet-to-Octet converter. I liked it so much I decided to try my hand at implementing it (or part of it at least).
>
>The idea is simple: I added a static method ("copyMolecule") to MoleculeTools. It takes as parameters a Molecule and a MoleculeBuilder. It uses the Molecule as a template to provide instructions to the MoleculeBuilder that result in a copy of the molecule being produced.
>
>I have committed my changes (and unit tests - contained in MoleculeTest) to the Octet CVS.

Thanks - I haven't looked at this but your description makes sense. We clearly have to have translations and the more clearly this is encapsulated the safer it will be.

>Now, this is only part of the solution you were thinking about. The second part needs to come from the "compatibility layer." In particular, a CDKMoleculeBuilder needs to be developed. It is a class that implements the net.sourceforge.octet.molecule.MoleculeBuilder interface, and which probably would have an additional method that could return a CDKMolecule (say, "releaseCDKMolecule()"). With this class in place, one could then go:
>
>CDKMoleculeBuilder builder = ... ;
>Molecule molecule = ... ; // the molecule to be copied
>
>MoleculeKit.copyMolecule(molecule, builder);
>
>CDKMolecule cdkMolecule = builder.releaseCDKMolecule();
>
>// do something CDK-specific with cdkMolecule
>
>I imagine that there could be an analogous JOEMoleculeBuilder, JUMBOMoleculeBuilder, etc.

This makes sense. The JUMBOMoleculeBuilder will essentially use DOM get/set to create a CMLDOM. Since the methods depend on the actual schema used it may be necessary to use reflection to ask whether a method exists. I haven't used reflection (I read about it many years ago) but basically I anticipate something like the pseudo code:

read octetMolecule info
create CMLMolecule
iterate through octetMolecule.properties
    if cmlMolecule.hasProperty(p)
        cmlMolecule.setProperty(p)

The reflection might be done in a generic cml.setProperty() - each class already has setAttribute(foo, bar) calls, which are then rerouted to setFoo(bar) if they exist.

P.

>The nice thing about this solution is that the same code, namely the MoleculeKit.copyMolecule() method, can be used to faithfully copy any Molecule implementation to any other for which the MoleculeBuilder has been written. Only one copy method need be written, tested, and debugged - which I have done to a first approximation. Also, since only Octet interfaces are involved, this places control of the process pretty much in the hands of the compatibility layer, rather than QSAR - which is where I think it belongs.
>
>As an added benefit, this CDKMoleculeBuilder, once developed, could be used seamlessly within the Octet IO package or with TestMolecules - basically anywhere that a Molecule needs to be constructed.
>
>In summary, there are probably two key interfaces that a QSAR compatibility layer may need to implement for maximum utility: Molecule and MoleculeBuilder. I am, of course, willing to help with writing any of these that might be needed by QSAR-aware projects.
>
>cheers,
>rich
>
>
>"E.L. Willighagen" <eg...@sc...> wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>On Wednesday 26 May 2004 07:16, rich apodaca wrote:
> > Peter, you raise an interesting point about simultaneously being able to use functionality in multiple toolkits. I hadn't considered simultaneous use of more than two toolkits.
> >
> > Because Java permits only multiple inheritance of interfaces, I can't imagine that a three-way or four-way Adapter Pattern could be used - both JOELib and CDK define model-level objects in terms of classes rather than interfaces.
> >
> > Nevertheless, the two-way Adapter Pattern is just one way to try to achieve interoperability. And it may not be the best solution in every situation.
> >
> > Another approach to this problem is to develop a translator class that will use one toolkit's molecule as a template for building another toolkit's molecule. I included a skeletal implementation of this concept in Octet's CDKTools package (CDKKit).
>
>I've been thinking about this:
>
>an Octet2Octet converter... i.e. a class that takes the input from a class implementing a certain octet interface, and stores it into a second (possibly different) implementation of that octet interface, obviously using this given interface...
>
>Then you only need one translator for each octet interface, no matter how many implementations there are... It also ensures that no information will be converted outside this interface.
>
>Egon
>
>- --
>eg...@sc...
>PhD on Molecular Representation in Chemometrics
>Nijmegen University
>http://www.cac.sci.kun.nl/people/egonw/
>GPG: 1024D/D6336BA6
>
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.0.7 (SunOS)
>
>iD8DBQFAtE9zd9R8I9Yza6YRAv3zAJ948c5BaiuNIsrf/X2CAok5yDdTgQCfUvba
>CYKGD/HQdNSLaJWOl1tGmWc=
>=vw/p
>-----END PGP SIGNATURE-----

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
|
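Peter's pseudo code for rerouting a generic property to a setFoo(bar) method, if one exists, might look like the following in Java. This is a minimal sketch, not JUMBO code: DemoCmlMolecule and ReflectiveSetter are hypothetical names made up for this illustration, and only String-valued setters are considered. The reflection calls (Class.getMethod, Method.invoke) are from the standard java.lang.reflect API.

```java
import java.lang.reflect.Method;
import java.util.Map;

/** Hypothetical stand-in for a schema-specific CML class; not part of JUMBO. */
class DemoCmlMolecule {
    private String title;
    private String formula;

    public void setTitle(String title) { this.title = title; }
    public void setFormula(String formula) { this.formula = formula; }
    public String getTitle() { return title; }
    public String getFormula() { return formula; }
}

final class ReflectiveSetter {
    /** Calls setXxx(String) on the target if such a method exists; returns false otherwise. */
    static boolean setProperty(Object target, String name, String value) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        try {
            Method method = target.getClass().getMethod(setter, String.class);
            method.invoke(target, value);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // the target's schema has no such property, so skip it
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("calling " + setter + " failed", e);
        }
    }

    /** Copies every property the target understands; returns how many were applied. */
    static int copyProperties(Map<String, String> properties, Object target) {
        int applied = 0;
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            if (setProperty(target, entry.getKey(), entry.getValue())) {
                applied++;
            }
        }
        return applied;
    }
}
```

Properties the target class does not know about are skipped silently here; a real translator would probably want to log or report them so that information loss is visible.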
From: Peter Murray-R. <pm...@ca...> - 2004-05-27 07:52:49
|
At 14:33 23/05/2004 -0700, rich apodaca wrote:
>Hello All,
>
>Joerg, many thanks for putting the "octetInterface.zip" package together. I had a chance to look through it and had some comments and questions.
>
>I very much agree that QSAR will need some mechanism for assigning (and retrieving) properties to atoms, molecules, bonding systems, and other model-level objects.
>
>However, I do not agree that the place to put this functionality is in the model-level objects themselves.

I don't have strong views either way - I have tried both...

>Let me explain:
>
>To me, the model-level classes Molecule, Atom, AtomPair, and BondingSystem are intended to model, in an approximate way, the physical entities they represent. Of course, the model is not intended to represent "reality", but more something along the lines of a cartoon. The bottom line is that I want developers who know about chemistry and molecular structure to feel at home as much as possible with the Octet API.
>
>The inclusion of methods in model-level objects that allow for the storage/retrieval of arbitrary data obscures this intent to some extent because physical molecules don't store generic data - people store generic data about molecules and that data has different meanings depending on context.
>
>Aside from design aesthetics (which, of course, are highly subjective :-) ), there are practical issues as well.
>
>One of the considerations that went into designing Octet is whether model-level objects should be mutable or not. Octet provides read-only interface definitions because this enables clients to make some very useful simplifying assumptions about Molecule/Atom/AtomPair/BondingSystem behavior.

I accept this - hopefully we can all converge on a *minimalist* interface that we all need. That doesn't mean that we cannot design a larger interface.

>If model-level classes implement the PropertyAcceptor interface, then client A is free to change the properties of Molecule X without client B knowing about the change. To build a robust system, client B has to assume Molecule X can change at any time, and so has to make a defensive copy of Molecule X, which can be a substantial performance penalty (and an easy thing to forget). This needn't be limited to using threads, either. Any time a client needs to hang onto an instance of Molecule (for example, in a Hashtable), this problem will rear its ugly head.

True in principle. However I would program defensively and have A and B create independent copies at the start. It is too difficult (for me) to keep track of when A and B have independently updated objects. This is one reason why all my interaction with CDK is via XML. It has the downside of performance, but it exactly encapsulates the state of the molecule and effectively creates a copy. Otherwise I would spend the time worrying whether I had updated property X on the CDK side or the CML side.

>Additionally, inheriting the PropertyAcceptor interface would place an additional burden on implementors of the QSAR interfaces. I'm willing to do this, but only as a last resort.
>
>What do you think of an alternative approach?
>
>Let's say I'm writing a TPSA implementation (TPSACalculator) for the QSAR project. I need to store a property, for example, indicating whether an atom is an sp2 nitrogen.

This is a good example

>The approach in octetInterface.zip, if I understand correctly, would be to store that property in the Atom itself.

This is the natural way I would do it today. It has the advantage that we can forget whether we have calculated it. I have atomTool.getGeometricalHybridisation() and if this is called repeatedly then it can hold a cached value as long as the ligands are not changed.

>What if, instead, my TPSACalculator simply maintained a boolean array called sp2nitrogen with a size equal to the number of atoms in a given molecule.
>
>My calculator would simply iterate over all Atoms in the Molecule checking for sp2 nitrogen. When one is found, let's say at index n, sp2nitrogen[n] is set to true. At the end of this process I have a boolean array whose elements tell me the indices of the sp2-nitrogen atoms in the Molecule. Is this not the same result as Atom inheriting the PropertyAcceptor interface and using atom.setProperty()?
>
>This approach is flexible in that any data structure supported by Java - collections, arrays, 3rd party libraries - can be used. And it does this without changing the Atom interface or requiring a defensive copy by TPSACalculator.
>
>So, the approach I'm suggesting is to rigorously separate data storage from model-level interface definition.

We have also debated this today. I think the balance is whether the calculated objects are persistent and whether there is a performance hit.

>The downside of this approach is that it requires more work by clients to maintain their data.
>
>There are many ways to address this issue. One solution would be to define a special data storage class whose API is designed to associate generic data with model-level objects.
>
>Another, more integrated, approach would be to use the Decorator Pattern (see: http://c2.com/cgi/wiki?DecoratorPattern).
>
>Structure (http://structure.sourceforge.net) uses this approach to associate 2-D atom coordinates with Atoms. In a nutshell,
>
>- net.sourceforge.octet.extension.MoleculeDecorator is a concrete convenience class that implements the Molecule interface by passing all method calls through to a private instance of the Molecule it decorates.
>
>- net.sourceforge.structure.molecule.BasicMolecule2D extends MoleculeDecorator, adding methods for 2-D coordinate manipulation.

We use basically the same Decorator approach (I think!) in JUMBO Tools:

You have:

double BasicMolecule2D.getX(net.sourceforge.octet.molecule.Atom atom)

JUMBO has:

AtomTool atomTool = atom.getTool();
Point3 point = atomTool.getXYZ3();

which provides the same approach (although some might call this an adapter).

>Clients then go:
>
>Molecule mol = ...; // get the molecule from somewhere
>
>Molecule2D mol2d = new Molecule2D(mol);
>
>mol2d.move(mol2d.getAtom(0), 0.0, 0.0);
>
>// etc...
>
>Getting back to octetInterface... If we absolutely need to have a Molecule that can store properties, how about defining a class like:
>
>public class PropertyMolecule extends net.sourceforge.octet.extensions.MoleculeDecorator
> implements PropertyAcceptor
>{
> // implement the PropertyAcceptor interface
>}
>
>(Aside: Maybe a set of methods like PropertyAcceptor.setProperty(Atom atom, PropertyKey key) would be useful here. Also, to avoid the defensive copy situation above, PropertyAcceptor could then define PropertyAcceptor.addPropertiesListener()).
>
>Now this frees implementors of the Molecule interface from having to support the PropertiesAcceptor interface. It also fosters a "pay-as-you-go" approach to interface complexity and overhead. If a client needs this functionality, then they have to pay for it, but we don't compel clients to pay for what they don't need. We can get ease of use, simplicity, and high performance at the same time.

Sounds reasonable. My general inclination is now to build some agreed important functionality that we can integrate and see how well it works. I think TPSA is an important one. I'd also like to see the basic topology (SSSR, etc.) and I'll throw in geometricHybridisation, atomParities, etc., assign bondOrders, layout2D, which are at the top of my current priorities.

Best

P.

>

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
|
From: rich a. <che...@ya...> - 2004-05-27 05:06:33
|
Hello All,

Egon, I like your idea about an Octet-to-Octet converter. I liked it so much I decided to try my hand at implementing it (or part of it at least).

The idea is simple: I added a static method ("copyMolecule") to MoleculeTools. It takes as parameters a Molecule and a MoleculeBuilder. It uses the Molecule as a template to provide instructions to the MoleculeBuilder that result in a copy of the molecule being produced.

I have committed my changes (and unit tests - contained in MoleculeTest) to the Octet CVS.

Now, this is only part of the solution you were thinking about. The second part needs to come from the "compatibility layer." In particular, a CDKMoleculeBuilder needs to be developed. It is a class that implements the net.sourceforge.octet.molecule.MoleculeBuilder interface, and which probably would have an additional method that could return a CDKMolecule (say, "releaseCDKMolecule()"). With this class in place, one could then go:

CDKMoleculeBuilder builder = ... ;
Molecule molecule = ... ; // the molecule to be copied

MoleculeKit.copyMolecule(molecule, builder);

CDKMolecule cdkMolecule = builder.releaseCDKMolecule();

// do something CDK-specific with cdkMolecule

I imagine that there could be an analogous JOEMoleculeBuilder, JUMBOMoleculeBuilder, etc.

The nice thing about this solution is that the same code, namely the MoleculeKit.copyMolecule() method, can be used to faithfully copy any Molecule implementation to any other for which the MoleculeBuilder has been written. Only one copy method need be written, tested, and debugged - which I have done to a first approximation. Also, since only Octet interfaces are involved, this places control of the process pretty much in the hands of the compatibility layer, rather than QSAR - which is where I think it belongs.

As an added benefit, this CDKMoleculeBuilder, once developed, could be used seamlessly within the Octet IO package or with TestMolecules - basically anywhere that a Molecule needs to be constructed.

In summary, there are probably two key interfaces that a QSAR compatibility layer may need to implement for maximum utility: Molecule and MoleculeBuilder. I am, of course, willing to help with writing any of these that might be needed by QSAR-aware projects.

cheers,
rich

"E.L. Willighagen" <eg...@sc...> wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 26 May 2004 07:16, rich apodaca wrote:
> Peter, you raise an interesting point about simultaneously being able to use functionality in multiple toolkits. I hadn't considered simultaneous use of more than two toolkits.
>
> Because Java permits only multiple inheritance of interfaces, I can't imagine that a three-way or four-way Adapter Pattern could be used - both JOELib and CDK define model-level objects in terms of classes rather than interfaces.
>
> Nevertheless, the two-way Adapter Pattern is just one way to try to achieve interoperability. And it may not be the best solution in every situation.
>
> Another approach to this problem is to develop a translator class that will use one toolkit's molecule as a template for building another toolkit's molecule. I included a skeletal implementation of this concept in Octet's CDKTools package (CDKKit).

I've been thinking about this:

an Octet2Octet converter... i.e. a class that takes the input from a class implementing a certain octet interface, and stores it into a second (possibly different) implementation of that octet interface, obviously using this given interface...

Then you only need one translator for each octet interface, no matter how many implementations there are... It also ensures that no information will be converted outside this interface.

Egon

- --
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (SunOS)

iD8DBQFAtE9zd9R8I9Yza6YRAv3zAJ948c5BaiuNIsrf/X2CAok5yDdTgQCfUvba
CYKGD/HQdNSLaJWOl1tGmWc=
=vw/p
-----END PGP SIGNATURE-----
|
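The copy-through-a-builder idea in the message above can be illustrated with a small Java sketch. It deliberately does not use the real Octet API: SketchMolecule, SketchMoleculeBuilder, ListMoleculeBuilder and MoleculeCopier are hypothetical, much-reduced stand-ins (atoms are just element symbols, bonds just index pairs), chosen only to show that a single copy routine written against two interfaces works for any implementation of them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical read-only molecule view; much smaller than Octet's Molecule interface. */
interface SketchMolecule {
    int countAtoms();
    String getSymbol(int atomIndex);
    int countBonds();
    int getBondSource(int bondIndex);
    int getBondTarget(int bondIndex);
}

/** Hypothetical builder interface standing in for a toolkit-specific MoleculeBuilder. */
interface SketchMoleculeBuilder {
    int addAtom(String symbol);            // returns the index of the new atom
    void connect(int source, int target);  // adds a bond between two existing atoms
}

/** One possible builder implementation, backed by plain lists. */
final class ListMoleculeBuilder implements SketchMoleculeBuilder {
    private final List<String> symbols = new ArrayList<>();
    private final List<int[]> bonds = new ArrayList<>();

    @Override public int addAtom(String symbol) {
        symbols.add(symbol);
        return symbols.size() - 1;
    }

    @Override public void connect(int source, int target) {
        // Reject inconsistent state at creation time rather than at use time.
        if (source == target || source < 0 || target < 0
                || source >= symbols.size() || target >= symbols.size()) {
            throw new IllegalArgumentException("invalid bond " + source + "-" + target);
        }
        bonds.add(new int[] {source, target});
    }

    /** Hands out a read-only snapshot of what has been built so far. */
    SketchMolecule release() {
        final List<String> atomCopy = new ArrayList<>(symbols);
        final List<int[]> bondCopy = new ArrayList<>(bonds);
        return new SketchMolecule() {
            @Override public int countAtoms() { return atomCopy.size(); }
            @Override public String getSymbol(int atomIndex) { return atomCopy.get(atomIndex); }
            @Override public int countBonds() { return bondCopy.size(); }
            @Override public int getBondSource(int bondIndex) { return bondCopy.get(bondIndex)[0]; }
            @Override public int getBondTarget(int bondIndex) { return bondCopy.get(bondIndex)[1]; }
        };
    }
}

final class MoleculeCopier {
    /** Replays the template into the builder using only the two interfaces. */
    static void copyMolecule(SketchMolecule template, SketchMoleculeBuilder builder) {
        int[] newIndex = new int[template.countAtoms()];
        for (int i = 0; i < template.countAtoms(); i++) {
            newIndex[i] = builder.addAtom(template.getSymbol(i));
        }
        for (int j = 0; j < template.countBonds(); j++) {
            builder.connect(newIndex[template.getBondSource(j)],
                            newIndex[template.getBondTarget(j)]);
        }
    }
}
```

Because copyMolecule never looks behind the interfaces, a second builder implementation for another toolkit could be dropped in without touching the copy logic, which is the point Egon and Rich make about needing only one translator per interface.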
From: <ben...@id...> - 2004-05-25 09:21:11
|
Dear Open Source developer I am doing a research project on "Fun and Software Development" in which I kindly invite you to participate. You will find the online survey under http://fasd.ethz.ch/qsf/. The questionnaire consists of 53 questions and you will need about 15 minutes to complete it. With the FASD project (Fun and Software Development) we want to define the motivational significance of fun when software developers decide to engage in Open Source projects. What is special about our research project is that a similar survey is planned with software developers in commercial firms. This procedure allows the immediate comparison between the involved individuals and the conditions of production of these two development models. Thus we hope to obtain substantial new insights to the phenomenon of Open Source Development. With many thanks for your participation, Benno Luthiger PS: The results of the survey will be published under http://www.isu.unizh.ch/fuehrung/blprojects/FASD/. We have set up the mailing list fa...@we... for this study. Please see http://fasd.ethz.ch/qsf/mailinglist_en.html for registration to this mailing list. _______________________________________________________________________ Benno Luthiger Swiss Federal Institute of Technology Zurich 8092 Zurich Mail: benno.luthiger(at)id.ethz.ch _______________________________________________________________________ |
From: Joerg K. W. <we...@in...> - 2004-05-24 07:36:00
|
Hi Rich,

> What if, instead, my TPSACalculator simply maintained a boolean array called sp2nitrogen with a size equal to the number of atoms in a given molecule.
> My calculator would simply iterate over all Atoms in the Molecule checking for sp2 nitrogen. When one is found, let's say at index n, sp2nitrogen[n] is set to true. At the end of this process I have a boolean array whose elements tell me the indices of the sp2-nitrogen atoms in the Molecule. Is this not the same result as Atom inheriting the PropertyAcceptor interface and using atom.setProperty()?

I think there is also a pattern for this array access via index and wrapping it in an object, Simple... I can't remember the name. But I agree completely that we should have the ability to use arrays here (SparseWrapper?).

Sure, you can produce the same result with a meta-layer, e.g. as in JOELib: store the array in the molecule and access it via getProperty(SP2ArrayKey).get(atomIndex). So it's just a question of which design you're more experienced with.

What I'm most interested in is giving the interface a common way to access properties. I don't care which way, so feel free to suggest another approach.

> So, the approach I'm suggesting is to rigorously separate data storage from model-level interface definition.

Amazing ... simplicity, complexity-if-you-want and speed ... all in one ... sounds fantastic! So change these things!!!

Kind regards, Joerg

> Another, more integrated, approach would be to use the Decorator Pattern (see: http://c2.com/cgi/wiki?DecoratorPattern).
> Structure (http://structure.sourceforge.net) uses this approach to associate 2-D atom coordinates with Atoms. In a nutshell,
> - net.sourceforge.octet.extension.MoleculeDecorator is a concrete convenience class that implements the Molecule interface by passing all method calls through to a private instance of the Molecule it decorates.
> - net.sourceforge.structure.molecule.BasicMolecule2D extends MoleculeDecorator, adding methods for 2-D coordinate manipulation. > > Clients then go: > > Molecule mol = ...; // get the molecule from somewhere > Molecule2D mol2d = new Molecule2D(mol); > mol2d.move(mol2d.getAtom(0), 0.0, 0.0); > > // etc... > > Getting back to octetInterface... If we absolutely need to have a Molecule that can store properties, how about defining a class like: > > public class PropertyMolecule extends net.sourceforge.octet.extensions.MoleculeDecorator > implements PropertyAcceptor > { > // implement the PropertyAcceptor interface > } > > (Aside: Maybe a set of methods like PropertyAcceptor.setProperty(Atom atom, PropertyKey key) would be useful here. Also, to avoid the defensive copy situation above, PropertyAcceptor could then define PropertyAcceptor.addPropertiesListener()). > > Now this frees implementors of the Molecule interface from having to support the PropertiesAcceptor interface. It also fosters a "pay-as-you-go" approach to interface complexity and overhead. If a client needs this functionality, then they have to pay for it, but we don't compel clients to pay for what they don't need. We can get ease of use, simplicity, and high performance at the same time. > > cheers, > rich > > "Joerg K. Wegner" <we...@in...> wrote: > Hi all, > > so here is the first version, of the 'combined' octet interface: > (ups, our server seems to be down, please try again in a few hours) > http://www-ra.informatik.uni-tuebingen.de/mitarb/wegner/tmp/octet/octetInterface.zip > > TECHNICAL: > It has the following structure: > cdk > joelib > octet > octet4CDK > octet4JOELib > octetImplementations > > CDK, JOELib, Octet should work on their own via 'ant compile' > > octet4CDK,octet4JOELib requires Octet/CDK or Octet/JOELib they catch on > their own > > octetImplementations Octet/CDK/JOELib and contains new implementations > for Octet. So this part of the project combines all efforts !!! 
> > DESIGN: > I've added to octet/properties > AssignmentFactory.java > DataFactory.java > Property.java > PropertyAcceptor.java > PropertyKey.java > PropertyKeyFactory.java > PropertyKeySet.java > PropertySet.java > PropertyVersion.java > > So all objects which can accept properties can extend PropertyAcceptor. > I've added them to Atom and Molecule, so the CDK interface will not work > any more, also the basic implementations which are part of > octetImplementations (i've moved them from octet). > The octet4JOELib part contains until now no functionality. > I will also add/change the substructure parts to Octet, but not this > week :-) > > So, Rich AND Egon, if you agree to this structure, then we should add > this first version to Octet-CVS, otherwise change all things you like > and send me a link to download the changed version. > (I hate CVS and it's directory problem) > > Please read the API docu, so i must not explain things twice in this > e-mail, please complain anything you do not like in the actual design. > I've used a really widely usable Object storing mechanism, and are > forcing property-version methods. Also we should remove things like in Atom: > countElectrons > getLabel > Because this are simply two Property's for an Atom, so we are working on > attributed molecular graphs. To get the values, these definitions should > be part of the > PropertyKeyFactory > adding methods, like > public PropertyKey getElectronsKey(); > public PropertyKey getAtomLabelKey(); // what's meant by label, isn't > > // this is too general here? > > Kind regads, Joerg > -- Dipl. Chem. Joerg K. Wegner Center of Bioinformatics Tuebingen (ZBIT) Department of Computer Architecture Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany Phone: (+49/0) 7071 29 78970 Fax: (+49/0) 7071 29 5091 E-Mail: mailto:we...@in... WWW: http://www-ra.informatik.uni-tuebingen.de -- Never mistake motion for action. (E. Hemingway) Never mistake action for meaningful action. (Hugo Kubinyi,2004) |
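The externally stored boolean array discussed in this exchange can be sketched as below. The names are hypothetical: SketchAtom is a toy record invented here whose hybridisation is assumed to have been assigned already (perceiving sp2 nitrogens is the hard part and is not shown), and AtomFlags.mark is an illustrative helper, not an Octet or JOELib method.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical atom record; the hybridisation is assumed to be known already. */
final class SketchAtom {
    final String symbol;
    final String hybridisation;

    SketchAtom(String symbol, String hybridisation) {
        this.symbol = symbol;
        this.hybridisation = hybridisation;
    }
}

final class AtomFlags {
    /**
     * Builds an array parallel to the atom list: flags[i] is true exactly
     * when atoms.get(i) satisfies the test. The atoms themselves are not modified.
     */
    static <A> boolean[] mark(List<A> atoms, Predicate<A> test) {
        boolean[] flags = new boolean[atoms.size()];
        for (int i = 0; i < flags.length; i++) {
            flags[i] = test.test(atoms.get(i));
        }
        return flags;
    }
}
```

A calculator would call something like `AtomFlags.mark(atoms, a -> a.symbol.equals("N") && a.hybridisation.equals("sp2"))` once and keep the result privately. The array is only valid as long as the atom order of the molecule does not change, which is one reason the read-only interfaces matter for this design.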
From: rich a. <che...@ya...> - 2004-05-23 21:33:43
|
Hello All, Joerg, many thanks for putting the "octetInterface.zip) package together. I had a chance to look through it and had some comments and questions. I very much agree that QSAR will need some mechanism for assigning (and retreiving) properties to atoms, molecules, bonding systems, and other model-level objects. However, I do not agree that the place to put this functionality is in the model-level objects themselves. Let me explain: To me, the model-level classes Molecule, Atom, AtomPair, and BondingSystem are intended to model, in an approximate way, the physical entities they represent. Of course, the model is not intended to represent "reality", but more something along the lines of a cartoon. The bottom line is that I want developers who know about chemistry and molecular structure to feel at home as much as possible with the Octet API. The inclusion of methods in model-level objects that allow for the storage/retreival of arbitrary data obscures this intent to some extent because physical molecules don't store generic data - people store generic data about molecules and that data has different meanings depending on context. Aside from design aesthetics (which, of course, are highly subjective :-) ), there are practical issues as well. One of the considerations that went into designing Octet is whether model-level objects should be mutable or not. Octet provides read-only interface definitions because this enables clients to make some very useful simplifying assumptions about Molecule/Atom/AtomPair/BondingSystem behavior. If model-level classes implement the PropertyAcceptor interface, then client A is free to change the properties of Molecule X without client B knowing about the change. To build a robust system, client B has to assume Molecule X can change at any time, and so has to make a defensive copy of Molecule X, which can be a substantial performance penalty (and an easy thing to forget). This needn't be limited to using threads, either. 
Any time a client needs to hang onto an instance of Molecule (for example, in a Hashtable), this problem will rear its ugly head. Additionally, inheriting the PropertyAcceptor interface would place an additional burden on implementors of the QSAR interfaces. I'm willing to do this, but only as a last resort. What do you think of an alternative approach? Let's say I'm writing a TPSA implementation (TPSACalculator) for the QSAR project. I need to store a property, for example, indicating whether an atom is an sp2 nitrogen. The approach in octetInterface.zip, if I understand correctly, would be to store that property in the Atom itself. What if, instead, my TPSACalculator simply maintained a boolean array called sp2nitrogen with a size equal to the number of atoms in a given molecule. My calculator would simply iterate over all Atoms int the Molecule checking for sp2 nitrogen. When one is found, lets say at index n, sp2nitrogen[n] is set to true. At the end of this process I have a boolean array whose elements tell me the indicies of the sp2-nitrogen atoms in the Molecule. Is this not the same result as Atom inheriting the PropertyAcceptor interface and using atom.setProperty()? This approach is flexible in that any data structure supported by Java - collections, arrays, 3rd party libraries can be used. And it does this without changing the Atom interface or requiring a defensive copy by TPSACalculator. So, the approach I'm suggesting is to rigorously separate data storage from model-level interface definition. The downside of this approach is that it requires more work by clients to maintain their data. There are many ways to address this issue. One solution would be to define a special data storage class whose API is designed to assiciate generic data with model-level objects. Another, more integrated, approach would be to use the Decorator Pattern (see: http://c2.com/cgi/wiki?DecoratorPattern). 
Structure (http://structure.sourceforge.net) uses this approach to associate 2-D atom coordinates with Atoms. In a nutshell:

- net.sourceforge.octet.extension.MoleculeDecorator is a concrete convenience class that implements the Molecule interface by passing all method calls through to a private instance of the Molecule it decorates.
- net.sourceforge.structure.molecule.BasicMolecule2D extends MoleculeDecorator, adding methods for 2-D coordinate manipulation.

Clients then go:

    Molecule mol = ...; // get the molecule from somewhere
    Molecule2D mol2d = new Molecule2D(mol);
    mol2d.move(mol2d.getAtom(0), 0.0, 0.0);
    // etc...

Getting back to octetInterface... If we absolutely need to have a Molecule that can store properties, how about defining a class like:

    public class PropertyMolecule
            extends net.sourceforge.octet.extensions.MoleculeDecorator
            implements PropertyAcceptor
    {
        // implement the PropertyAcceptor interface
    }

(Aside: Maybe a set of methods like PropertyAcceptor.setProperty(Atom atom, PropertyKey key) would be useful here. Also, to avoid the defensive copy situation above, PropertyAcceptor could then define PropertyAcceptor.addPropertiesListener().)

Now this frees implementors of the Molecule interface from having to support the PropertyAcceptor interface. It also fosters a "pay-as-you-go" approach to interface complexity and overhead. If a client needs this functionality, then they have to pay for it, but we don't compel clients to pay for what they don't need. We can get ease of use, simplicity, and high performance at the same time.

cheers,
rich

"Joerg K. Wegner" <we...@in...> wrote:

Hi all,

so here is the first version of the 'combined' octet interface (oops, our server seems to be down, please try again in a few hours):
http://www-ra.informatik.uni-tuebingen.de/mitarb/wegner/tmp/octet/octetInterface.zip

TECHNICAL:
It has the following structure:

cdk
joelib
octet
octet4CDK
octet4JOELib
octetImplementations

CDK, JOELib and Octet should work on their own via 'ant compile'.
octet4CDK and octet4JOELib require Octet/CDK or Octet/JOELib; they catch on their own.
octetImplementations requires Octet/CDK/JOELib and contains new implementations for Octet. So this part of the project combines all efforts !!!

DESIGN:
I've added to octet/properties:

AssignmentFactory.java
DataFactory.java
Property.java
PropertyAcceptor.java
PropertyKey.java
PropertyKeyFactory.java
PropertyKeySet.java
PropertySet.java
PropertyVersion.java

So all objects which can accept properties can extend PropertyAcceptor. I've added it to Atom and Molecule, so the CDK interface will not work any more; the same goes for the basic implementations which are part of octetImplementations (I've moved them from octet). So far, the octet4JOELib part contains no functionality. I will also add/change the substructure parts to Octet, but not this week :-)

So, Rich AND Egon, if you agree to this structure, then we should add this first version to Octet-CVS; otherwise change all things you like and send me a link to download the changed version. (I hate CVS and its directory problem.)

Please read the API documentation so that I need not explain things twice in this e-mail, and please complain about anything you do not like in the current design. I've used a really widely usable Object storing mechanism, and am enforcing property-version methods.

Also, we should remove things like these in Atom:

countElectrons
getLabel

Because these are simply two Properties for an Atom, so we are working on attributed molecular graphs. To get the values, these definitions should be part of the PropertyKeyFactory, adding methods like

    public PropertyKey getElectronsKey();
    public PropertyKey getAtomLabelKey(); // what's meant by label, isn't
                                          // this too general here?

Kind regards, Joerg

--
Dipl. Chem. Joerg K. Wegner
Center of Bioinformatics Tuebingen (ZBIT)
Department of Computer Architecture
Univ. Tuebingen, Sand 1, D-72076 Tuebingen, Germany
Phone: (+49/0) 7071 29 78970
Fax: (+49/0) 7071 29 5091
E-Mail: mailto:we...@in...
WWW: http://www-ra.informatik.uni-tuebingen.de
--
Never mistake motion for action. (E. Hemingway)
Never mistake action for meaningful action. (Hugo Kubinyi, 2004)
|
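The side-array idea in Rich's reply above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Octet code: the Atom and Molecule interfaces below are hypothetical stand-ins (only getAtom(int) is visible in the thread), and getSymbol(), isSp2(), countAtoms(), SimpleAtom, SimpleMolecule and Sp2NitrogenFlags are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the model-level interfaces. The real Octet
// Atom and Molecule interfaces differ; only getAtom(int) appears in the thread.
interface Atom {
    String getSymbol();   // assumed accessor
    boolean isSp2();      // assumed accessor, for illustration only
}

interface Molecule {
    int countAtoms();     // assumed accessor
    Atom getAtom(int index);
}

// Trivial implementations so the sketch runs on its own.
final class SimpleAtom implements Atom {
    private final String symbol;
    private final boolean sp2;

    SimpleAtom(String symbol, boolean sp2) {
        this.symbol = symbol;
        this.sp2 = sp2;
    }

    public String getSymbol() { return symbol; }
    public boolean isSp2() { return sp2; }
}

final class SimpleMolecule implements Molecule {
    private final List<Atom> atoms;

    SimpleMolecule(Atom... atoms) {
        this.atoms = Arrays.asList(atoms);
    }

    public int countAtoms() { return atoms.size(); }
    public Atom getAtom(int index) { return atoms.get(index); }
}

// The calculator owns its data: one flag per atom index, kept outside the
// Atom, so the model interfaces need no setProperty() method.
final class Sp2NitrogenFlags {
    private final boolean[] sp2nitrogen;

    Sp2NitrogenFlags(Molecule molecule) {
        sp2nitrogen = new boolean[molecule.countAtoms()];
        for (int n = 0; n < sp2nitrogen.length; n++) {
            Atom atom = molecule.getAtom(n);
            sp2nitrogen[n] = "N".equals(atom.getSymbol()) && atom.isSp2();
        }
    }

    boolean isSp2Nitrogen(int atomIndex) { return sp2nitrogen[atomIndex]; }

    // Number of atoms flagged as sp2 nitrogen.
    int count() {
        int total = 0;
        for (boolean flag : sp2nitrogen) {
            if (flag) total++;
        }
        return total;
    }
}
```

Because the flags are indexed by atom position, they are only meaningful for the molecule they were computed from and only while its atom order stays unchanged; that is the extra bookkeeping the reply mentions as the downside of keeping data outside the model.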
From: Joerg K. W. <we...@in...> - 2004-05-21 08:34:12
|
Hi all,

so here is the first version of the 'combined' octet interface (oops, our server seems to be down, please try again in a few hours):
http://www-ra.informatik.uni-tuebingen.de/mitarb/wegner/tmp/octet/octetInterface.zip

TECHNICAL:
It has the following structure:

cdk
joelib
octet
octet4CDK
octet4JOELib
octetImplementations

CDK, JOELib and Octet should work on their own via 'ant compile'.
octet4CDK and octet4JOELib require Octet/CDK or Octet/JOELib; they catch on their own.
octetImplementations requires Octet/CDK/JOELib and contains new implementations for Octet. So this part of the project combines all efforts !!!

DESIGN:
I've added to octet/properties:

AssignmentFactory.java
DataFactory.java
Property.java
PropertyAcceptor.java
PropertyKey.java
PropertyKeyFactory.java
PropertyKeySet.java
PropertySet.java
PropertyVersion.java

So all objects which can accept properties can extend PropertyAcceptor. I've added it to Atom and Molecule, so the CDK interface will not work any more; the same goes for the basic implementations which are part of octetImplementations (I've moved them from octet). So far, the octet4JOELib part contains no functionality. I will also add/change the substructure parts to Octet, but not this week :-)

So, Rich AND Egon, if you agree to this structure, then we should add this first version to Octet-CVS; otherwise change all things you like and send me a link to download the changed version. (I hate CVS and its directory problem.)

Please read the API documentation so that I need not explain things twice in this e-mail, and please complain about anything you do not like in the current design. I've used a really widely usable Object storing mechanism, and am enforcing property-version methods.

Also, we should remove things like these in Atom:

countElectrons
getLabel

Because these are simply two Properties for an Atom, so we are working on attributed molecular graphs. To get the values, these definitions should be part of the PropertyKeyFactory, adding methods like

    public PropertyKey getElectronsKey();
    public PropertyKey getAtomLabelKey(); // what's meant by label, isn't
                                          // this too general here?

Kind regards, Joerg
|
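To make the "attributed molecular graph" idea in the message above concrete, here is a rough sketch of how keys handed out by a PropertyKeyFactory could replace dedicated accessors such as countElectrons and getLabel. The class names and the two key getters come from the message, but every body and signature below is an assumption; the actual classes in octetInterface.zip may look quite different, and PropertyAtom is a name made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed shape: a key is an opaque token identified by object identity.
final class PropertyKey {
    private final String name;

    PropertyKey(String name) { this.name = name; }

    String getName() { return name; }
}

// The factory hands out one shared key per well-known property.
final class PropertyKeyFactory {
    private final PropertyKey electrons = new PropertyKey("electrons");
    private final PropertyKey atomLabel = new PropertyKey("atomLabel");

    public PropertyKey getElectronsKey() { return electrons; }
    public PropertyKey getAtomLabelKey() { return atomLabel; }
}

// Assumed minimal contract for anything that can carry properties.
interface PropertyAcceptor {
    void setProperty(PropertyKey key, Object value);
    Object getProperty(PropertyKey key);
}

// Hypothetical atom whose attributes are all stored as properties.
final class PropertyAtom implements PropertyAcceptor {
    private final Map<PropertyKey, Object> properties = new HashMap<>();

    public void setProperty(PropertyKey key, Object value) {
        properties.put(key, value);
    }

    public Object getProperty(PropertyKey key) {
        return properties.get(key);
    }
}
```

In this sketch a key only matches itself, so two clients must obtain their keys from the same factory to see each other's values. Whether keys should instead compare by name is the kind of design question the thread leaves open.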
From: Joerg K. W. <we...@in...> - 2004-05-10 17:00:14
|
Hi all,

>>Are we (QSAR, CDK, JOELib, Octet, Jumbo) trying to do too much at the same
>>time?
> Maybe, but I think that things are going fine as they go now... we approach
> things step by step... I guess we are mostly just glueing existing tools
> together...

Maybe ... step by step ... and at first we need a common merged interface, before any concrete implementation helps us to improve the current design.

>>(3) Wouldn't it be even more useful if the project we're planning interacted
>>with a single "standard" Java API for accessing and manipulating Molecular
>>information?
>>(4) Yes it would,
> focus on chemical entities only... very difficult to make the 'single
> standard API'... Chemistry is too fuzzy, too broad...
> But this does not mean that we can define 'a standard Java API' which
> glues together a few existing projects...

Let's start with the 'glued' interface; if people plan to write their own implementation, they can do that. But first we must find a common interface... combining the currently available open source projects may become interesting at a later stage.

>>but such a thing doesn't exist! How can we ensure that
>>the new API will be general enough, robust, and useful?
> I don't think we can...

At the moment, I don't think we have time ... hey, these are open source projects, so in future we have the ability to refactor things ...

>>My point is this: would it be useful to tackle the problem of developing a
>>single standard Molecular API separately from the development of a QSAR
>>framework?
> Interesting, but I don't think we can easily come up with the solution to this
> problem... (if it was easy, it was already done...)

Correct. Of course refactoring is much easier than developing functionality, but there are still some really nasty problems out there. So I'm optimistic that we can iterate to a common interface and a common API, but this will need time ... it's still hard enough to maintain the currently available projects, because there are still some open performance problems or bad designs in them. And simply 'merging' the functionality is difficult, because it may demand a difficult refactoring. You surely know the current LinesOfCode:

ChemicalMarkupLanguage: 30285
CDK: 43772
JOELib: 63761
http://pmd.sourceforge.net/scoreboard.html

So, assuming that a good developer reads 1000 LOC/day and understands them and all the dependencies, he will need 30+44+64=138 days (4 1/2 months) to understand all the projects; then he can start with refactoring and testing, so ... hope you get paid for one year producing nothing :-) So are LOC a good measure for productivity? No, but ... that's another problem, and out of the QSAR project focus.

> Interesting, too... OpenBabel is struggling with atom types in file conversion
> (i.e., I think they still are...)... which indicates only part of the
> problems...

I've discussed this topic with Geoff, but as always ... there are some other things to do. We have exactly the same chemistry 'kernels', but this was checked 'by hand', because we have partially hard-coded assignment algorithms, so still suboptimal.

> Jakarta is a much simpler working area... all the results are artificial...
> that is, they don't have to match with nature... so they don't really care on
> how things should be interpreted, only that they work...

I agree ... chemoinformatics is still strongly connected to science, because we still need standards, which are in progress ... CML, 'expert systems', interfaces, ... Unfortunately, as already criticized by Kubinyi (or at least cited by him), the contribution of the pharmaceutical industry could be higher in helping to set a standard. So, refactoring does not help me to publish papers and does not help the pharmaceutical industry to reduce their data piles. Of course it can be helpful for the future, but financial pressure might be high for them and for us ... so who cares about a good hypothetical standard in the future which facilitates the maintenance? So let's work with shell scripts, they are fast and have an included copy protection, but that's unrealistic :-)

As already said by Egon ... let's iterate ... step by step ... nothing is excluded ... but also nothing should be included too early ...

Kind regards, Joerg
|
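The reading-time estimate in the message above is simple enough to check. The sketch below uses the three line counts quoted in the message and the assumed reading rate of 1000 LOC/day; ReadingTimeEstimate and its method names are invented for this example, and the 30-day month is an additional assumption. The exact total is 137818 lines, which rounds to the 138 days given in the message and comes to a little over four and a half months.

```java
// Back-of-the-envelope check of the estimate: total lines of code divided
// by an assumed reading rate gives the number of days needed.
final class ReadingTimeEstimate {

    // Sum the line counts of all projects.
    static long totalLines(long... linesPerProject) {
        long total = 0;
        for (long lines : linesPerProject) {
            total += lines;
        }
        return total;
    }

    // Days needed to read totalLines at linesPerDay.
    static double daysToRead(long totalLines, int linesPerDay) {
        return (double) totalLines / linesPerDay;
    }

    public static void main(String[] args) {
        // Line counts as quoted in the message: CML, CDK, JOELib.
        long total = totalLines(30285, 43772, 63761);
        double days = daysToRead(total, 1000);
        double months = days / 30.0; // assumes a 30-day month
        System.out.println(total + " LOC, " + Math.round(days) + " days, "
                + months + " months");
    }
}
```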
From: E.L. W. <eg...@sc...> - 2004-05-10 12:26:54
|
On Monday 10 May 2004 00:05, rich apodaca wrote:
> I almost don't want to bring this up because the discussion around the QSAR
> project is pretty involved as it is. But I can't resist...
>
> Egon, your comment about QSAR being a "meta project" hit home with me in a
> big way.
>
> The thought occurs:
>
> Are we (QSAR, CDK, JOELib, Octet, Jumbo) trying to do too much at the same
> time?

Maybe, but I think that things are going fine as they go now... we approach things step by step... I guess we are mostly just glueing existing tools together...

> Here's my impression of the line of discussion that led to where we are now
> (which I believe is a good place, by the way):
>
> (1) Wouldn't it be useful to have an open-source project devoted exclusively
> to QSAR with open implementations based on existing projects, a GUI, and
> which makes use of open-source data mining tools (such as weka)?
>
> (2) Yes it would.

Yes, that's a nice summary of the goal of qsar.sf.net :)

> (3) Wouldn't it be even more useful if the project we're planning interacted
> with a single "standard" Java API for accessing and manipulating Molecular
> information?
>
> (4) Yes it would,

Mmmm... people have tried that... there are some articles in which they focus on chemical entities only... very difficult to make the 'single standard API'... Chemistry is too fuzzy, too broad...

But this does not mean that we can define 'a standard Java API' which glues together a few existing projects...

> but such a thing doesn't exist! How can we ensure that
> the new API will be general enough, robust, and useful?

I don't think we can...

> How can we meet
> this objective AND minimize refactorings of existing cheminformatics
> projects to accommodate this new API?
>
> This is where we are now, in my view. The problem is, just tackling point
> (4) will be a very big job in itself.

Agreed. And I do not think we should make this our focus... I very much liked your suggestion of splitting up APIs which can be merged for some specific application...:

Very basic Atom API
3DRenderingAPI
2DRenderingAPI

> My point is this: would it be useful to tackle the problem of developing a
> single standard Molecular API separately from the development of a QSAR
> framework?

Interesting, but I don't think we can easily come up with the solution to this problem... (if it was easy, it was already done...)

> Would it be even more helpful to devote a separate project toward
> cheminformatics standardization and/or integration in general? This project
> could start off by trying to address our point (4), but could easily expand to
> deal with any number of standardization/integration issues currently
> plaguing cheminformatics research. The focus of the project needn't be
> Java-centric either, although it would probably start out that way.

Interesting, too... OpenBabel is struggling with atom types in file conversion (i.e., I think they still are...)... which indicates only part of the problems...

But, I think doing this for the QSAR field only reduces the problem size, and would make a very interesting test case...

> As a model for such an effort, how about the Apache Jakarta project
> (http://jakarta.apache.org/)? This project nicely ties together a lot of
> technologies and serves as an essential resource for experienced developers
> and newcomers alike. More importantly, experiences in one project often
> lead to new projects that address novel problems.
>
> Any thoughts?

Jakarta is a much simpler working area... all the results are artificial... that is, they don't have to match with nature... so they don't really care on how things should be interpreted, only that they work...

But, the resources that such a thing provides are applicable to our situation too... I'm hoping that the qsar.sf.net project can serve such a function to the QSAR field of science...

Egon

--
eg...@sc...
PhD on Molecular Representation in Chemometrics
Nijmegen University
http://www.cac.sci.kun.nl/people/egonw/
GPG: 1024D/D6336BA6
|
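One way to picture the suggestion above, small APIs that are merged only for a specific application, is through Java interface inheritance. The sketch is purely illustrative: none of these names (BasicAtom, Coordinates2D, Coordinates3D, DrawableAtom, SketchAtom, AtomCounter) exist in Octet, CDK or JOELib, and the methods are assumptions chosen to keep the example short.

```java
import java.util.List;

// A deliberately tiny core API: what every tool needs to know about an atom.
interface BasicAtom {
    String getSymbol();
}

// Optional capabilities, each defined separately from the core.
interface Coordinates2D {
    double getX();
    double getY();
}

interface Coordinates3D {
    double getX();
    double getY();
    double getZ();
}

// A 2-D renderer merges only the pieces it needs.
interface DrawableAtom extends BasicAtom, Coordinates2D {
}

// Hypothetical implementation used by a 2-D sketching tool.
final class SketchAtom implements DrawableAtom {
    private final String symbol;
    private final double x;
    private final double y;

    SketchAtom(String symbol, double x, double y) {
        this.symbol = symbol;
        this.x = x;
        this.y = y;
    }

    public String getSymbol() { return symbol; }
    public double getX() { return x; }
    public double getY() { return y; }
}

// A descriptor-style routine depends on the core API only, so it accepts
// atoms from any application, with or without coordinates.
final class AtomCounter {
    static int countSymbol(List<? extends BasicAtom> atoms, String symbol) {
        int count = 0;
        for (BasicAtom atom : atoms) {
            if (symbol.equals(atom.getSymbol())) {
                count++;
            }
        }
        return count;
    }
}
```

The design choice being illustrated is that code such as AtomCounter never sees the rendering interfaces, so adding or changing a 3-D API would not force changes on QSAR code written against the basic one.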
From: rich a. <che...@ya...> - 2004-05-09 22:05:53
|
Hello All,

I almost don't want to bring this up because the discussion around the QSAR project is pretty involved as it is. But I can't resist...

Egon, your comment about QSAR being a "meta project" hit home with me in a big way.

The thought occurs:

Are we (QSAR, CDK, JOELib, Octet, Jumbo) trying to do too much at the same time?

Here's my impression of the line of discussion that led to where we are now (which I believe is a good place, by the way):

(1) Wouldn't it be useful to have an open-source project devoted exclusively to QSAR with open implementations based on existing projects, a GUI, and which makes use of open-source data mining tools (such as weka)?

(2) Yes it would.

(3) Wouldn't it be even more useful if the project we're planning interacted with a single "standard" Java API for accessing and manipulating Molecular information?

(4) Yes it would, but such a thing doesn't exist! How can we ensure that the new API will be general enough, robust, and useful? How can we meet this objective AND minimize refactorings of existing cheminformatics projects to accommodate this new API?

This is where we are now, in my view. The problem is, just tackling point (4) will be a very big job in itself.

My point is this: would it be useful to tackle the problem of developing a single standard Molecular API separately from the development of a QSAR framework?

Would it be even more helpful to devote a separate project toward cheminformatics standardization and/or integration in general? This project could start off by trying to address our point (4), but could easily expand to deal with any number of standardization/integration issues currently plaguing cheminformatics research. The focus of the project needn't be Java-centric either, although it would probably start out that way.

As a model for such an effort, how about the Apache Jakarta project (http://jakarta.apache.org/)? This project nicely ties together a lot of technologies and serves as an essential resource for experienced developers and newcomers alike. More importantly, experiences in one project often lead to new projects that address novel problems.

Any thoughts?

cheers,
rich

Egon Willighagen <eg...@sc...> wrote:

The interfaces and the wrappers can be in Octet, but personally I prefer to do this in the common, implementation-neutral QSAR project... The compile scheme is identical; the only difference is where people get added as developer. I prefer this setup, because it more clearly shows that the QSAR part is sort of a meta project which tries to connect available OS tools for QSAR research. Please comment.
|
From: rich a. <che...@ya...> - 2004-05-06 06:22:34
|
Hello Joerg,

Welcome to the Octet Project. I have added you as a developer. I look forward to working with you!

I have some guidelines for development:

(1) All submitted code requires documentation of purpose and use. Javadoc comments on well-named short methods are preferred over comments contained within long methods.

(2) The complete unit test suite should be run, and should pass, immediately after checkout and before any commit ("ant clean; ant test").

(3) Any new functionality that is added requires a unit test that clearly demonstrates the feature works. This unit test will be contained in a subclass of OctetTestCase and added to net.sourceforge.octet.junit.CompleteTest.

(4) If any developer (including me) doesn't follow these guidelines, feel free to point that out and ask for compliance :-).

In addition, I'd like to ask that any changes you feel might be necessary to interfaces be discussed with me first (via oct...@li...). Because there will be some effort by ourselves and others to try to implement the Octet model-level interfaces, I would like to change these as little as possible in the short term and make sure these changes can be reviewed before being made.

My main focus on Octet going forward will be to solidify its implementations by handling boundary conditions, developing and using Exceptions, writing more detailed documentation, and providing more rigorous unit testing.

cheers,
rich
|