Hello All,
Egon, I realized that CDKMolecule would not work with MDLReader because I failed to override addBond(Bond). So I fixed that and now it works.
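To make the addBond point concrete, here is a minimal sketch of how a missed override can go wrong. It assumes a subclass that mirrors every change into a second model. Every name in it (PlainMolecule, BackingGraph, IncompleteAdapter, CompleteAdapter, AdapterDemo) is hypothetical and stands in for the real classes; none of it is CDK or CDKTools code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a toolkit's molecule class (not a CDK type).
class PlainMolecule {
    private final List<String> atoms = new ArrayList<>();
    private final List<int[]> bonds = new ArrayList<>();

    public void addAtom(String symbol) { atoms.add(symbol); }
    public void addBond(int from, int to) { bonds.add(new int[] {from, to}); }
    public int getAtomCount() { return atoms.size(); }
    public int getBondCount() { return bonds.size(); }
}

// Hypothetical second model that an adapter subclass must keep in sync.
class BackingGraph {
    int nodes;
    int edges;
    void addNode() { nodes++; }
    void addEdge() { edges++; }
}

// Overrides addAtom only, so bonds added by a reader never reach the graph.
class IncompleteAdapter extends PlainMolecule {
    final BackingGraph graph = new BackingGraph();

    @Override
    public void addAtom(String symbol) { super.addAtom(symbol); graph.addNode(); }
}

// Overrides both mutators, so the graph sees atoms and bonds alike.
class CompleteAdapter extends PlainMolecule {
    final BackingGraph graph = new BackingGraph();

    @Override
    public void addAtom(String symbol) { super.addAtom(symbol); graph.addNode(); }

    @Override
    public void addBond(int from, int to) { super.addBond(from, to); graph.addEdge(); }
}

class AdapterDemo {
    // Plays the role of a reader that only knows the base type.
    static void fillEthanol(PlainMolecule m) {
        m.addAtom("C");
        m.addAtom("C");
        m.addAtom("O");
        m.addBond(0, 1);
        m.addBond(1, 2);
    }

    public static void main(String[] args) {
        IncompleteAdapter broken = new IncompleteAdapter();
        CompleteAdapter fixed = new CompleteAdapter();
        fillEthanol(broken);
        fillEthanol(fixed);
        System.out.println("edges without override: " + broken.graph.edges);
        System.out.println("edges with override:    " + fixed.graph.edges);
    }
}
```

The reader-like method talks only to the base type, so any mutator the subclass forgets to override silently bypasses the second model.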
The changes are found in CDKTools 0.2.1, which can be downloaded at:
Javadoc is at:
In addition, I added a unit test (MolfileTest.java) that loads each of a series of molfiles in three ways: (1) a CDKMolecule using MDLReader; (2) an org.openscience.cdk.Molecule using MDLReader; and (3) a CDKMolecule using Octet's MolfileReader and CDKMoleculeBuilder.
The isomorphism of molecule pairs (1)-(3) and (2)-(3) is verified with CDK's UniversalIsomorphismTester. Everything passes. I took the molfiles from CDK's "data" directory.
I think this little unit test is also a good demonstration of how to actually get Octet and CDK to work together. I like your idea of either posting this code or code like it on the QSAR site.
BTW, did you update all CDK io classes analogously to MDLReader? If not, would it help to put this new behavior into DefaultChemObjectReader? Or maybe add an explicit method like read(Molecule molecule)?
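As a rough illustration of what an explicit read(Molecule molecule) method could look like, here is a sketch of a reader that fills a caller-supplied object and returns it with its original type. The names SketchMolecule, TaggedMolecule, SketchReader and SymbolListReader are all hypothetical, and the whitespace-separated "format" is invented for the example; this is not the CDK io API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a toolkit molecule (not a CDK type).
class SketchMolecule {
    final List<String> atoms = new ArrayList<>();
    void addAtom(String symbol) { atoms.add(symbol); }
}

// Hypothetical custom subclass that a caller wants the reader to populate.
class TaggedMolecule extends SketchMolecule {
    final String tag = "custom";
}

// Hypothetical reader contract: fill the object passed in and hand it back.
interface SketchReader {
    <M extends SketchMolecule> M read(M target);
}

// Toy reader over a whitespace-separated list of element symbols.
class SymbolListReader implements SketchReader {
    private final String input;

    SymbolListReader(String input) { this.input = input; }

    @Override
    public <M extends SketchMolecule> M read(M target) {
        for (String symbol : input.trim().split("\\s+")) {
            if (!symbol.isEmpty()) {
                target.addAtom(symbol);
            }
        }
        return target;
    }
}

class ReaderDemo {
    public static void main(String[] args) {
        // The caller chooses the concrete class; the reader never instantiates one.
        TaggedMolecule m = new SymbolListReader("C C O").read(new TaggedMolecule());
        System.out.println(m.atoms + " " + m.tag);
    }
}
```

Because the caller constructs the target, no cast is needed on the way out and the reader does not need to know about subclasses.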
Regarding where we go from here....
I was interested in your thoughts on the dict idea. But I must confess, I still don't know what a dict is. It looks like it has something to do with defining molecular descriptors. But I'm struggling to understand how one would use it in QSAR.
Nevertheless, I think specifying the components and behaviors that go into descriptors is the next logical step. I'm especially interested in drafting some kind of specification outlining the requirements for such a system - maybe using RFE. And of course, I'm keenly interested in reducing this spec to a set of Java interfaces defined in terms of QSAR model-level objects. Octet functionality will probably need to be developed to fill in the gaps, which I'm happy to do.
If it is at all possible to take a composite approach where descriptors or descriptor components could easily be combined to make new descriptors, I think this could be useful.
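One way the composite idea might look in Java is sketched below, purely as a starting point for discussion. It assumes a descriptor is just a named function from a molecule to a number. AtomList, SketchDescriptor, ElementCount and CombinedDescriptor are hypothetical names made up for this example, not existing Octet, CDK or QSAR interfaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.DoubleBinaryOperator;

// Hypothetical minimal molecule: just a list of element symbols.
final class AtomList {
    final List<String> symbols;
    AtomList(List<String> symbols) { this.symbols = symbols; }
}

// Hypothetical descriptor contract: a name plus a numeric value per molecule.
interface SketchDescriptor {
    String name();
    double calculate(AtomList molecule);
}

// Leaf descriptor: counts the atoms of one element (e.g. carbon).
final class ElementCount implements SketchDescriptor {
    private final String element;

    ElementCount(String element) { this.element = element; }

    public String name() { return element + " count"; }

    public double calculate(AtomList molecule) {
        int count = 0;
        for (String symbol : molecule.symbols) {
            if (symbol.equals(element)) {
                count++;
            }
        }
        return count;
    }
}

// Composite descriptor: combines two descriptors with a binary operation.
final class CombinedDescriptor implements SketchDescriptor {
    private final String name;
    private final SketchDescriptor left;
    private final SketchDescriptor right;
    private final DoubleBinaryOperator combiner;

    CombinedDescriptor(String name, SketchDescriptor left, SketchDescriptor right,
                       DoubleBinaryOperator combiner) {
        this.name = name;
        this.left = left;
        this.right = right;
        this.combiner = combiner;
    }

    public String name() { return name; }

    public double calculate(AtomList molecule) {
        return combiner.applyAsDouble(left.calculate(molecule), right.calculate(molecule));
    }
}

class CompositeDescriptorDemo {
    public static void main(String[] args) {
        AtomList ethanolHeavyAtoms = new AtomList(Arrays.asList("C", "C", "O"));
        SketchDescriptor ratio = new CombinedDescriptor(
                "C/O ratio", new ElementCount("C"), new ElementCount("O"), (c, o) -> c / o);
        System.out.println(ratio.name() + " = " + ratio.calculate(ethanolHeavyAtoms));
    }
}
```

Since a combined descriptor is itself a SketchDescriptor, it can be fed into another combination, which is the property that would let new descriptors be built from existing ones.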
But my expertise in the area of molecular descriptors is about the same as Matt Foley's expertise as a motivational speaker. So - I'd like to get some perspectives on this from developers, potential developers, or amused onlookers.

"E.L. Willighagen" <egonw@sci.kun.nl> wrote:

On Tuesday 29 June 2004 17:32, E.L. Willighagen wrote:
> We've spoken about setting up a descriptor dictionary, and an example of
> such a dictionary is on our website:
> http://qsar.sourceforge.net/dicts.html
> Let's consider the association in that dict as a descriptor, the config
> file could look like:

To be a bit more explicit, I'm thinking things like

from a dict like

id="constitution" title="Constitution based Descriptors">

List of descriptors derived from the constitution of molecules.
Or whatever.

The number of carbon atoms in the molecule.

This descriptor was originally proposed by J.Doe in Some.J., 1896.
He showed that it had good correlation with the boiling point of

--
PhD on Molecular Representation in Chemometrics
Nijmegen University
GPG: 1024D/D6336BA6

"Again a chemist did something useful with a computer"


Qsar-devel mailing list