From: rich a. <che...@ya...> - 2005-03-18 16:35:07
Hi all,

It was great meeting some of you in San Diego. It's inspired me to get to work on CML support in Octet (http://octet.sf.net). I've been looking at how CDK implements support for CML, and I have some questions about the org.openscience.cdk.io.cml package, although some of them may be better addressed to the CML community. (Feel free to correct any of my misinterpretations below.)

I was surprised to see that common file formats/applications (e.g. molfile, pdb) and others that are more specialized (jmol) each have their own unique CML conventions. My impression was that CML was a universal molecular information exchange format: write one DocumentHandler in SAX and be done. But it looks like the picture is not that simple?

From Egon's article at http://www.openscience.org/~egonw/cml/cml_conventions.html I gather that there are a number of ways to encode molecular information in CML. For example, the meaning of bond order = 4 will vary. In some ontologies it means an aromatic bond, in others a quadruple bond, and other ontologies may have yet another interpretation. To be able to assign meaning to bond order = 4, a CML convention needs to be specified in the CML document itself as an aid to the parser.

It looks like CDK's approach is to use the Strategy Pattern. The core strategy is CMLCoreModule, which dynamically delegates to one of a few possible concrete ConventionInterface implementations as the need arises. This approach, if I am understanding it correctly, makes sense from the perspective of software development.

But I'm concerned about a larger issue: doesn't this essentially leave us in the same position we were in before CML? Sure, we can take advantage of the semi-automated file parsing available through SAX or DOM, but the core issue of multiple, often incompatible interpretations of the data in these files remains, doesn't it?
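To make the delegation idea concrete, here is a minimal sketch of convention-based dispatch in a SAX handler. It is written in Python for brevity and is not CDK's code. The convention names (`example:aromatic4`, `example:quadruple4`), the function and class names, and the simplified CML fragment are all assumptions made up for illustration. How a reader should treat an unrecognized convention is a design choice; this sketch simply records the raw value as uninterpreted.

```python
import xml.sax


def aromatic_reading(order):
    """Hypothetical convention A: order "4" denotes an aromatic bond."""
    return "aromatic" if order == "4" else order


def quadruple_reading(order):
    """Hypothetical convention B: order "4" denotes a quadruple bond."""
    return "quadruple" if order == "4" else order


# Registry of strategies keyed by convention name (names are invented).
CONVENTIONS = {
    "example:aromatic4": aromatic_reading,
    "example:quadruple4": quadruple_reading,
}


class BondOrderHandler(xml.sax.ContentHandler):
    """Picks an interpretation strategy from the molecule's convention."""

    def __init__(self):
        super().__init__()
        self.interpret = None          # current strategy, if any
        self.bonds = []                # (status, value) pairs
        self.unknown_conventions = []  # conventions with no strategy

    def startElement(self, name, attrs):
        if name == "molecule":
            convention = attrs.get("convention")
            self.interpret = CONVENTIONS.get(convention)
            if self.interpret is None:
                self.unknown_conventions.append(convention)
        elif name == "bond":
            order = attrs.get("order")
            if self.interpret is None:
                # No strategy available: keep the raw value, flagged.
                self.bonds.append(("uninterpreted", order))
            else:
                self.bonds.append(("ok", self.interpret(order)))


def read_bond_orders(cml_text):
    """Parse a CML-like string; return (bonds, unknown_conventions)."""
    handler = BondOrderHandler()
    xml.sax.parseString(cml_text.encode("utf-8"), handler)
    return handler.bonds, handler.unknown_conventions


# Simplified, hypothetical CML fragment with a single bond of order 4.
TEMPLATE = (
    '<molecule convention="{}"><bondArray>'
    '<bond atomRefs2="a1 a2" order="4"/>'
    '</bondArray></molecule>'
)

if __name__ == "__main__":
    for name in ("example:aromatic4", "example:quadruple4", "example:other"):
        print(name, read_bond_orders(TEMPLATE.format(name)))
```

The same bond element yields three different results depending only on the declared convention, and the third case leaves the order uninterpreted.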

What happens when CDK encounters a CML processing convention for which no ConventionInterface implementation exists? Isn't this the same as having a file format without a parser - or a file format with a partial parser?

thanks for your help,
rich