I'd say that the best data-model is as close to reality as possible (at least as far as quantum mechanics currently understands 'reality').
Quantum-mechanically speaking, the closest approach would be to keep all ring bonds as IBond.Order.SINGLE (the sigma bonds) and then add a single 'meta-bond' for the pi system, containing all the ring atoms and the aromatic electrons. This meta-bond then gets the ISAROMATIC flag. So for benzene: 6 atoms, 7 bonds: 6 conventional single bonds + 1 'pi system meta-bond' containing all 6 ring atoms and 6 electrons.
This is the closest I can think of to actually representing electron delocalization in the current CDK model. The advantage is that it is by far the most flexible and generic solution; the disadvantage is that it means reworking every single piece of CDK aromaticity code. Ouch..
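To make the benzene example concrete, here is a minimal sketch of the meta-bond idea. The types SketchAtom, SketchBond and PiSystemSketch are hypothetical stand-ins invented for illustration, not CDK classes, and the 'aromatic' field only plays the role of the ISAROMATIC flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in types for illustration; these are NOT CDK classes.
final class SketchAtom {
    final String symbol;
    SketchAtom(String symbol) { this.symbol = symbol; }
}

final class SketchBond {
    final List<SketchAtom> atoms;  // two atoms for a sigma bond, n atoms for a pi system
    final int electronCount;       // electrons shared by all atoms in this bond
    final boolean aromatic;        // plays the role of the ISAROMATIC flag

    SketchBond(List<SketchAtom> atoms, int electronCount, boolean aromatic) {
        this.atoms = Collections.unmodifiableList(new ArrayList<>(atoms));
        this.electronCount = electronCount;
        this.aromatic = aromatic;
    }
}

final class PiSystemSketch {
    // Builds benzene as 6 sigma bonds plus one 6-atom, 6-electron pi-system meta-bond.
    static List<SketchBond> benzeneBonds() {
        List<SketchAtom> ring = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            ring.add(new SketchAtom("C"));
        }
        List<SketchBond> bonds = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            // Each sigma bond joins neighbouring ring atoms and holds 2 electrons.
            bonds.add(new SketchBond(Arrays.asList(ring.get(i), ring.get((i + 1) % 6)), 2, false));
        }
        // The meta-bond spans the whole ring and carries the aromatic flag.
        bonds.add(new SketchBond(ring, 6, true));
        return bonds;
    }
}
```

In this sketch the aromaticity lives in exactly one place (the meta-bond), so the six sigma bonds never need a flag of their own.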
Reality-wise, I think the next best compromise would be the IBond.Order.SINGLE_OR_DOUBLE solution, i.e. a bond order of 1.5, which could equally well be termed IBond.Order.AROMATIC or IBond.Order.ONE_PLUS_HALF.
We may need an extra IBond.Order.TWO_PLUS_HALF if we allow for triple bonds in an aromatic system. I'm not sure this is practically possible unless you have rings of ridiculous size to overcome the angle strain.
Of course this scheme breaks down when (d-orbital-using) metals become involved, such as in ferrocene, or when representing inorganic compounds/crystals (e.g. metal alloys). I believe this would potentially require a long list of THREE_PLUS_HALF, FOUR_PLUS_HALF etc. bond orders, but I'm not sure of that..
(The multi-atom-multi-electron 'pi-system-bond' scheme would easily accommodate these cases.)
As for some IO-formats not supporting aromaticity but only Kekulé notation:
I think a crippled legacy disk format (e.g. .MOL), even if it is still widely used, shouldn't limit CDK functionality or its internal data model.
In the worst case the affected writers will need an extra call to a 'convertAromaticToKekule()' method, paired with a 'convertKekuleToAromatic()' in the readers. This doesn't sound like a problem to me as the conversion is pretty straightforward.
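As a rough sketch of what such a converter pair could look like, here is the simplest possible case: one isolated ring whose bonds are listed in ring order. RingBondOrder and KekuleSketch are hypothetical names, not CDK's IBond.Order or any existing CDK class, and the reverse conversion assumes every ring atom contributes exactly one pi electron (an all-carbon ring). Fused ring systems and heteroatoms are out of scope here and would need more than simple alternation.

```java
// Hypothetical bond-order enum for illustration; not CDK's IBond.Order.
enum RingBondOrder { SINGLE, DOUBLE, AROMATIC }

final class KekuleSketch {
    // Writers: turn one isolated, all-aromatic ring into alternating double/single bonds.
    static RingBondOrder[] convertAromaticToKekule(RingBondOrder[] ring) {
        if (ring.length % 2 != 0) {
            throw new IllegalArgumentException("sketch only handles even-membered rings");
        }
        RingBondOrder[] kekule = new RingBondOrder[ring.length];
        for (int i = 0; i < ring.length; i++) {
            if (ring[i] != RingBondOrder.AROMATIC) {
                throw new IllegalArgumentException("bond " + i + " is not aromatic");
            }
            kekule[i] = (i % 2 == 0) ? RingBondOrder.DOUBLE : RingBondOrder.SINGLE;
        }
        return kekule;
    }

    // Readers: accept a strictly alternating ring with 4k+2 members and mark it aromatic.
    // Assumes one pi electron per ring atom, so ring size equals the pi-electron count.
    static RingBondOrder[] convertKekuleToAromatic(RingBondOrder[] ring) {
        int n = ring.length;
        if (n % 4 != 2) {
            throw new IllegalArgumentException("ring size does not satisfy 4k+2");
        }
        for (int i = 0; i < n; i++) {
            RingBondOrder next = ring[(i + 1) % n];
            if (ring[i] == RingBondOrder.AROMATIC || ring[i] == next) {
                throw new IllegalArgumentException("ring is not strictly alternating at bond " + i);
            }
        }
        RingBondOrder[] aromatic = new RingBondOrder[n];
        java.util.Arrays.fill(aromatic, RingBondOrder.AROMATIC);
        return aromatic;
    }
}
```

A writer for a Kekulé-only format would call the first method just before output, and the matching reader would call the second one right after parsing.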
The 'problem' in the current flag-using data model is that it allows for corrupt data. You could build a seven-membered ring with only single bonds, set the ISAROMATIC flag for all of them, and no-one would notice. Of course the current handling of the infamous c1ccccc1 by JChemPaint is another nice example: it's aromatic, but you don't see it unless you manually count implicit hydrogens.
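To illustrate the kind of check the flag model never enforces, here is a small sketch that tests a single ring against Hückel's 4n+2 rule. AromaticFlagCheck is a hypothetical helper, not part of CDK. It assumes bond orders are given as plain integers for a structure with Kekulé bond orders already assigned, and that the caller supplies any extra pi electrons (lone pairs, charges) itself, so it would wrongly reject a ring that was parsed from aromatic SMILES and deliberately left as SINGLE plus flag.

```java
// Hypothetical consistency check for illustration; not part of CDK.
final class AromaticFlagCheck {
    // Hueckel's rule: an aromatic ring has 4n+2 pi electrons (2, 6, 10, ...).
    static boolean satisfiesHueckel(int piElectrons) {
        return piElectrons >= 2 && piElectrons % 4 == 2;
    }

    // bondOrders[i] is 1, 2 or 3 for ring bond i; aromaticFlags[i] mirrors the per-bond flag.
    // extraPiElectrons covers lone pairs or charges that the caller has counted separately.
    static boolean isConsistent(int[] bondOrders, boolean[] aromaticFlags, int extraPiElectrons) {
        if (bondOrders.length != aromaticFlags.length) {
            throw new IllegalArgumentException("one flag per bond is required");
        }
        int flagged = 0;
        int piElectrons = extraPiElectrons;
        for (int i = 0; i < bondOrders.length; i++) {
            if (aromaticFlags[i]) {
                flagged++;
            }
            if (bondOrders[i] == 2) {
                piElectrons += 2;  // each double bond contributes one pi pair
            }
        }
        if (flagged == 0) {
            return true;           // nothing claims to be aromatic, nothing to verify
        }
        if (flagged != bondOrders.length) {
            return false;          // a single ring cannot be only partly aromatic
        }
        return satisfiesHueckel(piElectrons);
    }
}
```

With this check, the seven-membered all-single-bond ring with every flag set is reported as inconsistent, while Kekulé benzene with all flags set passes.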
I think a return of the 1.5 bond order would be the best practical solution for CDK v1.4, but I would like to suggest a move to the pi-system meta-bonds for CDK v2.0.
On 11 August 2010 17:19, Egon Willighagen <email@example.com> wrote:
> I think you might be right...
> On Wed, Aug 11, 2010 at 4:29 PM, Nina Jeliazkova
> <firstname.lastname@example.org> wrote:
>> Consider a user entering aromatic SMILES, readers will parse it into
>> the new single or double type of bond; even if all processing code
>> works fine, it will still be impossible to write the structure back in
>> a e.g. MOL file, unless bonds are assigned either single or double
> I was thinking of perhaps just using UNKNOWN instead... which we
> currently have as 'null'...
>> Getting back the aromatic bond type might be an easier workaround.
> I remember I was quite happy to see the aromatic bond type gone... we
> have flags for this, allowing us to actually have SINGLE *and*
> ISAROMATIC... with an IBond.Order.AROMATIC that would not be possible.
> Dr E.L. Willighagen
> Post-doc @ Uppsala University (only until 2010-09-30)
> Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: http://www.citeulike.org/user/egonw/tag/papers