Hi Adrian,
On Tue, Apr 8, 2008 at 5:06 PM, Adrian Schreyer <am...@ca...> wrote:
> I was wondering what would be the best way in RDKit to represent
> molecules with a metal complex in SMILES, for instance heme.
> Apparently different programs use different approaches, CACTVS uses a
> non-standard notation
> CC1=C(CCC(O)=O)C2=N3|[Fe]45|N6=C(C=c7n4c(=C2)c(CCC(O)=O)c7C)C(=C(C=C)C6=Cc8n5c(C=C13)c(C=C)c8C)C,
> some represent coordinate bonds as standard bonds and others use a
> disconnected notation
> CC1=C(C2=CC3=NC(=CC4=C(C(=C([N-]4)C=C5C(=C(C(=N5)C=C1[N-]2)C=C)C)C)CCC(=O)O)C(=C3C)CCC(=O)O)C=C.[Fe+2].
I'm afraid I don't have a good answer for you. If you just want to be
able to read the molecules in and do things like calculate MW, then
I'd probably go with the dot-disconnected form. Anything else
just isn't going to work particularly well.
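
As a rough illustration of the dot-disconnected route, here is a minimal
sketch using the RDKit Python wrappers. It assumes the SMILES quoted above
(minus the trailing sentence period) parses and sanitizes cleanly in your
RDKit build; the variable names are just for illustration.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

# Dot-disconnected heme SMILES from the message above; the trailing "."
# in the quoted text is sentence punctuation, not part of the SMILES.
HEME_SMILES = (
    "CC1=C(C2=CC3=NC(=CC4=C(C(=C([N-]4)C=C5C(=C(C(=N5)C=C1[N-]2)C=C)C)C)"
    "CCC(=O)O)C(=C3C)CCC(=O)O)C=C.[Fe+2]"
)

mol = Chem.MolFromSmiles(HEME_SMILES)
if mol is None:
    raise ValueError("RDKit could not parse the dot-disconnected SMILES")

# Whole-molecule properties include the metal ion, because it is simply
# another (unbonded) atom in the same molecule object.
mol_wt = Descriptors.MolWt(mol)
net_charge = Chem.GetFormalCharge(mol)

# The ligand and the metal come back as separate fragments, since no
# coordinate bonds are represented in this notation.
fragments = Chem.GetMolFrags(mol, asMols=True)
fragment_smiles = [Chem.MolToSmiles(frag) for frag in fragments]

print("MW:", round(mol_wt, 2))
print("net formal charge:", net_charge)
print("fragments:", fragment_smiles)
```

The trade-off is that the Fe-N coordination is not encoded at all, so
anything that depends on connectivity around the metal will not see it.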
> Will there be any openSMILES standard for this?
There was a good proposal from Peter Ertl on the OpenSMILES mailing
list for an extension to support organometallic bonding:
http://sourceforge.net/mailarchive/forum.php?thread_name=5191c3b80710060104h8d16a03j66a3d2b4ee3f9585%40mail.gmail.com&forum_name=blueobelisk-smiles
But it's an extension, so I guess that it will have to wait until
after OpenSMILES itself is finalized, and who knows when that will
happen.
If we can get a couple of people to agree on what should be done, I'd
be happy to see an organometallic SMILES extension added to the RDKit
(probably as a separate parser that handles the extended SMILES), but
I'm unlikely to have the time to do it anytime soon. This is one of
those wonderful opportunities for the open-source model to function:
someone who needs this stuff and knows C++ can step forward and add
it. It's not a particularly easy addition (some core data structures may
have to be modified), so I'm not holding my breath. :-)
-greg