From: Andrew D. <da...@da...> - 2009-02-20 10:25:11
On Feb 20, 2009, at 10:57 AM, ma...@eb... wrote:
> Thanks Andrew and Chris. I'm still a bit puzzled - first of all I think
> the rings in 35623 are aromatic (I just checked the rings' bond
> aromaticity flags using that Java program I attached before).

They could be. I only checked the SD file, and I'm not as experienced
with SD files as I am with SMILES. But the query structure was
definitely not aromatic.

> Secondly Chris, if you draw that query in Pubchem for substructure query
> searching, you get 1625 hits of which many look to me suspiciously like
> 35623..

That's a different question. Now you're asking "why is the CDK
definition of substructure different than the PubChem definition?"

Here's your query as a SMILES, generated via PubChem's sketcher.

  C1C2C(CCC1)CC4C3C2CCCC3CCN4

Here's an example matched target, CID 2215

  http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=2215&loc=ec_rcs

It has the SMILES

  CN1CCC2=CC=CC3=C2C1CC4=C3C(=C(C=C4)O)O

Clearly the latter contains many double bonds while the former
contains only single bonds.

This is a guess: PubChem might treat single, double, and aromatic
bonds as the same if they are between carbons in a ring.

For example, I did a search for this

  C1=CC=C=C=CC1

which is a 7-membered ring containing 3 double bonds in a row
(C=C=C=C). One of the PubChem matches was

  http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=10330236&loc=ec_rcs

SMILES

  C1C=CC=CC2=C1C3=CC=CC=CC3=N2

Looking at the structure, there's no place with 3 connected double
bonds. Or even two, for that matter.

I suspect this is a PubChem tweak, because aromaticity is ambiguously
defined and people sketching a structure might not be sensitive to
the problems that can arise.

    Andrew
    da...@da...
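
For what it's worth, the double-bond observations above can be checked
mechanically. The following is only a minimal sketch in plain Python, not
how CDK or PubChem do it: the helper names are made up for illustration,
and the parser handles just the small SMILES subset used in this message
(single-letter uppercase atoms, '=' bonds, parentheses, single-digit ring
closures). It counts the double bonds on each atom, so an atom with two or
more means cumulated double bonds such as C=C=C.

```python
def double_bonds_per_atom(smiles):
    """Return a list holding the number of double bonds on each atom.

    Hypothetical helper for a tiny SMILES subset only: no aromatic
    (lowercase) atoms, no brackets, no two-letter elements, no '%nn' rings.
    """
    counts = []       # counts[i] = number of double bonds on atom i
    prev = None       # index of the atom the next atom will bond to
    stack = []        # branch points saved at each '('
    pending = False   # True when a '=' has just been read
    open_rings = {}   # ring digit -> (atom index, opened with '=')
    for ch in smiles:
        if ch == "=":
            pending = True
        elif ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch.isdigit():
            if ch in open_rings:
                # Ring closure: bond between prev and the opening atom.
                other, was_double = open_rings.pop(ch)
                if pending or was_double:
                    counts[prev] += 1
                    counts[other] += 1
            else:
                open_rings[ch] = (prev, pending)
            pending = False
        elif ch.isalpha() and ch.isupper():
            counts.append(0)
            idx = len(counts) - 1
            if prev is not None and pending:
                counts[prev] += 1
                counts[idx] += 1
            pending = False
            prev = idx
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return counts


def has_cumulated_double_bonds(smiles):
    """True if any atom carries two or more double bonds (e.g. C=C=C)."""
    return any(n >= 2 for n in double_bonds_per_atom(smiles))


query = "C1=CC=C=C=CC1"
match = "C1C=CC=CC2=C1C3=CC=CC=CC3=N2"
print(has_cumulated_double_bonds(query))   # query has C=C=C=C
print(has_cumulated_double_bonds(match))   # PubChem hit does not
```

Run on the 7-membered ring query and the PubChem hit, this reports
cumulated double bonds in the former and none in the latter, which is
consistent with the guess that PubChem is not comparing bond orders
strictly for ring carbons.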