From: Craig A. J. <cj...@em...> - 2009-08-07 16:32:46
Geoffrey Hutchison wrote:
> What you're saying is that this Kekule SMILES when used as a SMARTS
> doesn't match. That's fair enough -- that the SMARTS should be
> transformed.
>
> I'll file this as a bug.
>
> Thanks
> -Geoff
>
>>>> OC1=C2C=CC(=C)N=C2CCC1
>>> Sure. Take a look at Daylight Depict:
>>> http://www.daylight.com/daycgi/depict?4f43313d4332433d4343283d43294e3d433243434331
>>>
>>> The two SMILES have the same depiction (i.e., the ring is
>>> aromatic). In SMILES, an exocyclic double bond does not break
>>> aromaticity.
>> Well, OK.
>> But if I want to search OC1=C2C=CC(=C)N=C2CCC1 with obgrep in my
>> library, it doesn't match. I must use the aromatic notation.
>> I wonder if perhaps obgrep could run the "SMILES aromaticity detection
>> algorithm" on the smiles/smarts string before searching matching
>> molecules? (Just like the searched library, which contains
>> OC1=C2C=CC(=C)N=C2CCC1 substructures, is converted to aromatic).

This is not a bug, it's just the way SMARTS is defined. A SMARTS that
uses a Kekule form will never match anything; you have to use the
aromatic form.

Some applications will try to parse a SMARTS as a SMILES, and if it
works, convert it to the aromatic form and re-parse it as a SMARTS.
But that's often impossible, for example, what do you do with
"C1=[N,C]C=CC=C1"? Better to require the user to type in
"c1[n,c]cccc1", which is clear.

Craig
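
To illustrate why the "parse it as SMILES first" trick only works some of the time, here is a rough Python sketch of the first step: deciding whether a pattern could even be read as a plain SMILES. It is only a heuristic, not a parser. The function name `can_try_as_smiles` and the character set are assumptions made for this example; they are not part of Open Babel or any other toolkit, and the check ignores other SMARTS-only syntax such as `[#6]` or ring-membership primitives.

```python
# SMARTS logical operators (",", ";", "!", "&") and the "any bond" symbol "~"
# have no meaning in plain SMILES. If a pattern contains one of them, it does
# not describe a single molecule, so there is nothing to aromatize.
# Assumption: this character set is illustrative and deliberately incomplete.
SMARTS_ONLY_CHARS = set(",;!&~")


def can_try_as_smiles(pattern: str) -> bool:
    """Return True if the pattern contains none of the SMARTS-only characters.

    A True result only means a SMILES parse is worth attempting; a real
    application would still have to hand the string to a SMILES parser.
    """
    return not any(ch in SMARTS_ONLY_CHARS for ch in pattern)


if __name__ == "__main__":
    for pattern in ("OC1=C2C=CC(=C)N=C2CCC1", "C1=[N,C]C=CC=C1"):
        if can_try_as_smiles(pattern):
            print(pattern, "-> could attempt the SMILES round trip")
        else:
            print(pattern, "-> ask the user for an explicit aromatic SMARTS")
```

The Kekule query from this thread passes the check, so an application could try to aromatize it. The pattern with the `[N,C]` atom list does not, which is the case where the user has to write the aromatic form by hand.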