Hi Rajarshi,

This is indeed strange, and it made me suspect that the aromatic flags are set by the SMILES parser, not by the aromaticity detector. That turns out to be the case. I rewrote your test into a simpler one:

   public void testAromaticitySmiles() throws Exception {

        String[] smiles = {
                "c1ccc2c(c1)c1c(C2)cccc1NC(C)=O",
                "c1cc2ccc3c4c2c(c1)ccc4ccc3",
                "c1ccc2c(c1)Cc1c2ccc(c1)NC(=O)C",
                "c1c2ccc3cccc4ccc(c5c1cccc5)c2c34",
                "C1c2c3c(C1)c(ccc3cc1c2ccc2c1cccc2)C",
                "c1ccc2c(c1)c1c(C2)cc(cc1)NC(=O)C(F)(F)F",
                "C1c2c(c3c1c(ccc3)N(C(C)=O)C(C)=O)cccc2",
                "c1c2ccc3c(c2cc2c1c1c(cc2)cccc1)cccc3",
                "C1c2c(c3c1cc(cc3)N(C(C)=O)O)cccc2",
                "c1ccc2c(c1)C(c1c3c2ccc2c3c(cc1)c1c(C2=O)cccc1)=O",
                "CC(=O)Nc1c2Cc3ccccc3c2ccc1",
                "Cc1c2ccccc2c(c2ccc3c(c12)cccc3)C"
        };
        SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());

        for (String smile : smiles) {
            IAtomContainer mol = sp.parseSmiles(smile);
            // Count the atoms that already carry the aromatic flag after parsing.
            int aromatic = 0;
            for (int i = 0; i < mol.getAtomCount(); i++) {
                if (mol.getAtom(i).getFlag(CDKConstants.ISAROMATIC)) {
                    aromatic++;
                }
            }
            System.out.print(aromatic + "\t");
            System.out.println(smile);
        }

    }

and the output is the same as yours, even without running the aromaticity detector (or AromaticAtomsCountDescriptor):

12    c1ccc2c(c1)c1c(C2)cccc1NC(C)=O
16    c1cc2ccc3c4c2c(c1)ccc4ccc3
12    c1ccc2c(c1)Cc1c2ccc(c1)NC(=O)C
20    c1c2ccc3cccc4ccc(c5c1cccc5)c2c34
18    C1c2c3c(C1)c(ccc3cc1c2ccc2c1cccc2)C
12    c1ccc2c(c1)c1c(C2)cc(cc1)NC(=O)C(F)(F)F
12    C1c2c(c3c1c(ccc3)N(C(C)=O)C(C)=O)cccc2
22    c1c2ccc3c(c2cc2c1c1c(cc2)cccc1)cccc3
12    C1c2c(c3c1cc(cc3)N(C(C)=O)O)cccc2
22    c1ccc2c(c1)C(c1c3c2ccc2c3c(cc1)c1c(C2=O)cccc1)=O
12    CC(=O)Nc1c2Cc3ccccc3c2ccc1
18    Cc1c2ccccc2c(c2ccc3c(c12)cccc3)C
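
A quick way to see what the markup alone implies is to count the atoms written in lower case in each SMILES string. The following is only a rough sketch, not CDK code: the class and method names are made up for illustration, and it assumes that only b, c, n, o, p and s occur as aromatic symbols outside brackets. If the flags come straight from the parser, these counts should match the numbers above.

```java
class AromaticMarkupCount {

    // Aromatic symbols allowed outside brackets (assumption: organic subset only).
    private static final String AROMATIC_ORGANIC = "bcnops";

    // Counts atoms written with aromatic (lower-case) symbols in a SMILES string.
    // This only inspects the text; it performs no ring perception at all.
    static int countAromaticSymbols(String smiles) {
        int count = 0;
        int i = 0;
        while (i < smiles.length()) {
            char ch = smiles.charAt(i);
            if (ch == '[') {
                int close = smiles.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unclosed bracket atom in: " + smiles);
                }
                // Skip an optional isotope number, then test the first symbol letter.
                int j = i + 1;
                while (j < close && Character.isDigit(smiles.charAt(j))) {
                    j++;
                }
                if (j < close && Character.isLowerCase(smiles.charAt(j))) {
                    count++; // e.g. [nH]; [Na+] starts upper case and is not counted
                }
                i = close + 1;
            } else {
                // The 'l' of Cl and the 'r' of Br are not in the aromatic set.
                if (AROMATIC_ORGANIC.indexOf(ch) >= 0) {
                    count++;
                }
                i++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] smiles = {
                "c1ccc2c(c1)c1c(C2)cccc1NC(C)=O",
                "c1cc2ccc3c4c2c(c1)ccc4ccc3",
                "c1ccc2c(c1)C(c1c3c2ccc2c3c(cc1)c1c(C2=O)cccc1)=O"
        };
        for (String smile : smiles) {
            System.out.println(countAromaticSymbols(smile) + "\t" + smile);
        }
    }
}
```

For the first SMILES in the list this gives 12, the same value the parser reports.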

Regards,
Nina


Rajarshi Guha wrote:
On Dec 7, 2007, at 1:12 PM, Nina Jeliazkova wrote:

  
I've put two files with PAHs that are not recognised as aromatic
at http://ambit.acad.bg/toxTree/data/

Those compounds were sent to me by my colleagues as examples of
PAHs with carcinogenic activity.
    


Hmm, strange. I had the following code which works pretty well with  
the SMILES versions of your molecules.

IMolecularDescriptor descriptor = new AromaticAtomsCountDescriptor();

         String[] smiles = {
                 "c1ccc2c(c1)c1c(C2)cccc1NC(C)=O",
                 "c1cc2ccc3c4c2c(c1)ccc4ccc3",
                 "c1ccc2c(c1)Cc1c2ccc(c1)NC(=O)C",
                 "c1c2ccc3cccc4ccc(c5c1cccc5)c2c34",
                 "C1c2c3c(C1)c(ccc3cc1c2ccc2c1cccc2)C",
                 "c1ccc2c(c1)c1c(C2)cc(cc1)NC(=O)C(F)(F)F",
                 "C1c2c(c3c1c(ccc3)N(C(C)=O)C(C)=O)cccc2",
                 "c1c2ccc3c(c2cc2c1c1c(cc2)cccc1)cccc3",
                 "C1c2c(c3c1cc(cc3)N(C(C)=O)O)cccc2",
                 "c1ccc2c(c1)C(c1c3c2ccc2c3c(cc1)c1c(C2=O)cccc1)=O",
                 "CC(=O)Nc1c2Cc3ccccc3c2ccc1",
                 "Cc1c2ccccc2c(c2ccc3c(c12)cccc3)C"
         };
         SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());

         for (String smile : smiles) {
             IAtomContainer mol = sp.parseSmiles(smile);
             CDKHueckelAromaticityDetector.detectAromaticity(mol);

             // Compute the descriptor once and print the aromatic atom count.
             System.out.print(((IntegerResult) descriptor.calculate(mol).getValue()).intValue() + "\t");
             System.out.println(smile);
         }

where the SMILES were obtained from M18_*.sdf using OpenBabel.

The results are:

12	c1ccc2c(c1)c1c(C2)cccc1NC(C)=O
16	c1cc2ccc3c4c2c(c1)ccc4ccc3
12	c1ccc2c(c1)Cc1c2ccc(c1)NC(=O)C
20	c1c2ccc3cccc4ccc(c5c1cccc5)c2c34
18	C1c2c3c(C1)c(ccc3cc1c2ccc2c1cccc2)C
12	c1ccc2c(c1)c1c(C2)cc(cc1)NC(=O)C(F)(F)F
12	C1c2c(c3c1c(ccc3)N(C(C)=O)C(C)=O)cccc2
22	c1c2ccc3c(c2cc2c1c1c(cc2)cccc1)cccc3
12	C1c2c(c3c1cc(cc3)N(C(C)=O)O)cccc2
22	c1ccc2c(c1)C(c1c3c2ccc2c3c(cc1)c1c(C2=O)cccc1)=O
12	CC(=O)Nc1c2Cc3ccccc3c2ccc1
18	Cc1c2ccccc2c(c2ccc3c(c12)cccc3)C

where the numbers are the number of aromatic atoms. I checked the
validity of the numbers by using Daylight's depictmatch and counting
the hits for the [a] SMARTS pattern.

All the above numbers match, except for
c1ccc2c(c1)C(c1c3c2ccc2c3c(cc1)c1c(C2=O)cccc1)=O
(there should be 24 aromatic atoms).


Doing the same for the hetero case, I used the following SMILES,
converted from M19.*.sdf:

c1ccc2c(c1)n(c1c2cc(cc1)N)CC
c1c(cc2c(c1)cc1c(n2)cc(cc1)N)N
n1c2n(c3c1ccc(n3)N)cccc2
c1ccc2c(c1)c1c([nH]2)cccc1
n1(c2c(nc1N)c1cccnc1cc2)C
[nH]1c2c(c3c1nc(c(c3)C)N)cccc2
c1ccc2c(c1)c1c([nH]2)c(c(nc1C)N)C
n1cccc2c3nc(n(c3c(cc12)C)C)N
n1c2n(c3c1ccc(n3)N)cccc2C
c1c2c(ccc1)[nH]c1nc(ccc21)N

and I get the following aromatic atom counts:

13	c1ccc2c(c1)n(c1c2cc(cc1)N)CC
14	c1c(cc2c(c1)cc1c(n2)cc(cc1)N)N
9	n1c2n(c3c1ccc(n3)N)cccc2
13	c1ccc2c(c1)c1c([nH]2)cccc1
13	n1(c2c(nc1N)c1cccnc1cc2)C
13	[nH]1c2c(c3c1nc(c(c3)C)N)cccc2
13	c1ccc2c(c1)c1c([nH]2)c(c(nc1C)N)C
13	n1cccc2c3nc(n(c3c(cc12)C)C)N
9	n1c2n(c3c1ccc(n3)N)cccc2C
13	c1c2c(ccc1)[nH]c1nc(ccc21)N

The two failures are n1c2n(c3c1ccc(n3)N)cccc2 and
n1c2n(c3c1ccc(n3)N)cccc2C (each should have 13 aromatic atoms).

However, the SDF version (i.e., Nina's code) gives 0 for the aromatic
atom counts. But AFAIK, the SmilesParser was updated to ignore the
aromatic markup in a SMILES string. I'm not sure what's going on here.

-------------------------------------------------------------------
Rajarshi Guha  <rguha@indiana.edu>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Q:  Why did the mathematician name his dog "Cauchy"?
A:  Because he left a residue at every pole.



_______________________________________________
Cdk-devel mailing list
Cdk-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-devel