I set up a molecule with a SMILES parser, generated 3D coordinates for it, and am now trying to calculate all descriptors. DescriptorEngine keeps failing at the LengthOverBreadth descriptor with the following message:
org.openscience.cdk.exception.NoSuchAtomTypeException: The AtomType Csp2 could not be found
This must be a bug since AtomType is entirely encapsulated.
Here are some code snippets:
// PARSE SMILES
IChemObject mol = new Molecule();
SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
mol = sp.parseSmiles("COC1=CC2=C(C=C1)NC3=C2CCNC3");
Molecule molecule = (Molecule)mol;

// GENERATE 3D COORDINATES
TemplateHandler3D template = TemplateHandler3D.getInstance();
ModelBuilder3D mb3d = ModelBuilder3D.getInstance(template, "mm2");
molecule = mb3d.generate3DCoordinates(molecule, true);

// CALCULATE DESCRIPTORS
engine = new DescriptorEngine(DescriptorEngine.MOLECULAR);
try {
    engine.process(molecule);
} catch (CDKException e) {
    System.out.println("*CDKException: " + e.getMessage());
    System.out.println();
}
Logged In: YES
user_id=2086894
Originator: YES
I've tried this with other molecules now as well and I'm getting similar exceptions:
"org.openscience.cdk.exception.NoSuchAtomTypeException: The AtomType C= could not be found"
Logged In: YES
user_id=349408
Originator: NO
The problem is not in the descriptor code and is most likely in the 3D structure generation code
Logged In: YES
user_id=2086894
Originator: YES
Hmm... but why would it say that it could not recognize those AtomTypes? Also this is the only descriptor which is giving an error. If the 3D coordinates code were wrong, why wouldn't others fail? Surely LengthOverBreadth can't be the only one which needs AtomTypes. Furthermore, the qsarmolecular JUnit test of the most recent nightly build on your site shows LengthOverBreadthDescriptor causing Errors (in contrast to others which cause Failures).
Logged In: YES
user_id=349408
Originator: NO
I don't know how the 3D builder is getting the types. If you can send me an SDF with a structure that causes this problem I can take a look. At this point I can't reproduce your problem.
Furthermore, the LoB code does not deal with atom typing at all. The errors you note on the nightly test page arise because the center of mass calculation is failing (due to a problem in the atom typing code that runs before the LoB code is invoked).
Are you working with trunk or 1.0.2? On trunk I can't even generate the 3D coords using the CDK.
Logged In: YES
user_id=2086894
Originator: YES
Well, one example of a SMILES string which is failing is "COc1ccc2nc3CNCCc3c2c1" for "6-Methoxy-1,2,3,4-tetrahydro-9H-pyrido[3,4-b]indole" (I am writing this program to iterate over a large chemical database and fill in descriptors). I have attached the SDF file for comparison. Both fail, but this is just one example; I have seen the behaviour with many others as well. The exception is definitely being thrown during the execution of the DescriptorEngine, and definitely during LengthOverBreadthDescriptor. My workaround at the moment is to remove LengthOverBreadthDescriptor from the list of descriptors to calculate; when I do that, it works fine.
This is CDK 1.0.2
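For anyone wanting to apply the same workaround, here is a minimal sketch of the filtering step in plain Java. It only shows how one descriptor could be dropped from a list of fully qualified class names. The class DescriptorFilter and its method without are hypothetical helpers, not part of the CDK, and the class names in main are illustrative placeholders. How the filtered list is handed back to DescriptorEngine is not shown in this report and would need to be checked against the CDK version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the CDK.
public class DescriptorFilter {

    // Returns a copy of classNames without any entry whose simple class name
    // (the part after the last '.') equals excludedSimpleName.
    public static List<String> without(List<String> classNames, String excludedSimpleName) {
        List<String> kept = new ArrayList<String>();
        for (String name : classNames) {
            String simple = name.substring(name.lastIndexOf('.') + 1);
            if (!simple.equals(excludedSimpleName)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative class names only; the real list would come from the engine.
        List<String> names = new ArrayList<String>();
        names.add("org.openscience.cdk.qsar.descriptors.molecular.LengthOverBreadthDescriptor");
        names.add("org.openscience.cdk.qsar.descriptors.molecular.WeightDescriptor");
        System.out.println(without(names, "LengthOverBreadthDescriptor"));
    }
}
```

Matching on the simple class name rather than the full package path keeps the filter working if the descriptor lives in a different package in another release.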
File Added: SID_24278180.sdf
sdf file for "COc1ccc2nc3CNCCc3c2c1"
Logged In: YES
user_id=452972
Originator: NO
Working on the ModelBuilder3D I found that I get a similar error in some cases. It's actually the fingerprinter, and within the fingerprinter the aromaticity detection, which causes the problem. If you want to test, change line 171 of TemplateHandler3D from
BitSet ringSystemFingerprint = new Fingerprinter().getFingerprint(queryRingSystem);
to
BitSet ringSystemFingerprint = new Fingerprinter().getFingerprint(ringSystems);
and run the ModelBuilder3D tests. Four of them fail with this problem.
Works fine in 1.2.x, so closing this bug.