From: Pascal M. <pas...@gm...> - 2010-09-30 11:31:08
Hi,

> Each of the components should only do the minimal promised work, which gives maximum
> flexibility in wiring things together as you like them.
> In your example, you would always need to do aromaticity detection (using some
> CDKHueckelAromaticityDetector.java or so) to do the detection before doing the Murcko stuff.
> The reasoning in this case, (...)

Ok, I think I understand... But do you agree that the Murcko framework should retain the bond order? I.e., that a benzene should not become a cyclohexane in the frameworks?

And should the framework retain all cycles, or not? Reading the reference publication, I think so, but the following code generates up to 3 or 6 frameworks in my tests. E.g.

C[N+]1(CCC[C@@H]1COC(=O)[C@@](c2ccccc2)(C3CCCCC3)O)C ZINC00000585

gives 3 frameworks:

C(C1CCCCC1)C2CCCCC2 ZINC00000585_1
C(CC1CCCCC1)OCC2[N+]CCC2 ZINC00000585_2
C(OCC1[N+]CCC1)CC2CCCCC2 ZINC00000585_3

(2 and 3 are, by the way, identical, because the phenyl is converted into a cyclohexyl, although I now use CDKHueckelAromaticityDetector.detectAromaticity(mol).)

And is it intended that the nitrogen keeps its charge in the framework? (I don't know what the behaviour should be in this case...)

Here is my code:

    GenerateFragments gf = new GenerateFragments();
    boolean bolFalse = false;
    while (reader.hasNext()) {
        Molecule mol = (Molecule) reader.next();
        // Perceive aromaticity before generating the fragments
        CDKHueckelAromaticityDetector.detectAromaticity(mol);
        gf.generateMurckoFragments(mol, bolFalse, bolFalse, 3);
        String[] smiles = gf.getMurckoFrameworksAsSmileArray();
        if (smiles.length > 0) {
            // Print one line per framework: SMILES, then title_index
            for (int i = 0; i < smiles.length; i++) {
                System.out.print(smiles[i]);
                System.out.print(" ");
                System.out.print(mol.getProperty(CDKConstants.TITLE));
                System.out.print("_");
                System.out.println(i + 1);
            }
        }
    }

By the way, is there a better way of printing, something like System.out.println(smiles[i], " ", mol.getProperty(CDKConstants.TITLE)); ?

Thanks again,
Regards,
Pascal
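
On the printing question: one possible approach, sketched here in plain Java without any CDK dependency, is to build the whole line with string concatenation and print it in a single call. The class name FrameworkLine and the helper format are hypothetical names made up for this illustration, and the title string stands in for the value returned by mol.getProperty(CDKConstants.TITLE).

```java
public class FrameworkLine {

    // Hypothetical helper (not part of CDK): builds "<smiles> <title>_<index>".
    // The title is typed as Object because a molecule property need not be a String.
    static String format(String smiles, Object title, int index) {
        return smiles + " " + title + "_" + index;
    }

    public static void main(String[] args) {
        // Sample data copied from the message above; in the real loop these
        // would come from gf.getMurckoFrameworksAsSmileArray() and the molecule title.
        String[] smiles = {
            "C(C1CCCCC1)C2CCCCC2",
            "C(CC1CCCCC1)OCC2[N+]CCC2"
        };
        Object title = "ZINC00000585";

        for (int i = 0; i < smiles.length; i++) {
            // One println per framework instead of five separate print calls
            System.out.println(format(smiles[i], title, i + 1));
        }
    }
}
```

The standard System.out.printf("%s %s_%d%n", smiles[i], title, i + 1) should give the same layout if a format string is preferred over concatenation.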