The method for MCS computation UniversalIsomorphismTester.getOverlaps returns different subgraphs, depending on the input order
The following code:
public static void main(String args[]) throws Exception {
    SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
    IMolecule mol1 = sp.parseSmiles("Oc1ccccc1");
    IMolecule mol2 = sp.parseSmiles("OCCCCCC");
    System.out.println("call 1");
    printMCSs(UniversalIsomorphismTester.getOverlaps(mol1, mol2));
    System.out.println("call 2");
    printMCSs(UniversalIsomorphismTester.getOverlaps(mol2, mol1));
}

public static void printMCSs(List<IAtomContainer> list) {
    for (IAtomContainer iAtomContainer : list) {
        for (IAtom iatom : iAtomContainer.atoms())
            System.out.println(iatom.getSymbol() + " arom: " + iatom.getFlag(CDKConstants.ISAROMATIC));
        System.out.println();
    }
}
produces:
call 1
C arom: true
O arom: false
call 2
C arom: false
O arom: false
Actually, the results kind of make sense to me: the overlap is taken from one of the two structures, so the overlap substructure found in mol1 differs from the one found in mol2.
But the question is whether the overlap itself should reflect properties of the first structure at all.
Is that what you are thinking here too?
Well, the getOverlaps() method is documented like this:
"Returns all the maximal common substructure between two atom containers."
IMHO, this should not depend on the input order.
I was able to fix it by adding a check for the aromaticity of atoms in the nodeConstructor() method (instead of comparing two atoms only by their symbol).
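To illustrate the kind of check I mean, here is a rough standalone sketch. It is not the actual CDK code: SimpleAtom and atomsMatch are made-up names that stand in for IAtom and for the atom comparison done inside nodeConstructor(). The point is only that two atoms count as equivalent when both the symbol and the aromaticity flag agree, which gives the same answer whichever molecule is passed first.

```java
// Standalone sketch, no CDK dependency. SimpleAtom and atomsMatch are
// hypothetical names used for illustration only.
final class AtomMatchSketch {

    // Minimal stand-in for an atom: element symbol plus aromaticity flag.
    static final class SimpleAtom {
        final String symbol;
        final boolean aromatic;

        SimpleAtom(String symbol, boolean aromatic) {
            this.symbol = symbol;
            this.aromatic = aromatic;
        }
    }

    // Symbol-only comparison: an aromatic C and an aliphatic C are treated
    // as the same atom.
    static boolean symbolsMatch(SimpleAtom a, SimpleAtom b) {
        return a.symbol.equals(b.symbol);
    }

    // Comparison that also requires the aromaticity flags to agree.
    // It is symmetric in its arguments, so swapping a and b cannot
    // change the result.
    static boolean atomsMatch(SimpleAtom a, SimpleAtom b) {
        return a.symbol.equals(b.symbol) && a.aromatic == b.aromatic;
    }

    public static void main(String[] args) {
        SimpleAtom aromaticCarbon = new SimpleAtom("C", true);   // ring carbon as in Oc1ccccc1
        SimpleAtom aliphaticCarbon = new SimpleAtom("C", false); // chain carbon as in OCCCCCC

        System.out.println("symbol only:        "
                + symbolsMatch(aromaticCarbon, aliphaticCarbon));
        System.out.println("symbol + aromatic:  "
                + atomsMatch(aromaticCarbon, aliphaticCarbon));
        System.out.println("swapped arguments:  "
                + atomsMatch(aliphaticCarbon, aromaticCarbon));
    }
}
```

With a comparison like this, an aromatic carbon from mol1 is no longer paired with an aliphatic carbon from mol2, so the reported overlap should not pick up the aromaticity flag of whichever molecule happens to be passed first.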