Re: [OpenBabel-Devel] Fragmentation

On Dec 21, 2010, at 10:55 PM, erapp wrote:
> I need to be able
> to find all unique fragments for a molecule of a given length (i.e. number
> of atoms in fragment) and provide information about what makes it unique.

You might be able to adapt some code I wrote a few weeks ago, at

http://www.dalkescientific.com/writings/diary/archive/2011/01/13/faster_subgraph_enumeration.html

It's for OEChem, but the translation to OpenBabel wouldn't be that hard.

Craig James wrote:

> If you do find an algorithm that produces a reasonable number of fragments, the canonical SMILES generator might be useful.  It has the ability to generate fragment SMILES.  

If you rewrite the subgraph-to-SMARTS code to use that fragment SMILES support, you might get better performance than OEChem, which doesn't have that feature.

As Craig pointed out, the number of subgraphs grows quickly as a function of the number of atoms in the subgraph. I've found that subgraphs of k=6 and k=7 atoms are tractable, but I haven't yet explored larger sizes to judge what the distribution looks like for typical small molecules.
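
To get a feel for that growth, here is a minimal sketch in plain Python. It is not the algorithm from the linked write-up and it uses neither OEChem nor OpenBabel; the adjacency-dict representation and the names connected_subgraphs and toluene are only illustrative assumptions. It grows connected atom sets one atom at a time and removes duplicate sets. Note that it counts atom sets, not unique fragments: two different atom sets can be the same fragment chemically, so each set would still need a canonical form (for example a fragment SMILES) to decide uniqueness.

```python
def connected_subgraphs(adjacency, k):
    """Enumerate connected atom sets with 1..k atoms.

    adjacency maps each atom index to an iterable of its neighbors.
    Returns a dict: size -> set of frozensets of atom indices.
    """
    # Every single atom is a connected subgraph of size 1.
    by_size = {1: {frozenset([atom]) for atom in adjacency}}
    for size in range(2, k + 1):
        grown = set()
        # Extend each smaller set by one neighboring atom. Using a set of
        # frozensets removes atom sets that are reached by different routes.
        for sub in by_size[size - 1]:
            for atom in sub:
                for nbr in adjacency[atom]:
                    if nbr not in sub:
                        grown.add(sub | {nbr})
        by_size[size] = grown
    return by_size


# Heavy-atom graph of toluene (illustrative numbering):
# ring atoms 0-5, methyl carbon 6 bonded to ring atom 0.
toluene = {
    0: [1, 5, 6],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4],
    4: [3, 5],
    5: [4, 0],
    6: [0],
}

if __name__ == "__main__":
    for size, subs in connected_subgraphs(toluene, 7).items():
        print(size, len(subs))
```

This keeps every atom set of the previous size in memory, which is fine for k around 6 or 7 on small molecules but is not meant to be efficient.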

				Andrew
				da...@da...