Re: [Rdkit-discuss] Submol: bond vs atom indicies
From: Maciek W. <ma...@wo...> - 2016-03-22 08:50:27
Hi Greg,

2016-03-22 6:28 GMT+01:00 Greg Landrum <gre...@gm...>:
> Hi Maciek,
>
> On Mon, Mar 21, 2016 at 8:33 PM, Maciek Wójcikowski <ma...@wo...> wrote:
>> I came across one problem with RDKit today, namely the Chem.PathToSubmol() function. Does the "path" mean atom or bond indices? On this very list I found examples showing usage with atom indices [ https://www.mail-archive.com/rdk...@li.../msg03966.html], while the example in "Getting started in python" feeds it the output of Chem.FindAtomEnvironmentOfRadiusN(), which gives a list of bond indices. The documentation could be more explicit here... After my brief analysis of the code I found out that the bonds should be used (correct me if I'm wrong).
>
> The function is still not documented, but it's definitely bonds. I think the thread you reference from the mailing list says the same thing.

Ok, you're right. I've just noticed your comment; the example was still using atom indices (although they worked for the sample mol, where the bond indices fortunately aligned with the atom indices).

>> So here comes the question: is there an equivalent function or a clever way to do Chem.PathToSubmol() on atom indices? Currently I do: 1) get the atom path; 2) get bonds between every atom in path (their indices); 3) get submol with Chem.PathToSubmol()
>
> I don't think so.
>
>> PS.
>> I use it to get each protein residue (amino acid) as a separate mol. It would be much easier if we could use "Molecule -> Residues -> Atoms" instead of "Molecule -> Atoms -> (grouping of monomers) -> Residues".
>
> SplitMolByPDBResidues() doesn't do what you want?

Not really. I want to get each amino acid separately, so I'd have to do SplitMolByPDBChainId() -> SplitMolByPDBResidues() -> break the peptide bonds (to eliminate series of aa) -> split disconnected molecules. And that only outputs valid PDB amino acids. Access to non-standard residues such as HOH, LIG, or UNL, which are also present in PDB files, would be desirable too.
In other words, the unique key should be "monomer index + chain id" instead of only the three-letter name as in SplitMolByPDBResidues().

Maciek

> -greg
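
For anyone finding this thread later, the three-step workaround and the "monomer index + chain id" key described above could be sketched roughly as follows. This is only an illustration under assumptions, not an RDKit-provided function: the helper names (bonds_within_atoms, group_atoms_by_residue, residue_submols) are made up for this example, and it assumes the molecule was read from a PDB file so that atoms carry PDB residue info. Since Chem.PathToSubmol() works on bond indices, residues with no internal bonds (e.g. a lone water oxygen) are skipped here.

```python
from collections import defaultdict


def bonds_within_atoms(bonds, atom_indices):
    """Return indices of bonds whose two end atoms are both in atom_indices.

    bonds: iterable of (bond_idx, begin_atom_idx, end_atom_idx) tuples.
    """
    wanted = set(atom_indices)
    return [idx for idx, begin, end in bonds if begin in wanted and end in wanted]


def group_atoms_by_residue(records):
    """Group atom indices by the key (chain id, residue number).

    records: iterable of (atom_idx, chain_id, residue_number) tuples.
    """
    groups = defaultdict(list)
    for atom_idx, chain_id, res_num in records:
        groups[(chain_id, res_num)].append(atom_idx)
    return dict(groups)


def residue_submols(mol):
    """Hypothetical helper: one sub-molecule per (chain id, residue number).

    Requires RDKit; imported lazily so the two helpers above stay usable
    without it.
    """
    from rdkit import Chem

    bonds = [(b.GetIdx(), b.GetBeginAtomIdx(), b.GetEndAtomIdx())
             for b in mol.GetBonds()]
    records = []
    for atom in mol.GetAtoms():
        info = atom.GetPDBResidueInfo()
        if info is None:  # atom without PDB residue info: cannot be keyed
            continue
        records.append((atom.GetIdx(), info.GetChainId(), info.GetResidueNumber()))

    submols = {}
    for key, atom_indices in group_atoms_by_residue(records).items():
        bond_path = bonds_within_atoms(bonds, atom_indices)
        if bond_path:  # PathToSubmol takes bond indices, so bondless residues are skipped
            submols[key] = Chem.PathToSubmol(mol, bond_path)
    return submols
```

Only bonds with both ends inside a residue are kept, so peptide bonds between neighbouring residues drop out without a separate bond-breaking step. Insertion codes are ignored in this sketch; if they matter, info.GetInsertionCode() could be added to the key.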