Re: [Rdkit-discuss] Submol: bond vs atom indicies
From: Maciek W. <ma...@wo...> - 2016-03-22 10:06:19
I correct myself: all residue types are available from Chem.SplitMolByPDBResidues().

----
Best regards,
Maciek Wójcikowski
ma...@wo...

2016-03-22 9:50 GMT+01:00 Maciek Wójcikowski <ma...@wo...>:

> Hi Greg,
>
> 2016-03-22 6:28 GMT+01:00 Greg Landrum <gre...@gm...>:
> > Hi Maciek,
> >
> > On Mon, Mar 21, 2016 at 8:33 PM, Maciek Wójcikowski <ma...@wo...> wrote:
> >> I came across one problem with RDKit today, namely the Chem.PathToSubmol() function. Does the "path" mean atom or bond indices? On this very list I found examples showing usage with atom indices [https://www.mail-archive.com/rdk...@li.../msg03966.html], while the example in "Getting started in python" feeds it the output of Chem.FindAtomEnvironmentOfRadiusN(), which gives a list of bond indices. The documentation could be more explicit here... After my brief analysis of the code I found out that the bonds should be used (correct me if I'm wrong).
> >
> > The function is still not documented, but it's definitely bonds. I think the thread you reference from the mailing list says the same thing.
>
> Ok, you're right, I've just noticed your comment, while the example was still using atom indices (although they worked for the sample mol - fortunately the bond indices lined up with the atom indices).
>
> >> So here comes the question: is there an equivalent function or a clever way to do Chem.PathToSubmol() on atom indices? Currently I do: 1) get the atom path; 2) get the bonds between every pair of atoms in the path (their indices); 3) get the submol with Chem.PathToSubmol()
> >
> > I don't think so.
> >
> >> PS.
> >> I use it to get each protein residue (amino acid) as a separate mol. It would be much easier if we could use "Molecule -> Residues -> Atoms" instead of "Molecule -> Atoms -> (grouping of monomers) -> Residues".
> >
> > SplitMolByPDBResidues() doesn't do what you want?
>
> Not really. I want to get each amino acid separately, so I'd have to do SplitMolByPDBChainId() -> SplitMolByPDBResidues() -> break the peptide bonds (to eliminate series of aa) -> split disconnected molecules. And that only outputs valid PDB amino acids. Accessing non-standard ones like HOH, LIG, UNL, although present in PDB, would also be desired. In other words, the unique key should be "monomer index + chain id" instead of only the three-letter name as in SplitMolByPDBResidues().
>
> Maciek
>
> > -greg
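
For readers of the archive, the three-step workaround described in the thread (atom indices -> indices of the bonds between them -> Chem.PathToSubmol()) together with the "chain id + monomer index" key can be sketched roughly as below. This is only a sketch, not code posted in the thread: the helper names atoms_by_residue, bond_indices_between and residue_submols are made up for illustration, the key also carries the residue name for readability, and it assumes the molecule was read from a PDB file so that atoms carry PDB residue info.

```python
from collections import defaultdict


def atoms_by_residue(mol):
    """Group atom indices by (chain id, residue number, residue name)."""
    groups = defaultdict(list)
    for atom in mol.GetAtoms():
        info = atom.GetPDBResidueInfo()
        if info is None:  # atom carries no PDB annotation; skip it
            continue
        key = (info.GetChainId(), info.GetResidueNumber(), info.GetResidueName())
        groups[key].append(atom.GetIdx())
    return dict(groups)


def bond_indices_between(mol, atom_indices):
    """Return indices of bonds whose two end atoms are both in atom_indices."""
    wanted = set(atom_indices)
    return [
        bond.GetIdx()
        for bond in mol.GetBonds()
        if bond.GetBeginAtomIdx() in wanted and bond.GetEndAtomIdx() in wanted
    ]


def residue_submols(mol):
    """Map each residue key to a sub-molecule built from its internal bonds."""
    # Imported lazily: the two helpers above only rely on accessor methods.
    from rdkit import Chem

    submols = {}
    for key, atom_indices in atoms_by_residue(mol).items():
        bond_path = bond_indices_between(mol, atom_indices)
        # A residue with no internal bonds (e.g. a lone water oxygen)
        # cannot be expressed as a bond path, so it is skipped here.
        if bond_path:
            submols[key] = Chem.PathToSubmol(mol, bond_path)
    return submols
```

Because a peptide bond has its two ends in different residues, it never ends up in a residue's bond path, so no explicit bond-breaking step is needed in this sketch. Residues without any internal bonds are dropped and would need separate handling.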