Hi Nick - 

This PDB entry uses the insertion code column in a non-standard way.  Typically, insertion codes are used to insert one or more residues between adjacent numbers in a canonically numbered structure, e.g. residue 42A (along with 42B, 42C, … , 42P, etc.) is between residues 42 and 43.  In 3psg, the insertion code P instead marks the entire 44-residue prodomain (1P through 44P).

As for whether it affects the structure: Yes, in this case, this non-standard usage also has the unwanted side effect in PyMOL of preventing the prodomain from being shown in cartoon representation, because its residues are too far from “adjacent” residues (i.e. the mature peptide residues with numbers on either side of the “inserted” residue).
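
To illustrate why the sequence viewer appears to interleave the two segments, here is a small sketch in plain Python (not the PyMOL API).  It assumes residues are ordered by integer number first and insertion code second, which is what the usual insertion code convention implies; I have not checked PyMOL's actual sorting code.  The sort_key function and the six-residue sample list are made up for illustration.

```python
import re

def sort_key(resi):
    """Split a residue identifier such as '42A' into (42, 'A')."""
    match = re.fullmatch(r"(-?\d+)([A-Za-z]?)", resi)
    if match is None:
        raise ValueError("unrecognized residue identifier: %r" % resi)
    return int(match.group(1)), match.group(2)

# Hypothetical sample: the first three prodomain residues followed by
# the first three mature-peptide residues, in file order.
file_order = ["1P", "2P", "3P", "1", "2", "3"]

# Assumed ordering: number first, insertion code second.
sorted_order = sorted(file_order, key=sort_key)
print(sorted_order)  # ['1', '1P', '2', '2P', '3', '3P']
```

Under that assumption each prodomain residue lands right after the mature residue with the same number, which matches the interleaving you describe.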

To view the sequences as intended while keeping the mature peptide numbering the same, try this:

fetch 3psg, async=0
select prodomain, resi *P
alter prodomain, resi=str(int(resi[:-1])-44)  # decrement integer portion of prodomain residues by 44

With this method (as opposed to extracting the prodomain to a new chain or object), you get a continuous cartoon based on atom connectivity.  The drawback is, if you need to refer to specific residues of the prodomain, you have to do a little math.  
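
If the math gets tedious, a pair of helpers along these lines could do the conversion.  This is only a sketch in plain Python: the function names are made up, and the offset of 44 is simply the one used in the alter command above.

```python
PRODOMAIN_LENGTH = 44  # same offset as in the alter command

def shifted_from_original(resi):
    """Map an original prodomain identifier to its shifted number, e.g. '1P' -> -43."""
    if not resi.endswith("P"):
        raise ValueError("expected a prodomain residue like '12P', got %r" % resi)
    return int(resi[:-1]) - PRODOMAIN_LENGTH

def original_from_shifted(number):
    """Map a shifted number back to the original identifier, e.g. 0 -> '44P'."""
    if not -PRODOMAIN_LENGTH < number <= 0:
        raise ValueError("shifted prodomain numbers run from -43 to 0, got %r" % number)
    return "%dP" % (number + PRODOMAIN_LENGTH)

print(shifted_from_original("1P"))   # -43
print(shifted_from_original("44P"))  # 0
print(original_from_shifted(-20))    # 24P
```

So after renumbering, the prodomain runs from -43 to 0 and the mature peptide still starts at 1.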

If you don’t need the cartoon representation to be continuous (e.g. if you’re zooming in on a different section of the protein), you can replace the 3rd line with:

alter prodomain, chain="P"

to split off the prodomain into a separate chain.  In this case, the sequence will be numbered as in the PDB, only without the insertion code.

Hope that helps.


Jared Sampson
Xiangpeng Kong Lab
NYU Langone Medical Center
550 First Avenue
New York, NY 10016

On Mar 20, 2014, at 8:59 AM, Wimock <wimock@hotmail.com> wrote:


I'm new to Pymol and am looking at the structure of pepsinogen (PDB: 3psg). This protein contains a 44 residue prodomain, which the pdb file says is assigned an insertion code 'P' and labelled 1 - 44. This prodomain is cleaved to yield pepsin (5pep), whose sequence is labelled 1 - 326. I gather Pymol should be able to see these as separate parts of the sequence, but it instead seems to interleave the sequences - I don't know if this affects the structure or not? Other programs (Chimera) show the sequence as it should be (P1-44...1-326).

Is there a way to have the sequence parts recognised in the sequence viewer? This would make it much easier for me to manipulate these sections. Please let me know if I haven't been clear.

