How it works

Tim Schäfer

This page explains the general idea behind VPLG, i.e., how the data in PDB and DSSP files is used to compute a protein ligand graph.

PDB files contain 3D atom/residue data of a protein, while DSSP files contain information on the secondary structure of the protein. We consider the following types of secondary structure element (SSE):

  1. Alpha-helix
  2. Beta-strand
  3. Ligand

NOTE: Strictly speaking, a ligand is not an SSE, but whenever we refer to SSEs below, we mean "SSEs and ligands".
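To illustrate how per-residue secondary structure assignments become SSEs, here is a minimal Python sketch. It assumes the DSSP assignment is available as a string with one code per residue and uses the DSSP codes "H" (alpha-helix) and "E" (beta-strand). The function name segment_sses and the tuple layout are hypothetical and are not part of VPLG. Ligands are not covered here because they come from the PDB file, not from DSSP.

```python
from itertools import groupby

# Assumption: one DSSP code per residue; only "H" (alpha-helix) and
# "E" (beta-strand) are treated as SSEs in this simplified sketch.
SSE_CODES = {"H", "E"}


def segment_sses(dssp_codes):
    """Group runs of identical codes into (type, first_index, last_index) tuples.

    Indices are 0-based positions in the input string, not PDB residue numbers.
    """
    sses = []
    index = 0
    for code, run in groupby(dssp_codes):
        length = len(list(run))
        if code in SSE_CODES:
            sses.append((code, index, index + length - 1))
        index += length
    return sses


if __name__ == "__main__":
    print(segment_sses("--HHHH--EEE-EE"))
```

Each tuple returned by this sketch corresponds to one helix or strand, i.e., to one future vertex of the protein graph.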

Each SSE becomes a vertex in the protein graph. We use the atom coordinates to determine which protein residues are close to each other in 3D. We distinguish between the following atom-level contact types:

  1. backbone--backbone contact (BB contact)
  2. backbone--side chain contact (BC contact)
  3. side chain--side chain contact (CC contact)
  4. ligand--other SSE contact (LX contact)

We then check which SSEs these residues are part of. If enough contacts of the different types exist between a pair (a, b) of SSEs, they are considered neighbors and an edge a--b is added to the protein graph.
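The contact classification described above can be sketched in Python as follows. This is only an illustration under stated assumptions: atoms are represented as plain dictionaries (a hypothetical layout), a contact is defined by a simple distance cutoff of 4.0 Angstrom (a made-up value, not necessarily the criterion VPLG uses), and contacts are counted per atom pair. Whether VPLG counts contacts per atom pair or per residue pair is not stated on this page.

```python
import math
from collections import Counter

# Standard PDB names of the protein backbone atoms.
BACKBONE_ATOMS = {"N", "CA", "C", "O"}

# Hypothetical cutoff in Angstrom; VPLG's actual contact criterion may differ.
CUTOFF = 4.0


def contact_type(atom_a, atom_b):
    """Classify an atom pair as BB, BC, CC or LX."""
    if atom_a["ligand"] or atom_b["ligand"]:
        return "LX"
    n_backbone = (atom_a["name"] in BACKBONE_ATOMS) + (atom_b["name"] in BACKBONE_ATOMS)
    return {2: "BB", 1: "BC", 0: "CC"}[n_backbone]


def count_contacts(atoms_a, atoms_b, cutoff=CUTOFF):
    """Count contacts by type between the atoms of two SSEs."""
    counts = Counter()
    for a in atoms_a:
        for b in atoms_b:
            if math.dist(a["xyz"], b["xyz"]) <= cutoff:
                counts[contact_type(a, b)] += 1
    return counts


if __name__ == "__main__":
    helix = [
        {"name": "CA", "xyz": (0.0, 0.0, 0.0), "ligand": False},
        {"name": "CB", "xyz": (1.0, 0.0, 0.0), "ligand": False},
    ]
    strand = [
        {"name": "N", "xyz": (0.0, 3.0, 0.0), "ligand": False},
        {"name": "CG", "xyz": (50.0, 0.0, 0.0), "ligand": False},
    ]
    print(count_contacts(helix, strand))
```

The resulting per-type counts for a pair of SSEs are the input for the neighbor decision.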

What counts as "enough" is defined by a rule set. The current rule set depends on the SSE types of a and b, where H denotes an alpha-helix, E a beta-strand, and L a ligand. It is shown here:

Type of SSE a   Type of SSE b   Rule
E               E               BB > 1 or CC > 2
H               E               (BB > 1 and BC > 3) or CC > 3
H               H               BB > 3 or CC > 3
L               any             LX >= 1
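The rule table above can be written as a small predicate. The following Python sketch is an illustration only: the function name sses_are_neighbors and the dictionary of contact counts are hypothetical, and the sketch assumes that the rules are symmetric, i.e., that the H/E rule also applies to an E/H pair.

```python
def sses_are_neighbors(type_a, type_b, counts):
    """Apply the rule table to the contact counts of one SSE pair.

    type_a, type_b: "H" (alpha-helix), "E" (beta-strand) or "L" (ligand).
    counts: mapping from contact type ("BB", "BC", "CC", "LX") to its count;
            missing keys are treated as zero.
    """
    bb = counts.get("BB", 0)
    bc = counts.get("BC", 0)
    cc = counts.get("CC", 0)
    lx = counts.get("LX", 0)

    types = {type_a, type_b}
    if "L" in types:  # ligand with any other SSE
        return lx >= 1
    if types == {"E"}:  # strand and strand
        return bb > 1 or cc > 2
    if types == {"H"}:  # helix and helix
        return bb > 3 or cc > 3
    if types == {"H", "E"}:  # helix and strand, assumed symmetric
        return (bb > 1 and bc > 3) or cc > 3
    raise ValueError(f"unknown SSE types: {type_a!r}, {type_b!r}")


if __name__ == "__main__":
    print(sses_are_neighbors("H", "E", {"BB": 2, "BC": 4}))
    print(sses_are_neighbors("H", "H", {"BB": 3, "CC": 1}))
```

Keeping the thresholds in one function makes it easy to compare them against the table if the rule set changes.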

[Figure: Graph creation]

Computation of the protein graph from 3D atom data. Contacts are calculated at the atom level from the 3D data in a PDB file (a). All residues of the considered protein chain are assigned to SSEs (b), which become the vertices of the protein graph (c). The atom contact information is used to calculate the spatial relationships between the SSEs, represented by edges in the graph (d).
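Steps (c) and (d) can be sketched in Python as building an undirected graph from a list of SSEs and a neighbor test. This is a minimal illustration, not VPLG's implementation: the function build_protein_graph, the SSE labels, and the is_neighbor callback are hypothetical names, and the graph is stored as a plain adjacency dictionary.

```python
from itertools import combinations


def build_protein_graph(sses, is_neighbor):
    """Return an adjacency dict with one vertex per SSE.

    sses: list of hashable SSE labels.
    is_neighbor: function (a, b) -> bool deciding whether an edge a--b exists.
    """
    graph = {sse: set() for sse in sses}
    for a, b in combinations(sses, 2):
        if is_neighbor(a, b):
            graph[a].add(b)
            graph[b].add(a)
    return graph


if __name__ == "__main__":
    # Toy input: which SSE pairs are in contact is simply given as a set here.
    sses = ["H1", "E1", "E2", "L1"]
    contacting = {frozenset(("H1", "E1")), frozenset(("E1", "E2"))}
    graph = build_protein_graph(sses, lambda a, b: frozenset((a, b)) in contacting)
    for vertex, neighbors in graph.items():
        print(vertex, sorted(neighbors))
```

In a real pipeline, the is_neighbor callback would evaluate the contact counts of the pair against the rule set instead of looking up a fixed set.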

The following image shows an overview of the input and output data of VPLG:

[Figure: VPLG input and output]

Data sources and output of VPLG. The PDB files containing 3D atomic coordinates are downloaded from the RCSB PDB, and DSSP is used to assign SSEs to each protein residue. VPLG reads both the PDB and DSSP files and uses them to generate a protein graph file and a bitmap or vector image of the protein graph. Optionally, statistics and the graph can also be written to a database.


