Dockground

Docking Templates


Templates for structural alignment v1.0 were generated based on the following constraints:


Interfaces. The initial set of all bound hetero- and homo-dimers from DOCKGROUND was reduced using the following requirements:

(1) X-ray structures with resolution <= 3.5 Å;

(2) mean accessible surface area buried by each chain >= 250 Å²;

(3) number of residues at the interface in each chain >= 10.
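The three requirements can be read as a simple filter. The sketch below is illustrative only: the `Dimer` record and its field names are hypothetical and not taken from DOCKGROUND, and it assumes the resolution requirement is an upper bound (3.5 Å or better).

```python
from dataclasses import dataclass


@dataclass
class Dimer:
    # Hypothetical record; field names are assumptions made for this sketch.
    pdb_code: str
    method: str                # experimental method, e.g. "X-RAY DIFFRACTION"
    resolution: float          # in Å
    mean_buried_asa: float     # in Å², mean area buried by each chain
    interface_residues: tuple  # number of interface residues, one entry per chain


def passes_v1_filters(d: Dimer) -> bool:
    """Return True if the dimer satisfies requirements (1)-(3)."""
    return (
        d.method == "X-RAY DIFFRACTION"
        and d.resolution <= 3.5              # assumed upper bound on resolution
        and d.mean_buried_asa >= 250.0
        and min(d.interface_residues) >= 10  # must hold for each chain
    )
```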

Each complex was further checked for inter-penetration by an automated procedure. Application of these criteria resulted in 12,134 redundant complexes. Structural redundancy was eliminated using MM-align (Mukherjee & Zhang, Nucleic Acids Res. 2009, 37: e83). Two interfaces were considered similar if their TM-score was > 0.9. A similarity graph was generated from these pairwise comparisons and clustered by an in-house graph clustering procedure, producing 7,107 clusters. Cluster representatives were selected based on the lowest number of missing residues and the best resolution.
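The redundancy removal can be pictured as follows. This is a minimal sketch under stated assumptions, not the in-house procedure: it uses plain connected components as a stand-in for the clustering step, takes precomputed pairwise TM-scores as input, and assumes that missing residues take priority over resolution when choosing a representative. All function and parameter names are hypothetical.

```python
def cluster_by_similarity(ids, tm_scores, threshold=0.9):
    """Group ids linked by a TM-score above the threshold.

    tm_scores maps (id_a, id_b) pairs to a TM-score. Connected components
    are used here only as a stand-in for the in-house clustering procedure.
    """
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), score in tm_scores.items():
        if score > threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def pick_representative(cluster, missing_residues, resolution):
    """Pick the member with the fewest missing residues, then the lowest resolution value."""
    # Assumed priority order; the text does not say how ties are resolved.
    return min(cluster, key=lambda i: (missing_residues[i], resolution[i]))
```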

The following notation is used to name the PDB files:

iXXXXM1CH1M2CH2_N.pdb

'i' - indicates that the template is from the protein interface library
XXXX - PDB code
M1, M2 - serial number of the model in the corresponding Biounit file for chains CH1 and CH2
CH1, CH2 - chain identifiers for the two interacting proteins
N - 1 for chain CH1 and 2 for chain CH2
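A file name in this notation can be split into its parts with a regular expression. The sketch below rests on assumptions not stated above: the PDB code is four characters starting with a digit, model numbers are decimal digits, and chain identifiers are single letters. The example file name in the usage note is hypothetical. The leading 'i' is optional, so the same pattern also accepts names without the prefix.

```python
import re

# Assumed field widths: 4-character PDB code, numeric model numbers,
# single-letter chain identifiers. Adjust if the actual files differ.
NAME_RE = re.compile(
    r"^(?P<prefix>i?)(?P<pdb>\d[A-Za-z0-9]{3})"
    r"(?P<m1>\d+)(?P<ch1>[A-Za-z])"
    r"(?P<m2>\d+)(?P<ch2>[A-Za-z])"
    r"_(?P<n>[12])\.pdb$"
)


def parse_template_name(filename):
    """Return the name components as a dict, or None if the name does not match."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    n = int(match["n"])
    return {
        "is_interface": match["prefix"] == "i",
        "pdb_code": match["pdb"],
        "model1": int(match["m1"]),
        "chain1": match["ch1"],
        "model2": int(match["m2"]),
        "chain2": match["ch2"],
        # N selects which of the two chains this file holds.
        "chain_in_file": match["ch1"] if n == 1 else match["ch2"],
    }
```

For a hypothetical name such as `i1a2k1A1C_2.pdb`, this yields PDB code `1a2k`, model 1 for both chains, chains `A` and `C`, and `C` as the chain stored in the file.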



Full structures. The additional requirement for the initial set of 12,134 redundant complexes (see 'Interfaces' above) was the presence of at least three regular secondary structure elements (alpha-helices and/or beta-strands) in each subunit, reducing the number of complexes to 11,774. The structural redundancy was eliminated by MM-align comparison of full structures, with the same criteria as for the interface (see above). The final set consisted of 5,050 structurally non-redundant complexes.

The following notation is used to name the PDB files:

XXXXM1CH1M2CH2_N.pdb

XXXX - PDB code
M1, M2 - serial number of the model in the corresponding Biounit file for chains CH1 and CH2
CH1, CH2 - chain identifiers for the two interacting proteins
N - 1 for chain CH1 and 2 for chain CH2


Templates for structural alignment v1.1 were generated using a more sophisticated graph clustering algorithm by Hartuv and Shamir (Inform. Process. Lett. 2000, 76: 175-181) to eliminate redundancies. The sets contain 4,950 full structures and 5,936 interfaces.
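For orientation, the core idea of the Hartuv and Shamir highly connected subgraphs (HCS) algorithm is to split a graph along a minimum edge cut, recursively, until every remaining piece has edge connectivity greater than half its number of nodes. The sketch below is a simplified illustration with hypothetical function names: it uses a plain Stoer-Wagner minimum cut, omits the refinements described in the paper (such as singleton adoption), and is not the code used to build the v1.1 sets.

```python
def min_cut(adj):
    """Stoer-Wagner minimum edge cut of an undirected graph with >= 2 nodes.

    adj maps each node to the set of its neighbours (symmetric, no self-loops).
    Returns (cut size, set of nodes on one side of the cut).
    """
    w = {u: {v: 1 for v in nbrs} for u, nbrs in adj.items()}
    members = {u: {u} for u in adj}  # original nodes merged into each node
    best_size, best_side = None, None
    while len(w) > 1:
        # One phase: repeatedly add the node most tightly connected to the set.
        start = next(iter(w))
        order = [start]
        conn = {v: w[start].get(v, 0) for v in w if v != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            order.append(nxt)
            for v, wt in w[nxt].items():
                if v in conn:
                    conn[v] += wt
        s, t = order[-2], order[-1]
        if best_size is None or cut_of_phase < best_size:
            best_size, best_side = cut_of_phase, set(members[t])
        # Merge the last node t into the second-to-last node s.
        for v, wt in w[t].items():
            if v != s:
                w[s][v] = w[s].get(v, 0) + wt
                w[v][s] = w[s][v]
            del w[v][t]
        del w[t]
        members[s] |= members.pop(t)
    return best_size, best_side


def hcs(adj):
    """Return clusters (sets of nodes) whose edge connectivity exceeds n / 2."""
    n = len(adj)
    if n == 1:
        return [set(adj)]
    size, side = min_cut(adj)
    if size > n / 2:
        return [set(adj)]  # highly connected: keep as one cluster
    rest = set(adj) - side
    return (hcs({u: adj[u] & side for u in side})
            + hcs({u: adj[u] & rest for u in rest}))
```

Applied to two triangles joined by a single edge, the recursion removes the joining edge and returns each triangle as its own cluster.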



Templates for structural alignment v2.0 were generated based on the following constraints:


Interfaces. Interfaces were extracted from the full structure template set using a distance cutoff of 12 Å between any heavy atom in a residue of one chain and any heavy atom of the other chain. Resulting fragments shorter than five residues were then eliminated from the final structure.
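The extraction rule can be sketched as below. This is an illustration under assumptions, not the DOCKGROUND code: each chain is represented as a dict mapping a residue number to a list of heavy-atom coordinates, a fragment is taken to mean a run of consecutive residue numbers, and the cutoff is treated as inclusive. Function names are hypothetical.

```python
from math import dist

CUTOFF = 12.0     # Å, from the text
MIN_FRAGMENT = 5  # residues, from the text


def interface_residues(chain, partner, cutoff=CUTOFF):
    """Residue numbers in `chain` with a heavy atom within `cutoff` of `partner`."""
    partner_atoms = [xyz for atoms in partner.values() for xyz in atoms]
    return sorted(
        num for num, atoms in chain.items()
        if any(dist(a, b) <= cutoff for a in atoms for b in partner_atoms)
    )


def fragments(residue_numbers):
    """Split sorted residue numbers into runs of consecutive numbers."""
    runs = []
    for num in residue_numbers:
        if runs and num == runs[-1][-1] + 1:
            runs[-1].append(num)
        else:
            runs.append([num])
    return runs


def extract_interface(chain, partner):
    """Interface residues of `chain`, dropping fragments shorter than MIN_FRAGMENT."""
    runs = fragments(interface_residues(chain, partner))
    return [num for run in runs if len(run) >= MIN_FRAGMENT for num in run]
```

The brute-force distance check is quadratic in the number of atoms; a spatial grid or k-d tree would be the usual choice for large sets.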



Full structures. The Protein Data Bank as of September 1, 2020 was reduced using the following requirements:

  1. X-ray structures with resolution <= 3.5 Å
  2. Mean accessible surface area buried by each chain >= 250 Å²
  3. Number of residues in each chain >= 10
  4. At least three regular secondary structure elements (alpha-helices or beta-strands)

Redundancy was filtered using MM-align, with a TM-score of 0.9 or higher considered similar (Mukherjee & Zhang, Nucleic Acids Res. 2009, 37: e83). Selection between candidate templates was based on the best resolution and the lowest number of missing residues.