Redocking of a large ligand. The left panel shows the crystal structure of a phosphopeptide inhibitor (green) bound to the SH2 domain of the proto-oncogene tyrosine-protein kinase Src (blue surface), available in the Protein Data Bank (PDB) under accession code 1SKJ. The right panel shows the superposition of this crystal structure with the best binding mode predicted by DINC (pink).

Overview

Given the structures of a protein and a ligand of interest (input), the goal of molecular docking is to predict the most likely binding mode of the resulting protein-ligand complex (output), i.e., to find the most likely conformation of the flexible ligand when it interacts with the binding site of the protein receptor. Ligand size and flexibility are among the main limitations of docking tools: the more rotatable bonds a ligand has, the more degrees of freedom (DoFs) must be sampled, which leads to a combinatorial explosion of the search space. To address this issue and make docking of large ligands tractable, the Kavraki Lab at Rice University has developed DINC (Docking INCrementally). DINC uses an incremental algorithm to efficiently explore the search space of potential binding modes between a ligand and a protein. Instead of considering the entire solution space at once (i.e., docking the entire ligand), the algorithm breaks the search into several stages and optimizes each one locally before progressing to the next one.

Search Space

DINC treats the ligand as the combination of a rigid-body component and a rotatable component. The rigid-body component contributes 6 DoFs (three for the position in space, three for the orientation), while the rotatable component contributes one DoF per rotatable bond (or “torsion”) in the ligand. A ligand with n rotatable bonds therefore has 6 + n DoFs, and the search space grows very quickly with n. (The protein is also highly flexible and contributes so much to the complexity of the problem that docking procedures often treat it as rigid for simplicity.)
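To make the DoF count concrete, here is a minimal Python sketch. The function names and the uniform grid (a fixed number of sampled values per DoF) are illustrative assumptions only; docking tools do not enumerate such a grid, but the grid size gives a feel for how fast the search space grows.

```python
RIGID_BODY_DOFS = 6  # 3 for position + 3 for orientation


def total_dofs(n_torsions: int) -> int:
    """DoFs of a flexible ligand docked against a rigid protein."""
    return RIGID_BODY_DOFS + n_torsions


def grid_size(n_torsions: int, steps_per_dof: int = 10) -> int:
    """Grid points if every DoF were split into steps_per_dof values.

    The uniform grid is a hypothetical discretization used for
    illustration; it is not how DINC or AutoDock sample conformations.
    """
    return steps_per_dof ** total_dofs(n_torsions)


# Compare a small fragment with progressively larger ligands.
for n in (6, 12, 24):
    print(n, total_dofs(n), f"{grid_size(n):.0e}")
```

Under this assumption, every additional rotatable bond multiplies the number of grid points by `steps_per_dof`, which is why the number of torsions, rather than the number of atoms, dominates the cost.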

Since DINC is an incremental docking protocol, the “partial solutions” it builds correspond to fragments (contiguous sections) of the ligand. A partial solution is expanded by adding atoms to its fragment, and this is repeated until the entire ligand has been reconstructed. At each stage of the algorithm, DINC considers 6 rotatable bonds as “active” in the partial solution (in addition to the 6 DoFs for position and orientation), so each docking round samples 12 DoFs no matter how large the ligand is.
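The bookkeeping of active torsions across stages can be sketched as follows. This is a simplified illustration, not DINC's code: it assumes torsions are numbered in the order in which they are visited (real ligands have a branched torsion tree), that each expansion adds 3 torsions with 3 earlier ones kept active (the values used by the algorithm described in the next section), and that a final round simply adds whatever torsions remain. The name `fragment_schedule` and its parameters are hypothetical.

```python
from typing import List, Tuple


def fragment_schedule(n_torsions: int, initial: int = 6, step: int = 3,
                      keep: int = 3) -> List[Tuple[int, List[int]]]:
    """Return (torsions in fragment, active torsion indices) for each round."""
    first = min(initial, n_torsions)
    rounds = [(first, list(range(first)))]  # round 0: every torsion is active
    visited = first
    while visited < n_torsions:
        # Newly added torsions are always active.
        new = list(range(visited, min(visited + step, n_torsions)))
        # The most recently visited torsions of the previous fragment stay active.
        kept = list(range(max(0, visited - keep), visited))
        visited += len(new)
        rounds.append((visited, kept + new))
    return rounds


for size, active in fragment_schedule(12):
    print(f"fragment with {size} torsions, active: {active}")
```

For a ligand with 12 torsions, this yields three rounds whose active windows slide along the ligand, each containing 6 torsions, so no round ever samples more than 12 DoFs.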

Algorithm

DINC is a meta-docking algorithm, in the sense that it relies on a standard docking tool, currently AutoDock 4 (AD4), to perform the sampling and the scoring at each docking round. The algorithm below summarizes what was first described in (Dhanik et al., 2013). See References for details.
  1. Choose an initial fragment containing 6 torsions.
  2. Mark all torsional DoFs of this fragment as active. Dock the initial fragment using AutoDock.
  3. Repeat until the fragment encompasses all atoms in the ligand:
    • Select the lowest-energy conformations from the most recent docking results.
    • Extend each selected conformation for the next iteration of docking by adding the groups of atoms that make up 3 previously unvisited torsions in the ligand.
    • Mark each of these 3 new torsions as active. Select 3 of the most recently visited torsions in the previous fragment to remain active and mark all other torsions as rigid.
    • Dock each of the expanded fragments in parallel using AutoDock.
  4. Return:
    • The three lowest-energy conformations of the final docking results.
    • Representatives of the three lowest-energy clusters of the final docking results.
Algorithm flowchart. DINC starts by selecting a small fragment of the input ligand (Fragment 0), with only 6 rotatable DoFs, and uses it as input for the first round of docking with AutoDock 4 (AD4). The best binding modes are selected for expansion, i.e., they are “grown” by adding a small number of atoms. These new fragments are then docked in parallel using AD4. The process is repeated incrementally, until the entire input ligand is reconstructed. At the end, the best conformations are selected, based on binding energy ranking or RMSD clustering.
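The control flow of the algorithm above can be sketched in a few lines of Python. This is a hedged skeleton under stated assumptions, not DINC's implementation: the `Pose` class, the `DockFn` signature, the `n_keep` parameter (the text does not say how many conformations are selected at each round), and the `toy_dock` stand-in are all hypothetical. In particular, `toy_dock` does not call AutoDock; it returns made-up energies so that the loop can run. The sketch returns only the three lowest-energy conformations and omits the cluster representatives.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Pose:
    energy: float    # docking score; lower is better
    n_torsions: int  # torsions contained in the docked fragment


# Hypothetical signature: (seed pose or None, fragment size) -> candidate poses.
DockFn = Callable[[Optional[Pose], int], List[Pose]]


def incremental_dock(n_torsions: int, dock: DockFn, n_keep: int = 3,
                     initial: int = 6, step: int = 3) -> List[Pose]:
    size = min(initial, n_torsions)
    poses = dock(None, size)  # first round: dock the initial fragment
    while size < n_torsions:
        # Keep the lowest-energy conformations of the latest round.
        best = sorted(poses, key=lambda p: p.energy)[:n_keep]
        size = min(size + step, n_torsions)
        # DINC docks the expanded fragments in parallel; here they run in sequence.
        poses = [p for seed in best for p in dock(seed, size)]
    return sorted(poses, key=lambda p: p.energy)[:3]


def toy_dock(seed: Optional[Pose], size: int) -> List[Pose]:
    """Stand-in for a docking run: five poses with arbitrary toy energies."""
    base = seed.energy if seed is not None else 0.0
    return [Pose(energy=base - 1.0 + 0.25 * k, n_torsions=size) for k in range(5)]


result = incremental_dock(12, toy_dock)
print(result)
```

Passing the docking step in as a function mirrors the meta-docking idea: the incremental loop does not depend on which tool performs the sampling and scoring, so a wrapper around an actual docking program could be substituted for `toy_dock`.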