Considering Global and Local Conformational Changes in Molecular Docking
- 1. Laboratório de Biologia Computacional e Bioinformática, Universidade Federal do ABC, Brazil
- 2. Programa de Pós-Graduação em Ciências Fisiológicas, Universidade Federal do Triângulo Mineiro, Brazil
- 3. Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, Brazil
- 4. Centro de Matemática Computação e Cognição, Universidade Federal do ABC, Brazil
Abstract
Proteins undergo changes in their form (conformational changes) upon interaction with compounds/substrates. Molecular docking is an important tool in the study of correlations between structure and function, aiding the understanding of several biological processes and shedding light on drug development. Structural rearrangements can occur during molecular recognition in order to optimize interactions in the complex, leading to local and global conformational changes. Conformational selection and induced fit are models that attempt to explain the effects of structural variation in molecular recognition. In this review we discuss the different strategies employed to account for global and local conformational changes in both protein-ligand and protein-protein docking.
Keywords
Conformational selection; Induced fit; Protein flexibility; Molecular docking
Citation
Philot EA, Lima AN, Resende-Lara PT, Tuffaneto PC, Scott LPB (2016) Considering Global and Local Conformational Changes in Molecular Docking. JSM Chem 4(3): 1031.
ABBREVIATIONS
ANM: Anisotropic Network Models; CS: Conformational Selection; IF: Induced Fit; NMA: Normal Modes Analysis
INTRODUCTION
Proteins are inherently dynamic systems with various internal motions that result in different shapes (conformational changes) upon interaction with substrates and/or compounds [1,2]. These conformational changes allow proteins to function properly, as well as to modulate the functions of other proteins [1]. Since structure and function are intrinsically related, understanding both is essential for drug design [1]. During the molecular recognition process (either protein-ligand or protein-protein), conformational motions can occur to various degrees, ranging from small local changes (vibrational motions) to collective global changes (domain and allosteric motions). Protein dynamics plays an essential role in function, which is determined by the flexibility pattern in wild type [3], mutant [4] or allosteric sites [5] [6,7]. In this context, flexibility is essential in the study of protein collective motions, interface formation in protein complexes, and protein function as a whole.
Molecular docking is a computational technique that simulates interactions between biomolecules, which can be two proteins (protein-protein docking), or a protein and a compound (protein-ligand docking). These approaches are important for drug development and the understanding of several biological processes.
Docking methods can be classified into three classes according to flexibility: rigid, semi-flexible, and flexible. Most methods and software fall into the first two categories due to the high computational cost of flexible docking. The level of approximation in a docking simulation depends on the type and number of degrees of freedom considered. Fully flexible docking, which considers all degrees of freedom, is the most interesting case, since both molecules (ligand and receptor) change their conformations [8]; however, this approach has a high computational cost and still poses challenges.
Since flexible docking is highly demanding in both time and computational resources, protein-protein docking protocols should account for a number of residues with backbone flexibility and/or free side-chain torsions in order to access varied conformational sets [9]. These procedures attempt to mimic conformational selection and induced fit, respectively. In both cases, docking simulations must be accurate with regard to two main components: the search for binding configurations and the ranking of results through scoring functions.
There are two different models to explain protein dynamics in molecular recognition: conformational selection (CS) and induced fit (IF). In the CS model, the receptor undergoes global changes and populates a large set of conformations, with an equilibrium between active and inactive forms. The ligand preferentially binds the active conformer and forms the interface. IF models consider the structural changes that receptors undergo after ligand binding, which are generally local side-chain rearrangements.
Several computational approaches are used to simulate CS and IF. Among them, elastic network models like Anisotropic Network Models (ANM) and Normal Modes Analysis (NMA) provide intrinsic flexibility data about the receptor structure, giving an overview of protein collective motions [10–13]. Rotamer libraries are also used to adjust and refine results after docking calculations. Many different methods have been proposed to consider protein conformational change (both local and global) in molecular recognition. Some of the procedures applied to protein-ligand and protein-protein docking are described in the following sections.
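As a minimal sketch of how an elastic network yields collective motions, the following Python code builds an ANM Hessian from node coordinates (for example, Cα positions) and diagonalizes it. The 15 Å cutoff, the uniform spring constant `gamma` and the function names are assumptions chosen for illustration; they are not taken from any of the programs cited in this review.

```python
import numpy as np

def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3N x 3N ANM Hessian for N nodes connected by uniform springs."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = d @ d
            if dist2 > cutoff ** 2:
                continue  # nodes farther apart than the cutoff are not connected
            # Off-diagonal 3x3 super-element for the spring between i and j
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal super-elements are minus the sum of the off-diagonal ones
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian

def anm_modes(coords, cutoff=15.0, gamma=1.0, n_modes=3):
    """Return the n_modes softest non-trivial eigenvalues and eigenvectors.

    Assumes a connected, rigid network, so that exactly six zero modes
    (rigid-body translation and rotation) precede the internal modes.
    """
    eigvals, eigvecs = np.linalg.eigh(anm_hessian(coords, cutoff, gamma))
    return eigvals[6:6 + n_modes], eigvecs[:, 6:6 + n_modes]
```

The modes with the smallest non-zero eigenvalues are the softest collective motions of the network and are the ones usually taken to represent global flexibility.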
PROTEIN-PROTEIN DOCKING
In the protein-protein docking approach, the receptor protein is considered fixed, while the ligand protein moves (rotation and translation) during the binding mode search. The protein flexibility protocols in protein-protein docking are very similar to those used in protein-ligand docking, and flexibility can be treated implicitly or explicitly. Implicit flexibility is achieved through soft scoring functions and/or soft representations of protein structure, which allow a degree of interpenetration between specific portions of the proteins, or through the softening of their molecular surfaces [32,33]. An optimization stage is therefore required to remove steric clashes. Hex [34], for instance, uses spherical polar Fourier correlations to describe the protein structure and a soft potential to account for protein flexibility. Soft docking is also able to describe local flexibility of side chains and small-amplitude rearrangements of backbone and loops [32,33].
The explicit approach to protein flexibility is subdivided into partially flexible and fully flexible methodologies [35]. Partially flexible methodologies are employed in several ways: some programs apply rigid-body protein-protein docking followed by refinement of the complex interface to provide partial flexibility. These refinement steps may allow conformational changes of side chains, loops and the local interface backbone. ICM-DISCO [36] combines soft docking (rigid-body motions and soft potentials) with subsequent optimization of side chains at the complex interface. This method provides good results for targets that display small-amplitude conformational changes.
Bastard et al. [37] proposed a successful method to predict complex geometry considering loop flexibility in protein-protein docking. They built an ensemble of possible loop arrangements (multi-copy), minimized the energy of the ligand starting configuration and selected the best loop configuration according to RMSD and average energy. HADDOCK uses different approaches to consider protein flexibility in its docking protocol:
(i) rigid body with energy minimization,
(ii) semi-flexible refinement of the side chains and backbone of interface residues, and
(iii) Cartesian refinement in solution [10].
Soft and semi-flexible docking cannot accurately model large-amplitude conformational changes (global backbone and domain motions) when a single conformation of each complex partner is used. These global motions have been simulated with ensemble docking and on-the-fly protocols. In ensemble docking, a set of pre-generated conformations obtained from different sources (simulation and/or experiment) [38] is used instead of a single conformation. Trellet et al. [39] used a flexible protein-peptide ensemble docking methodology with a set of three distinct peptide conformations: extended, α-helix and polyproline-II. Similarly, Sahu et al. [40] used protein-protein ensemble docking to investigate the stability of α-synuclein dimers, using two completely distinct conformations of the α-synuclein monomer: one with the experimental α-helical structure and the other with a modeled β-structure.
Several on-the-fly protein-protein docking programs have used NMA to account for global changes, owing to the ability of this method to describe large-amplitude motions [41]. Venkatraman and Ritchie [13] developed EigenHex, an algorithm that uses normal mode analysis of a simple elastic network model of protein flexibility, performing rigid docking first and soft docking afterwards. They used a swarm optimization algorithm to perturb docked conformations using NMA data in a pose-dependent way, which allows moderate flexibility.
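The idea of perturbing a structure along normal modes can be illustrated with a short Python sketch. It assumes that mode vectors are already available from some NMA or ANM calculation and simply displaces the coordinates along them; the function names and the RMSD-based choice of amplitude are illustrative assumptions and do not reproduce the pose-dependent swarm optimization described above.

```python
import numpy as np

def displace_along_modes(coords, modes, amplitudes):
    """Shift (N, 3) coordinates along k modes stored as columns of a (3N, k) array."""
    coords = np.asarray(coords, dtype=float)
    shift = np.asarray(modes, dtype=float) @ np.asarray(amplitudes, dtype=float)
    return coords + shift.reshape(coords.shape)

def rmsd(a, b):
    """Root-mean-square deviation between two (N, 3) coordinate sets, no superposition."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum() / len(diff)))

def amplitude_for_rmsd(mode, n_atoms, target_rmsd):
    """Amplitude that moves the structure by target_rmsd along a single mode.

    A displacement a * mode gives RMSD = a * |mode| / sqrt(N), hence the formula.
    """
    return target_rmsd * np.sqrt(n_atoms) / np.linalg.norm(mode)
```

Applying positive and negative amplitudes along a few soft modes produces a small set of deformed conformers, each of which could then be docked or refined.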
FiberDock [42] is another methodology that uses normal modes to account for backbone flexibility. This software iteratively chooses relevant modes and minimizes the flexible protein structure along each of them, achieving a flexible refinement that allows side-chain and backbone conformational changes. As described in that work, rotamer libraries are used for side-chain flexibility and normal modes for the backbone. Other methodologies, such as cNMA [12] and SwarmDock [43], also use NMA to account for global changes during molecular recognition.
Figure 1 illustrates the most common ways to sample conformational changes and a few of the methods used to describe them, summarizing the methodologies discussed in this topic.
Figure 1: Conformation sampling in molecular docking of the proteins RAN and importin β through various methods. A) Importin β (cartoon) adjusts to the RAN surface, increasing the contact area and thus promoting more specific binding. Elastic network models allow receptor conformers with large displacements to be sampled using global flexibility data, which can be obtained by performing NMA or ANM. B) On-the-fly adjustment of importin β on RAN. Accounting for global flexibility while docking is performed increases accuracy, but also increases the computational cost. C) Importin β and RAN side-chain adjustments. Side-chain rotamers provide local optimization of molecular contacts, refining the final solutions. This approach can be carried out with a rotamer library or molecular dynamics. D) Importin β and RAN multiple sampling with molecular dynamics. This approach can be used to generate multiple conformers of both molecules and perform rigid ensemble docking.
PROTEIN-LIGAND DOCKING
Protein-ligand docking consists in predicting the conformations of small molecules (ligands) in the binding sites of macromolecules (receptors). This is a relevant technique in the context of inhibitor/modulator drug design for disease-related proteins. Currently there are numerous programs that employ this methodology. The most cited protein-ligand docking programs are AutoDock [14], GOLD [15] and Glide [16]. Until 2013, 25.87% of the works published in this area used AutoDock, 16.69% used GOLD and 11.38% used Glide. One of the main computational challenges for protein-ligand docking is protein flexibility, due to the macromolecule's number of degrees of freedom. Protein flexibility can be approached by four methodologies: (i) soft docking, (ii) selective docking, (iii) ensemble docking and (iv) on-the-fly docking.
The simplest way to introduce partial flexibility into the protein is soft docking. The technique consists of applying a soft potential at the receptor-ligand interface, i.e., smoothing the Lennard-Jones potential in this region [17,18]. Because receptor and ligand atoms are allowed to approach each other more closely, small adjustments at the interface are tolerated. The programs DOCK [19] and Glide use a scaling factor for the Lennard-Jones radii. Other examples are the Lennard-Jones 8−4 potential in GOLD and the smooth potential in AutoDock 3.0.
Selective docking also treats the receptor as partially flexible, considering specific regions of the receptor, such as the side chains of the active site, as flexible [20]. Rotamer libraries are one of the main tools used for this purpose. GOLD, AutoDock 4 and FITTED [21] use rotamer libraries as a set of energetically accessible side-chain conformations for selected residues, while ICM [22] uses rotamer libraries combined with a Monte Carlo search algorithm. In 2012, Lima et al. developed a methodology named GANM that combines genetic algorithms, NMA and rotamer libraries with protein-ligand docking simulations. The use of rotamer libraries allows the side chains of the active-site residues to be flexible [23].
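Returning to soft docking, the two softening strategies mentioned above (scaling the Lennard-Jones radii and replacing the 12-6 form with an 8-4 form) can be compared with a small Python sketch. The functional forms below are generic textbook expressions normalized to a well depth `eps` at the distance `r_min`; the exact parameterizations used by DOCK, Glide and GOLD are not given in this review, so the parameter names and values are assumptions.

```python
def lj_12_6(r, r_min, eps=1.0, radius_scale=1.0):
    """12-6 Lennard-Jones energy with minimum -eps at radius_scale * r_min.

    A radius_scale below 1 shrinks the effective radii, so atoms may come
    closer before the repulsive wall is felt.
    """
    x = (radius_scale * r_min) / r
    return eps * (x ** 12 - 2.0 * x ** 6)

def lj_8_4(r, r_min, eps=1.0):
    """Softer 8-4 form with the same minimum (-eps at r_min) but a gentler wall."""
    x = r_min / r
    return eps * (x ** 8 - 2.0 * x ** 4)
```

At a separation of 0.8 `r_min`, for instance, the 8-4 form gives a much smaller repulsive energy than the unscaled 12-6 form, which is what allows slightly overlapping poses to survive the scoring step.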
Ensemble and on-the-fly docking are approaches that can capture large-amplitude conformational changes of the receptor. DOCK and FlexE [24] are examples of programs that use ensemble docking, in which a set of protein conformations, obtained either experimentally or computationally (through techniques such as NMA, molecular dynamics and principal component analysis), is used, as opposed to the traditional approach, in which only one receptor structure is targeted for docking [25,26]. In one study, experimentally derived conformations were used to perform flexible docking to cytochrome c peroxidase: the receptor ensemble was built from multiple conformations present in the electron-density map of the cavity site of an apo structure of cytochrome c peroxidase. The authors docked 583,363 compounds against 16 conformations of this protein, which resulted in new ligands for cytochrome c peroxidase [27]. Sperandio et al. used NMA to create a structural ensemble of cyclin-dependent kinase 2. Their protocol selected several receptor conformations suitable for docking [28]. Philot et al. also identified binding sites on human thioredoxin 1 from an ensemble of structures obtained through NMA, in order to investigate docking with three phenothiazine drugs [29,30].
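The logic of ensemble docking can be sketched in a few lines of Python: the same ligand is docked rigidly against each receptor conformer and the results are ranked together. The scoring function below is a toy contact/clash count and the search is a scan over a list of translations; both are hypothetical stand-ins for the scoring functions and search algorithms of real docking programs.

```python
import numpy as np

def toy_score(receptor, ligand, contact=4.0, clash=2.5):
    """Hypothetical score (lower is better): penalize clashes, reward contacts."""
    d = np.linalg.norm(receptor[:, None, :] - ligand[None, :, :], axis=-1)
    n_clash = np.sum(d < clash)
    n_contact = np.sum((d >= clash) & (d < contact))
    return 10.0 * n_clash - n_contact

def rigid_dock(receptor, ligand, shifts):
    """Try each rigid translation of the ligand; return (best score, best shift)."""
    shifts = np.asarray(shifts, dtype=float)
    scored = [(toy_score(receptor, ligand + s), tuple(s)) for s in shifts]
    return min(scored, key=lambda item: item[0])

def ensemble_dock(conformers, ligand, shifts):
    """Dock against every conformer; return (score, conformer index, shift), best first."""
    results = []
    for index, receptor in enumerate(conformers):
        score, shift = rigid_dock(receptor, ligand, shifts)
        results.append((score, index, shift))
    return sorted(results)
```

In this scheme the receptor itself stays rigid during each run, and flexibility enters only through the choice of conformers, whether they come from experiment, NMA or molecular dynamics.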
Lastly, on-the-fly docking treats the receptor as fully flexible, i.e., it changes the receptor conformation during docking. Due to the large number of degrees of freedom present in this type of simulation, other strategies are applied in order to reduce the computational cost. The first strategy is to dock the ligand into a rigid receptor conformation and then change the protein side chains using a rotamer library and minimize the complex [17]. This methodology is applied by ROSETTALIGAND [31].
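The side-chain step of this strategy can be sketched as follows, assuming a docked ligand pose and a small rotamer library per flexible residue. The code picks, for each residue independently, the rotamer with the lowest interaction energy with the ligand. The energy term, the `r_min` value and the dictionary layout are hypothetical, and the subsequent minimization of the complex is omitted; this is not the ROSETTALIGAND procedure, which also has to handle interactions between neighbouring side chains.

```python
import numpy as np

def pair_energy(side_chain, ligand, r_min=3.5):
    """Toy 12-6 energy summed over all side-chain/ligand atom pairs."""
    d = np.linalg.norm(side_chain[:, None, :] - ligand[None, :, :], axis=-1)
    x = r_min / d
    return float(np.sum(x ** 12 - 2.0 * x ** 6))

def repack_side_chains(rotamer_library, ligand):
    """Choose the lowest-energy rotamer index for each residue, independently.

    rotamer_library maps a residue label to a list of (n_atoms, 3) arrays,
    one array per candidate rotamer.
    """
    chosen = {}
    for residue, rotamers in rotamer_library.items():
        energies = [pair_energy(rotamer, ligand) for rotamer in rotamers]
        chosen[residue] = int(np.argmin(energies))
    return chosen
```

Treating residues independently keeps the cost linear in the number of rotamers, at the price of ignoring side chain to side chain clashes.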
DISCUSSION & CONCLUSION
Due to the importance of protein-protein and protein-ligand docking in the drug design process, new methods and tools that consider global and local conformational changes have been increasingly investigated and refined in recent years. The methodologies used to confer flexibility in protein-ligand and protein-protein docking are similar and attempt to describe molecular recognition models. Conformational selection is often necessary for molecular binding relaxation, whereas induced fit is not mandatory but is sufficient in certain cases. Therefore, robust methods must employ techniques that simulate mainly CS, followed by optimization steps that simulate IF. Different approaches treat flexibility explicitly, involving local and global structural rearrangements that consider side-chain optimization and backbone displacement. Ensemble docking and on-the-fly strategies have been used to describe global conformational changes, in combination with NMA and molecular dynamics, among other techniques. Another strategy is to perform rigid docking as a first step and then optimize the best solutions with post-processing algorithms that consider protein motions and flexibility. This strategy combines different tools and methods with sophisticated scoring functions to improve the accuracy of solutions.
Regardless of the methods used, we conclude that it is advantageous to take local and global structural changes into account in protein-ligand and protein-protein docking. Different sampling methodologies have improved the predictive capacity of molecular docking software in recent years. Regardless of the resolution level (coarse-grained or atomistic), elastic network models (mainly ANM and NMA) emerge as promising methods to compute protein flexibility, allowing better interface adjustments and overall results. Rotamer libraries are still a good approach for minor, but important, refinements. However, the field remains challenging: improving the accuracy/computational cost ratio while maintaining robustness in molecular sampling may shed light on new methods in the coming years.
ACKNOWLEDGEMENTS
The authors would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Fundação de Amparo à Pesquisa do Estado de São Paulo and Universidade Federal do ABC for funding.