Structural Characterization, Docking and Dynamics Simulations of Canavalia Bonariensis Lectin
- 1. Department of Biochemistry and Molecular Biology, Federal University of Ceará, Brazil
- 2. Department of Fishing Engineering, Federal University of Ceará, Brazil
- 3. Department of Medicine, Federal University of Vale do São Francisco, Brazil
- 4. Department of Biochemistry, Federal University of Pernambuco, Brazil
- 5. Department of Biochemistry, Federal University of Santa Catarina, Brazil
- 6. Institute of Chemical and Geosciences, Federal University of Pelotas, Brazil
Abstract
Lectins are proteins that bind specifically and reversibly to carbohydrates. These proteins, in particular those from plants, are important tools in glycobiochemistry and glycobiology. Canavalia bonariensis Lindl is a species from the Leguminosae family, Papilionoideae subfamily, Phaseoleae tribe and Diocleinae subtribe. It contains a previously purified glucose/mannose lectin, herein termed CaBo. The primary sequence of the lectin was determined by a combination of tandem mass spectrometry and molecular biology, through amplification of the lectin gene from the genomic DNA of the plant. CaBo presented high sequence similarity with lectins from the same subtribe. From its primary structure, it was possible to predict the tridimensional structure of CaBo by homology with the lectin from Canavalia gladiata. The protein was subjected to ligand screening with both monosaccharides and dimannosides by molecular docking, and the stability and binding dynamics of CaBo were assessed by molecular dynamics.
Keywords
• Plant lectin
• Canavalia bonariensis
• Diocleinae
• Molecular modeling
• Molecular dynamics
Citation
da Silva MTL, da Silva Osterne VJ, Simplício Nobre CA, Chaves RP, da Silva IB, et al. (2016) Structural Characterization, Docking and Dynamics Simulations of Canavalia Bonariensis Lectin. J Drug Des Res 3(1): 1023.
INTRODUCTION
In recent years, hundreds of plant lectins have been studied with regard to their molecular structure, binding specificity and recognition of distinct carbohydrates and glycoconjugates. The characteristics of lectins afford specific biological roles and distinct biological properties and also allow them to be used as molecular tools in the investigation of various cell characteristics. These proteins have been applied in scientific fields ranging from medicine to agriculture based on such properties as insecticidal activity [1-3], anti-inflammatory and proinflammatory activity [4-7], and in vivo T cell activation and induction of apoptosis [8]. They also have the potential to be explored as a new tool in cancer research through recognition of specific epitopes exhibited on different cancer cells [9].
Many lectins have been purified from seeds of species of the subtribe Diocleinae. Known as ConA-like lectins, they are proteins with specificity for mannose/glucose. Structurally, these proteins show high similarity in that they are composed of monomers associated in dimeric or tetrameric forms by noncovalent interactions. The average molecular weight is about 25-30 kDa per monomer, and the carbohydrate binding site is conserved, as are the binding sites for divalent metal cations, usually calcium and manganese [10,11]. Although ConA-like lectins present well-preserved primary and tertiary structures, they exhibit variability in their properties and biological functions [12]. The ability of lectins to recognize carbohydrates and glycoconjugates has been conserved over time, and the phylogenetic study of these proteins based on their primary structures allows inference of the evolutionary process in these species.
The lectin from Canavalia bonariensis seeds (referred to as CaBo) was purified by single-step affinity chromatography and showed the same electrophoretic profile as that found for other lectins from the Canavalia genus. CaBo was found to be specific to mannosides, and its physicochemical properties were found to be similar to those of other Diocleinae subtribe lectins [13]. This study aimed to obtain the primary and tridimensional structures of CaBo and to understand its structural features and dynamics.
MATERIALS AND METHODS
Molecular cloning of CaBo gene
Genomic DNA (hereinafter termed gDNA) from Canavalia bonariensis was extracted from young leaves using the cetyltrimethylammonium bromide (CTAB) procedure [14], checked for integrity by electrophoresis in 0.7% agarose gel (Mini Run GE-100 Electrophoresis System) stained with 0.5 µg/mL ethidium bromide, and visualized in a gel imaging instrument (Gel Doc™ EZ Gel Documentation System, Bio-Rad). Genomic DNA was quantified with the NanoVue™ Plus Spectrophotometer (GE Healthcare Life Sciences) from absorbance at 260 nm, with an A260/A280 ratio between 1.8 and 2.2.
For gene amplification, the degenerate primers DG5’γ (forward) and DG3’β (reverse) were designed based on the published sequences of similar lectins available in the UniProt database. PCR was performed in a thermocycler (Axygen Biosciences, USA) programmed for an initial denaturation step (5 min at 94°C), followed by 45 cycles of 30 seconds at 94°C (denaturation), 30 seconds at 50°C (annealing), and 1 min at 72°C (extension). The last cycle was followed by a final incubation of 7 min at 72°C. Amplification reactions were carried out in a final volume of 25 µL containing 600 ng of gDNA template and 1 unit of Taq High Fidelity DNA polymerase (Thermo Scientific). Control samples containing all reaction components except DNA were used to ensure that no self-amplification or DNA contamination occurred. The amplified PCR product was analyzed by agarose gel electrophoresis, cloned into the pGEM®-T Easy Vector (Promega, USA) according to the manufacturer’s specifications, and used to transform Escherichia coli DH5α cells by heat shock. Transformants were selected by blue/white screening, and the plasmid DNA was purified using the AxyPrep Plasmid Miniprep Kit (Axygen Biosciences, USA) according to the manufacturer’s standard protocol.
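As a quick arithmetic check on the cycling program described above, the following minimal sketch sums the programmed hold times. It ignores ramp times between temperatures, which depend on the instrument and are not given in the text; the variable names are illustrative only.

```python
# Hold times of the PCR program as described in the text (minutes).
INITIAL_DENATURATION_MIN = 5.0   # 94 °C
CYCLES = 45
STEPS_PER_CYCLE_MIN = {
    "denaturation_94C": 0.5,
    "annealing_50C": 0.5,
    "extension_72C": 1.0,
}
FINAL_EXTENSION_MIN = 7.0        # 72 °C


def total_hold_time_min():
    """Sum of programmed hold times; temperature ramping is not included."""
    per_cycle = sum(STEPS_PER_CYCLE_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_EXTENSION_MIN


print(total_hold_time_min())  # 5 + 45 * 2 + 7 minutes
```

The programmed holds therefore add up to 102 min; the actual run time would be somewhat longer once ramping is included.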
DNA sequencing
Plasmids were sequenced in an automatic MegaBACE sequencer (GE Healthcare) by the Sanger method, using the T7 promoter (sense) and SP6 promoter (antisense) primers. The reads were analyzed with the Phred-Phrap-Consed package [15]. The resulting contigs were translated into amino acid sequences using a DNA translation tool (http://www.vivo.colostate.edu/molkit/translate/), and these sequences were compared with the data obtained by tandem mass spectrometry to confirm and complete the primary structure of CaBo. Multiple alignments of the CaBo sequence with sequences of other lectins from the Diocleinae subtribe were made using the Clustal Omega tool available online, and the generated file was uploaded to ESPript 3.0 [16].
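The translation step can be sketched in a few lines of Python. This is not the web tool cited above, only an illustration of the same operation using the standard genetic code; the function name `translate` and the short input sequences are hypothetical and are not taken from the CaBo gene.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string in the given forward frame (0, 1 or 2).

    Stop codons are written as '*'; codons with ambiguous bases become 'X'.
    """
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Hypothetical toy contig, translated in all three forward frames.
toy_contig = "ATGGCCGATACT"
for frame in range(3):
    print(frame, translate(toy_contig, frame))
```

In practice each contig would be translated in all six frames (including the reverse complement), and the frame whose product matches the MS/MS peptides would be retained.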
Purification of Canavalia bonariensis lectin
Canavalia bonariensis seeds were ground to a fine powder. Soluble proteins were extracted in 150 mM NaCl containing 5 mM CaCl2 and 5 mM MnCl2 (1:10 w/v) under continuous stirring (4 h, 25°C). Subsequently, the extract was centrifuged at 10,000 × g at 4°C for 20 min, and the supernatant was filtered through filter paper (Whatman™, GE Healthcare, Little Chalfont, UK). The resulting supernatant (crude extract) was applied to a Sephadex® G-50 affinity column (6.5 × 1.8 cm, GE Healthcare, Little Chalfont, UK) previously equilibrated with the extraction solution. Unbound material (P1) was eluted with the same solution, and the lectin (P2) was eluted with 100 mM glycine buffer, pH 2.6. The eluted lectin fraction was pooled, extensively dialyzed against distilled water, and freeze-dried to obtain the pure lectin of Canavalia bonariensis (CaBo).
Molecular mass determination
The average isotopic mass of CaBo was determined by Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) using an Autoflex Speed instrument (Bruker Daltonics, USA). The protein was solubilized in 50% acetonitrile (ACN) and 0.3% trifluoroacetic acid (TFA) to a final protein concentration of 1 pmol. The matrix was α-cyano-4-hydroxycinnamic acid at 10 mg/mL, solubilized in 50% ACN and 0.3% TFA and used at a ratio of 3:1 (matrix/analyte). The instrument was operated at 20 kV with reflector in linear mode, and the protein was analyzed in the range of 10,000-100,000 Da. The spectra obtained were processed with Compass™ 1.3 software using the SNAP algorithm to annotate monoisotopic peaks [17].
Protein digestion and sequencing by mass spectrometry
Protein digestion was carried out as previously described by Shevchenko et al. (2006) [18]. The protein was subjected to SDS-PAGE, and the Coomassie-stained bands were excised from the gel and destained in a solution of 50 mM ammonium bicarbonate in 50% ACN. The bands were then dehydrated in 100% ACN and dried in a SpeedVac (Labconco). The gel pieces were rehydrated with a solution of 50 mM ammonium bicarbonate containing trypsin (Promega) or chymotrypsin (Sigma) (1:50 w/w; enzyme:substrate) and incubated at 37°C overnight. The peptides were then extracted in a solution of 50% ACN with 5% formic acid and concentrated with the Labconco™ CentriVap™ Vacuum Concentrator. The peptides were separated on a BEH300 C18 column (75 µm × 100 mm) (Waters Corp.) using a nanoAcquity™ system and eluted with an acetonitrile gradient (10%-85%) containing 0.1% formic acid. The liquid chromatography apparatus was connected to a nanoelectrospray mass spectrometer source (Synapt HDMS System; Waters Corp.). The mass spectrometer was operated in positive mode, using a source temperature of 80°C and a capillary voltage of 3.5 kV. The LC-MS/MS experiment was carried out with a data-dependent acquisition function that selected doubly or triply charged precursor ions for MS/MS. These ions were fragmented by collision-induced dissociation (CID) using a collision energy ramp that varied according to the charge state of the precursor ion. The data were processed and analyzed with ProteinLynx v 2.4 (Waters) using the peptide mass fingerprint (PMF) and the peptide fragmentation pattern as search parameters. To identify other peptides, the CID spectra were interpreted manually using the Peptide Sequencing tool in MassLynx 4.0 (Waters).
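To make the digestion step concrete, the sketch below performs an in silico tryptic digest using the commonly cited rule that trypsin cleaves C-terminal to Lys or Arg unless the next residue is Pro. This is a simplified textbook rule chosen for illustration (no missed cleavages, no chymotrypsin), and the input sequence is a hypothetical toy peptide rather than CaBo.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    Simplified rule: complete digestion, no missed cleavages.
    """
    seq = sequence.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in ("K", "R") and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Trailing peptide that does not end in K or R.
        peptides.append(seq[start:])
    return peptides


# Hypothetical toy sequence: the R before P is not cleaved.
print(tryptic_digest("MKAARPGKLR"))
```

Predicted peptides of this kind are what a database search compares against the observed precursor masses and fragmentation spectra.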
Phylogenetic analysis of CaBo
Amino acid sequences of lectins from the Diocleinae subtribe available in GenBank (http://www.ncbi.nlm.nih.gov/) were used for the construction of the phylogenetic tree: Canavalia ensiformis (Swiss-Prot accession code: P02866), Canavalia brasiliensis (Swiss-Prot accession code: P55915), Canavalia maritima (PDB accession code: 2P34), Canavalia gladiata (PDB accession code: 1WUV), C. grandiflora (PDB accession code: 4L8Q), C. boliviana (PDB accession code: 4K20), C. lineata (Swiss-Prot accession code: P81460), C. virosa (Swiss-Prot accession code: P81461), Camptosema pedicellatum (Swiss-Prot accession code: J9PBR3), Cratylia floribunda (Swiss-Prot accession code: P81517), Dioclea violacea (Swiss-Prot accession code: I1SB09), Dioclea guianensis (Swiss-Prot accession code: P81637), D. sclerocarpa (Swiss-Prot accession code: B3EWJ2), D. lehmannii, D. rostrata (Swiss-Prot accession code: P58908) and D. virgata (Swiss-Prot accession code: P58907). These were analyzed along with the primary sequence of CaBo obtained in this work by molecular biology and mass spectrometry. The sequences were aligned using Jalview 2.8.1 and then subjected to a model test in MEGA6 [19]. Phylogenetic reconstruction by the maximum likelihood method was also carried out in MEGA6, using the bootstrap test with random addition and 500 replicates to verify the consistency and confidence of the topologies.
Phylogenetic analyses were also performed by Bayesian inference using the BEAUti and BEAST 1.8 [20] software, with two independent runs of 1,000,000 generations, sampling every 100 generations, and with the gamma parameter (G) and proportion of invariable sites (I). The sequence of the Bowringia mildbraedii lectin (BMA), chosen through similarity searches in databases of lectins from the Leguminosae family, was used as the outgroup.
Homology modeling of CaBo tridimensional structure and model validation
The structure of the lectin from Canavalia gladiata (PDB id: 2OVU), solved at 1.5 Å resolution, was used as a template to model the structure of CaBo using MODELLER 9v16 [21]. MODELLER implements homology modeling of protein structure by satisfaction of spatial restraints. For CaBo modeling, all MODELLER default parameters were used. Initially, twenty different models were generated and ranked based on MODELLER’s objective function (molpdf) and Discrete Optimized Protein Energy (DOPE) scores [21]. Several models with the lowest molpdf and DOPE scores were selected and submitted to analysis of stereochemical properties (Ramachandran plots, steric overlaps, Cβ deviation parameters, rotamers, and bond and angle deviations) with PROCHECK [22]. QMEAN [23] and Z-score [24] were determined with the Protein Structure and Model Assessment Tools included in Swiss-Model [25], and side chain environment acceptability was assessed with the Verify3D server [26]. The model that returned the best results in all validation parameters was regarded as the most satisfactory model of CaBo. Molecular drawings of the model were prepared using PyMol (Schrödinger, LLC).
Molecular docking
Various sugars were docked to the modeled structure of CaBo using CLC Drug Discovery Workbench, v. 3.0 (CLC Bio; Boston, MA, USA). This software tool uses a standard precision mode to determine the favorable binding poses and explores flexible ligand conformations while holding the protein as a rigid structure during docking (CLCbio®). The lowest energy conformation for each compound was obtained from the Protein Data Bank ligand library. The location of the CaBo carbohydrate-recognition domain was determined by superposition with CGL (PDB id: 2OVU) and selected as the center of the binding site, with a radius of 13 Å for all compounds. The number of iterations for each ligand was set at 5000. The PLANTSPLP algorithm was used to calculate the docking score [27]. The best ligand poses were selected based on the docking score, hydrogen bonds and hydrophobic interactions. To represent the interactions between CaBo and the best ligands, LigPlot+, v. 1.4.5 [28], was used to generate two-dimensional representations, and PyMol was used to generate the figures.
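The binding-site definition above (a sphere of 13 Å radius around a chosen center) amounts to a simple distance filter. The sketch below illustrates that idea only; it is not how CLC Drug Discovery Workbench is implemented, and the atom labels and coordinates are hypothetical values invented for the example.

```python
import math


def atoms_within_radius(atoms, center, radius=13.0):
    """Return labels of atoms whose distance to `center` is <= `radius` (Å).

    `atoms` is a list of (label, (x, y, z)) tuples with coordinates in Å.
    """
    return [label for label, xyz in atoms if math.dist(xyz, center) <= radius]


# Hypothetical coordinates, for illustration only.
site_center = (10.0, 10.0, 10.0)
toy_atoms = [
    ("Tyr12:OH", (12.0, 11.0, 9.0)),
    ("Asn14:ND2", (14.0, 10.0, 10.0)),
    ("Distal:CA", (40.0, 10.0, 10.0)),
]
print(atoms_within_radius(toy_atoms, site_center))
```

Only atoms inside the sphere would be considered part of the search space for ligand poses.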
Molecular dynamics simulations
Molecular dynamics simulations were performed using the Groningen Machine for Chemical Simulations (GROMACS) package, v. 5.1.2 [29,30], with the GROMOS 54a7 force field [31]. The modeled structure of CaBo was used for the simulation studies. A cubic box was generated by the editconf module of the package, and the protein was solvated with the single-point-charge (SPC) water model using the genbox module. The α-methyl-mannoside (MMA) topology was generated by the ATB server [32,33] and checked manually. Sodium ions were added to neutralize the overall system charge wherever necessary. Afterwards, energy minimization was performed using the steepest descent method, with a maximum force of 10 kJ mol⁻¹ nm⁻¹ as the convergence criterion. The minimization was followed by system equilibration under the NVT and then NPT ensembles, with Parrinello-Rahman isotropic pressure coupling to 1 bar and Nosé-Hoover temperature coupling [34] to 300 K. Long-range electrostatic interactions were calculated by the Particle Mesh Ewald (PME) method [35] with a cutoff of 12 Å, and a cutoff of 15 Å was used to compute long-range van der Waals interactions. The Linear Constraint Solver (LINCS) algorithm [36] was used to constrain the bonds. Simulations were carried out for 2 ns for both the native structure and the CaBo-α-methyl-mannoside complex. The coordinates were saved every 10 ps.
RESULTS AND DISCUSSION
Amino acid sequence of CaBo
To elucidate the lectin's complete amino acid sequence, the protein was digested with trypsin and chymotrypsin. Some peptides were identified through a search of the NCBI and Swiss-Prot databases using search tools based on the peptide mass fingerprint (PMF) and the peptide fragmentation pattern. The other peptides were sequenced by manual interpretation of the spectra. The complete sequence of Canavalia bonariensis lectin was obtained by combining the MS/MS data with the translated gDNA sequence of transformed clones containing the isolated lectin gene. The nucleotide sequence obtained from the forward and reverse reads of the gDNA sample is shown in Figure 1.
Figure 1: Nucleotide sequence of CaBo precursor and translated amino acid sequence. Underlined sequences are removed during the circular permutation.
The resulting gene sequence translates to the partial pre-pro-protein sequence. The mature protein was obtained after the removal of the underlined regions and the circular permutation of the two other fragments, both of which occurred during posttranslational processing of the peptide [37].
The primary structure of the complete lectin was deduced with data from mass spectrometry and from the gene sequence of Canavalia bonariensis lectin (Figure 2).
Figure 2: CaBo primary structure obtained by combination of MS/MS sequencing and gDNA data.
The theoretical masses obtained from the protein sequence and those estimated by visual inspection of SDS-PAGE are in agreement with the masses determined by MALDI-TOF MS (α = 25512 Da; β = 12998 Da; γ = 12534 Da).
The full sequence of CaBo was deposited in UniProt under accession number P58906. The N-terminal sequence obtained by Calvete et al. (1999) [38] was compared with the amino acid sequence obtained in this study, and both exhibited 100% identity up to the 25th amino acid residue. Previous structural studies indicate that CaBo is a mixture of isolectins with modifications in their single chains and fragments [39]. In this study, we determined the primary structure of a single isoform of CaBo.
Comparison of CaBo with other Diocleinae lectins reveals high similarity in primary structure, especially with lectins from other Canavalia species, with up to 91% similarity. This conservation of residues is characteristic of Diocleinae lectins. Throughout the CaBo sequence, changes were seen in a few amino acid residues that are common to most lectins from this genus. These modifications (A121 and A123) suggest that the lectin from Canavalia bonariensis is the most primitive among the Canavalia lectins and that this species is phylogenetically closer to a common ancestor that diverged and originated the lectins from the Canavalia, Dioclea, Cratylia, and Camptosema genera. These results are also shown in the phylogenetic tree built from the amino acid sequences of Diocleinae lectins (Figure 3).
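A theoretical average mass can be computed from a sequence by summing average residue masses and adding one water molecule. The sketch below uses standard average residue masses; the function names are illustrative. It also checks the three reported values against each other under one explicit assumption that the text does not state as a formula: that the α chain corresponds to the β and γ fragments joined by a single peptide bond, so that α ≈ β + γ − H2O.

```python
# Standard average residue masses (Da), i.e. amino acid minus one water.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # Da


def average_mass(sequence):
    """Average mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def joined_mass(fragment_a, fragment_b):
    """Mass of two fragments joined by one peptide bond (loss of one water)."""
    return fragment_a + fragment_b - WATER


# Values reported in the text for the beta and gamma fragments (Da).
print(round(joined_mass(12998.0, 12534.0), 2))
```

Under that assumption the joined fragments come to roughly 25514 Da, within about 2 Da of the reported α-chain mass of 25512 Da.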
Figure 3: Phylogenetic analysis showing the evolutionary relationships among Diocleinae lectins. The numbers on the branches indicate the confidence of the results on a scale from 0 to 100. The lectin of Bowringia mildbraedii (BMA) was used as the outgroup.
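Pairwise comparisons such as those underlying the alignment and tree can be illustrated with a percent-identity calculation over two aligned sequences. This is a minimal sketch with one particular convention (columns containing a gap in either sequence are skipped); alignment tools differ in how they normalize, and identity is a stricter measure than the similarity figure quoted in the text. The input strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, for illustration only.
print(percent_identity("ADTIVAVELD", "ADTIVAVELN"))
```

For the toy pair above, nine of ten compared positions match.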
Molecular modeling of CaBo
The amino acid sequence of CaBo was submitted to homology modeling with the structure of CGL (PDB id: 2OVU) as the template. The root mean square deviation (RMSD) of the mean coordinate positions of residues between CGL and CaBo was 0.416 nm, indicating a reliable prediction. CaBo demonstrated a jellyroll domain similar to that of other legume lectins [40,41]. The monomer of CaBo consists of a partially extended antiparallel β-sheet of six strands and a curved antiparallel β-sheet of seven strands, connected by a third β-sheet consisting of two strands (Figure 4).
Figure 4: Overall structure of CaBo monomer. The lectin is shown in cartoon representation. Spheres represent the calcium (in gray) and manganese (in purple).
Previous results suggest that the biological assembly of CaBo is tetrameric and, similar to other Diocleinae lectins, is probably formed by two canonical dimers that arise from the association of the antiparallel β-sheets of different monomers, generating a continuous sheet of twelve strands [42,43]. Both the metal-binding sites and the carbohydrate recognition domain (CRD) of CaBo are well conserved relative to other ConA-like lectins. The metal binding site is located in the vicinity of the CRD, and the coordination of calcium and manganese each involves four amino acid residues. Manganese is coordinated by Glu8, Asp10, Asp19 and His24, while calcium is coordinated by Asp10, Tyr12, Asn14 and Asp19. Two water molecules also participate in the coordination of each metal, as found in several other lectin structures. These water molecules are responsible for an indirect connection of Ile32/Ser34 to manganese and of Asp208/Arg228 to calcium (Figure 5).
Figure 5: Representation of CaBo metal binding site.
The coordination of metals stabilizes a cis-peptide bond between Ala207 and Asp208, which is important for carbohydrate binding activity [44-46]. The CRD of CaBo demonstrates similarity with other Canavalia lectins, and its details are addressed in the molecular docking section.
Model validation
The reliability of the CaBo model obtained by homology modeling was assessed by several validation parameters. The stereochemical parameters were checked by PROCHECK, which did not indicate any serious problems. The Ramachandran plot showed 100% of the residues in favored and allowed regions. The QMEAN and Z-score of the model obtained by the protein assessment tools were 0.82 and 0.548, respectively, both within the range of a high-quality model. The compatibility of the amino acid sequence with the tridimensional structure was assessed with the Verify3D program. As a result, 100% of the residues had an average score higher than 0.2, which is considered a good value, suggesting that the side chain environment is acceptable. In summary, all validations gave good results, and the predicted model was considered reliable.
Molecular docking
Interactions between CaBo and several sugars were tested by molecular docking (data not shown). The results demonstrated that the lectin binds most strongly to α-methyl-mannoside (score: -44.50) among the tested monosaccharides and to mannosyl-α-1,6-mannose (score: -60.84) among the tested dimannosides. As with other lectins, binding to sugars was mediated by an extensive network of van der Waals, hydrophobic and hydrogen-bond interactions. The α-methyl-D-mannoside molecule complexed in the CRD was stabilized by a network of H-bonds connecting residues Asn14, Leu99, Tyr100, Asp208 and Arg228 to oxygen atoms O3, O4, O5 and O6 of the carbohydrate. Hydrophobic interactions involving residues Tyr12, Gly98, Ala207 and Gly227 also contribute to the binding of the lectin to this monosaccharide (Figure 6).
Figure 6: (A) Carbohydrate-binding site of CaBo interacting with α-methyl-mannoside; blue dashes represent polar bonds. (B) LIGPLOT representation of hydrogen bonds and hydrophobic interactions around the ligand.
Binding to mannosyl-α-1,6-mannose involved H-bonds between residues Tyr12, Asn14, Thr15, Asp16, Leu99, Tyr100 and Asp208 and atoms O3, O5, O6 and O8 of the dimannoside. Hydrophobic interactions of residues Pro13, Gly98, Ala207 and Arg228 also contribute to the binding (Figure 7).
Figure 7: (A) Carbohydrate-binding site of CaBo interacting with mannosyl-α-1,6-mannose; blue dashes represent polar bonds. (B) LIGPLOT representation of hydrogen bonds and hydrophobic interactions around the ligand.
The binding of CaBo to various mannosides suggests that this lectin can interact with glycoprotein N-glycans and that this binding could be the underlying mechanism for several biological activities reported for lectins [45,47,48].
Molecular dynamics
To evaluate the structural stability of the uncoupled lectin and the CaBo-MMA complex during the MD simulations, the RMSD of the protein backbone with respect to the initial structure was calculated. The first 1000 ps were considered an equilibration period, after which the models became stable around 0.2 nm. Compared with uncoupled CaBo, the RMSD of the complex was higher throughout most of the simulation. This could be explained by conformational changes caused by MMA binding that affected the whole structure (Figure 8).
Figure 8: RMSD plots of uncoupled CaBo and the CaBo-α-methyl-mannoside complex.
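The RMSD reported above is the square root of the mean squared displacement of the selected atoms relative to a reference structure. The sketch below shows the calculation for coordinates that are assumed to be already superimposed on the reference; the least-squares fitting step that analysis tools normally perform first is omitted, and the coordinates are hypothetical.

```python
import math


def rmsd(reference, frame):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes `frame` has already been superimposed on `reference`.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(reference, frame))
    return math.sqrt(squared / len(reference))


# Hypothetical backbone coordinates (nm): every atom shifted 0.3 nm along x.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(x + 0.3, y, z) for x, y, z in reference]
print(round(rmsd(reference, shifted), 3))
```

Evaluating this quantity for every saved frame against the starting structure yields a curve of the kind plotted in Figure 8.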
The root mean square fluctuation (RMSF) of the protein backbone was used to identify the flexible regions of the protein and to determine the alterations caused by MMA binding. These data could help in further understanding ligand binding by CaBo. The RMSF plot demonstrates that the structural changes occurred mostly in loops and binding-site regions. These regions showed changes in RMSF values, in particular the loop comprising residues 116-122, which was greatly stabilized in comparison with uncoupled CaBo. Changes in RMSF values were also evident in residues 17-24 and 175-186. These changes are attributed to carbohydrate binding, because of the structural modification necessary for sugar accommodation in the CRD (Figure 9).
Figure 9: RMSF plots of residues in uncoupled CaBo and the CaBo-α-methyl-mannoside complex.
Hydrogen bonding is one of the most important factors in maintaining carbohydrate binding to the lectin. During the MD simulation, the average number of hydrogen bonds was around 4, indicating strong binding and corroborating the docking data and previous data on the binding of Canavalia lectins to mannosides (Figure 10) [48].
Figure 10: Hydrogen bonding variations in the CaBo-α-methyl-mannoside complex during the molecular dynamics simulation.
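The RMSF of an atom is the root mean square deviation of its position from its own time-averaged position over the trajectory. A minimal sketch of that definition follows, again assuming frames that have already been fitted to a common reference; the two-frame trajectory is a hypothetical toy example.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as a list of frames.

    Each frame is a list of (x, y, z) tuples in the same atom order.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = tuple(
            sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)
        )
        # Mean squared deviation from that average position.
        msd = sum(math.dist(frame[i], mean) ** 2 for frame in trajectory) / n_frames
        values.append(math.sqrt(msd))
    return values


# Hypothetical toy trajectory: atom 0 moves, atom 1 stays fixed.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf(toy_trajectory))
```

A residue that is stabilized by ligand binding shows a lower value in the complex than in the uncoupled protein.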
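Counting hydrogen bonds in a trajectory usually relies on a geometric criterion. The text does not state which criterion was used, so the sketch below assumes a donor-acceptor distance of at most 0.35 nm and a hydrogen-donor-acceptor angle of at most 30°, which are commonly used defaults; the function names and coordinates are hypothetical.

```python
import math


def is_hbond(donor, hydrogen, acceptor, max_dist=0.35, max_angle=30.0):
    """Geometric hydrogen-bond test (distances in nm, angle in degrees).

    The angle is measured at the donor, between donor->hydrogen and
    donor->acceptor. Cutoffs are assumed values, not taken from the text.
    """
    if math.dist(donor, acceptor) > max_dist:
        return False
    dh = [h - d for h, d in zip(hydrogen, donor)]
    da = [a - d for a, d in zip(acceptor, donor)]
    norm = math.hypot(*dh) * math.hypot(*da)
    if norm == 0.0:
        return False
    cos_angle = sum(x * y for x, y in zip(dh, da)) / norm
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) <= max_angle


def mean_hbond_count(frames):
    """Average number of hydrogen bonds per frame.

    Each frame is a list of (donor, hydrogen, acceptor) coordinate triplets.
    """
    counts = [sum(is_hbond(*triplet) for triplet in frame) for frame in frames]
    return sum(counts) / len(counts)


# Hypothetical geometry: one linear H-bond and one perpendicular non-bond.
bonded = ((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0))
not_bonded = ((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.3, 0.0))
print(mean_hbond_count([[bonded, bonded], [bonded, not_bonded]]))
```

Averaging the per-frame counts over the trajectory gives a single summary value comparable to the figure of about 4 reported above.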
CONCLUSION
In the present work, we report the primary and tridimensional structures of a lectin extracted from Canavalia bonariensis seeds, obtained using a combination of mass spectrometry, gene sequencing and molecular modeling. CaBo showed high identity with other lectins from the Diocleinae subtribe in both primary and tertiary structures. CaBo also showed high affinity for mannosides in the molecular docking and dynamics analyses. This affinity opens up possibilities for the use of CaBo in several biological tests as well as in glycomics and glycoproteomics applications.