Molecular docking and analysis of interactions between vascular endothelial growth factor ( VEGF ) and SPARC protein By :

The extracellular module of SPARC/osteonectin binds to vascular endothelial growth factor (VEGF) and inhibits VEGF-stimulated proliferation of endothelial cells. In an attempt to identify the binding site for SPARC on VEGF, we hypothesized that this binding site could overlap at least partially the binding site of VEGF receptor 1 (VEGFR-1), as SPARC acts by preventing VEGF-induced phosphorylation of VEGFR-1. To this end, a docking simulation was carried out using a predictive docking tool to obtain modeled structures of the VEGF–SPARC complex. The predicted structure of VEGF–SPARC complex indicates that the extracellular domain of SPARC interacts with the VEGFR-1 binding site of VEGF, and is consistent with known biochemical data. Following molecular dynamics refinement, side-chain interactions at the protein interface were identified that were predicted to contribute substantially to the free energy of binding. These provide a detailed prediction of key amino acid side-chain interactions at the protein– protein interface. To validate the model further, the identified interactions will be used for designing mutagenesis studies to investigate their effect on binding activity. This model of the VEGF–SPARC complex should provide a basis for future studies aimed at identifying inhibitors of VEGF-induced angiogenesis.


Introduction
Angiogenesis, the process of new capillary blood vessel growth from pre-existing vasculature, is a necessary physiological process in growth and development as well as in wound healing. Under normal circumstances angiogenesis occurs as a highly ordered series of events, spreading the vascular network only to the extent required by the demands of growing tissues [1]. However, in spite of the tight regulation, angiogenesis can occur not only in normal development and physiological processes but also in pathological processes such as tumor growth and metastasis [2]. The role of angiogenesis in tumor progression has been well established as a critical step in the transition of tumors from a dormant state to a malignant state through the supply of oxygen and nutrients to the proliferating cells [3], [4]. Many positively and negatively acting factors influence angiogenesis, which is a complex multi-component process involving the coordinated action of many growth factors and their receptors, cytokines, proteases, extracellular matrix proteins and adhesion molecules [5], [6].
Abnormal blood vessel growth in the eye is a leading cause for catastrophic loss of vision and extensive research has been done to understand the basic, underlying mechanisms of ocular angiogenesis [7], [8]. Vascular endothelial growth factor (VEGF)-related family of angiogenic factors are believed to play a central role in ocular angiogenesis as they regulate a wide variety of endothelial cell functions including cell survival, migration, differentiation and vascular permeability. Tumor cells typically induce new blood vessel growth by up-regulating growth factors such as VEGF, thereby overwhelming the effect of natural angiogenesis inhibitors [9], [10], [11]. The abnormal levels of VEGF causes proliferation of vessels into the retina leading to loss of vision [12], [13], [14]. Conventional treatment methods have met with limited success and inhibition of growth factor-induced new vessel formation and targeting of pathological vessels seem to be the best choice for treating angiogenesis-related ocular diseases [15], [16].

VEGF and SPARC in angiogenesis
VEGF, which is upregulated in many different tumor types, is the most well characterized angiogenic and vascular permeability enhancing factor. It is a critical regulator of both physiological and tumor angiogenesis [17], [18], and exerts its function by interacting with two high affinity tyrosine kinase receptors; fms-like tyrosine kinase (Flt-1/VEGFR-1) and the kinase insert domain-containing receptor (KDR/VEGFR-2) (Fig. 1). Although these receptors are homologous and share architectural similarity they are significantly different in their functional properties [19], [20], [21], [22]. As a result, the action of VEGF and the function it mediates depends on the site of its expression, the type of receptors present and the signals they initiate [23]. Normally a physiological angiogenic balance is maintained, with the endogenous inhibitors of angiogenesis countering the effect of proangiogenic molecules. The interactions and delicate balance between the different angiogenic and angiostatic factors are critical for regulating and treating angiogenesis [24], [25].
The glycoprotein SPARC (Secreted Protein Acidic and Rich in Cysteine) has been found to act as an angiogenesis inhibitor by regulating the activities of growth factors like VEGF and plateletderived growth factor (PDGF) [26], [27], [28], [29]. SPARC is a Ca 2+ -binding, multifunctional glycoprotein belonging to the class of matrix-associated factors that mediate cell-matrix interactions. The primary structure of SPARC contains four unique domains with distinct functional activities; it binds Ca 2+ at both the amino-and carboxyl-terminal domains. It has been shown to be involved in the regulation of important physiological processes through the modulation of cell-matrix and cell-growth factor interactions [30], [31]. Specific peptide regions have been identified in the protein that inhibit cellular proliferation and regulate the migration of endothelial cells. Recent studies have shown that SPARC binds to VEGF and inhibits VEGF-stimulated proliferation of endothelial cells [32]. It is our hypothesis that this binding site could overlap at least partially with the binding site of VEGFR-1, as SPARC acts by preventing VEGF-induced phosphorylation of VEGFR-1. In order to understand the basis for the multifunctional role of SPARC [33], [34], [35], further knowledge regarding its interaction with growth factors, especially VEGF, is important. Fig. 1. Schematic representation of VEGF receptors and ligands. The Flt-1 gene encodes both the full length receptor (Flt-1/VEGFR-1) and a soluble form (sFlt-1/sVEGFR-1). sFlt-1 is an endogenous inhibitor of VEGF, as it tightly binds VEGF with the same affinity as the full length receptor suppressing its angiogenic activity. The angiogenic activity of VEGF is due to its binding to VEGFR-1 and VEGFR-2.
Despite the available individual structural information regarding VEGF and SPARC, the atomic details of their complex structure and the exact structural domains that mediate their interactions and subsequent biological effects still remain unclear. In the present study, molecular modeling and protein-protein docking simulations have been carried out towards the twin goals of predicting the structure of the complex and identifying specific residues involved in the interaction. The resulting model will enable a better understanding of the role of this pair of proteins in angiogenesis, and ultimately assist in the design of molecules that can inhibit the effect of VEGF and thereby prevent the abnormal growth of new blood vessels.

Predictive docking
The growing number of individual structures in the crystallographic databases and the relatively small number of solved complexes has made predictive docking an important theoretical method [36]. Protein-protein complexes are especially important since these reactions are biologically abundant in nature and it would be desirable to understand them in detail. It has been estimated that 70% of proteins function through multiprotein complexes in yeast, although the number of interactions in humans is difficult to estimate [37]. Because of the inherent difficulty in obtaining crystal structures of protein complexes, computational docking techniques are proving to be an essential tool for predicting the structures of such protein complexes. Docking strategies usually rely on a two-stage approach: first, to generate a set of possible orientations of the two docked proteins and then score them in a way that the native complex will be ranked highly. In this study, the predictive docking tool FTDock was used to perform the docking simulations and predict the correct binding geometry (Fig. 2). FTDock and RPScore are available as part of the 3D-Dock suite from the Biomolecular modeling laboratory [38], [39]. In this study, docking is used in concert with experimental data including site-directed mutagenesis to identify the correct complex structure. The extensive rigid-body docking and the use of structural and biochemical data to filter the results, is expected to produce a reasonable model of the complex. The final complex structure is then studied to analyze the intermolecular contacts and to identify specific residue level interactions between the proteins. The results will be used to direct further mutagenesis studies to investigate their effects on binding activity. The goal of our work is to create a detailed model of the complex and offer a structural interpretation for the available experimental data.

Fig. 2.
Flow diagram outlining the protein-protein docking procedure.

Protein-protein docking
There are several programs available for protein-protein docking that attempt to predict the structure of docked complexes when the coordinates of the components are known [36]. Most of the algorithms are based on rigid-body docking methods, in which the larger protein is kept fixed and the smaller protein is rotated and translated to find the best geometric fit. In this study, FTDock 2.0 was selected for performing the docking simulations as it uses Fourier transform to rapidly evaluate the shape complementarities and also it has various post-docking processing methods to score the resultant complexes, including scoring based on electrostatics and experimental data. The program has good predictive ability, as the root mean standard distance of the obtained complexes can be within 1.0 Å from known crystal structures for cases where biological information is available [40]. The primary advantage over other programs is that FTDock allows filtering of thousands of docked complexes to a manageable extent based on available biochemical information.
The individual starting structures for the docking were obtained from the PDB database [41]: the structure of VEGF bound to the second domain of Flt-1 receptor (PDB code: 1FLT) and the structure of SPARC (PDB code: 1BMO), both with resolutions of 3.0 Å or better obtained by Xray diffraction. Starting from the known crystal structures, the end groups and disordered residues on both the proteins were first fixed using the Biopolymer module in Sybyl 7.0 [42]. The subsequent structure of SPARC was docked on to VEGF after removing the Flt-1 receptors from the starting structure. Before docking, the coordinates of proteins were preprocessed to remove hydrogen atoms and alternative atom records. Each molecule was digitized onto a 274 × 274 × 274 grid with a 0.7 Å grid unit and the surface thickness was set to 1.3 Å. During the docking simulation the larger of the two molecules, SPARC was held static and VEGF was translated and rotated with respect to SPARC to explore all possible orientations. The rotational degrees of freedom were explored by rotating the VEGF molecule in 12° angle step size and three rotations at each step were retained. The shape complementarity of the two molecules was evaluated from the overlap of the two grid functions. The docking run, which results in 10,000 docked complexes, was performed with the inclusion of electrostatic scoring for excluding false positive complexes. All the docking simulations presented in this study were performed on an IBM-p655 cluster comprised of 32 processors. However, the FTDock program available from the developers is not parallelized for multiple processors and therefore the capability of the resources used did not reflect on the computational time taken for the simulation.

Rescoring of complexes
The subsequent step to docking was to reduce the number of complexes in the list that need to be considered in order to select the complex that will be closest to the true structure. The complexes obtained from the initial docking run were ranked based on the surface complementarity score for each of the complex structure. The complexes were rescored using RPScore, which was available as part of the 3D-Dock suite. The output from the docking run was sorted and ranked based on this alternative ranking, using the residue-residue pair potential scores for the complex structures. This scoring scheme based on evidence from actual protein interfaces is defined as the log fraction of the actual frequency and the expected frequency of occurrence, and the value of the score for each pair is a measure of the likelihood that the particular pair occurs and thus helps to quantify the probability of a complex's existence [39].

Filtering using experimental data
Biochemical data was used to further filter the docking results, using the residues on the surface of the interacting proteins as candidates. The results obtained from RPScore were filtered based on the experimental information that SPARC and the peptide 4.2 from the extra cellular region of SPARC bind to VEGF and inhibit VEGF-induced proliferation of endothelial cells. Receptor activation studies have also shown that SPARC could prevent VEGF-induced phosphorylation of VEGFR-1, by selectively blocking the activity of this receptor [32]. These data suggest that the peptide 4.2 (aa 254-274) [29], [30] corresponding to the C-terminal region of SPARC should be in proximity to the receptor binding face of VEGF, which is primarily composed of acidic residues (aa 63-67) and some basic residues (aa 82-86) which mediate interaction with VEGFR-1 [43]. This biological information was used as the main constraint for filtering and selecting the docked complex. Additionally, it is known that residues from both the VEGF dimers are involved in receptor binding interactions with VEGFR-1 and in order to bind competitively to VEGF and inhibit the phosphorylation of VEGFR-1, the interacting surface between the two proteins should be sufficiently large and involve all the key residues. This was not considered as a necessary criterion, but used only to rank complexes that matched both the criteria. The top ranked 100 complexes obtained from this filter were visually analyzed and the residues implicated by experimental results as having an influence on the interaction were used to guide the manual selection of final docked complex. A model structure for the docked complex was selected that was most compatible with the available experimental information.

Molecular dynamics simulations
The best structural model for the ternary complex of VEGF-SPARC obtained from the docking procedure was subjected to MD simulation to refine the protein interface. However, no explicit constraint functions were used to maintain the initial docking contacts during the simulation. The structures were first energy minimized using 1000 steps of steepest descent and 2000 steps of conjugate gradient minimization using the Kollman all-atom force field implemented in SYBYL [44]. A distance dependent dielectric function was used with the dielectric constant set to 1 and the nonbonded cutoff was set to 8 Å. Energy minimization with classical force field can be used to remove unrealistically close steric clashes and large deviations from ideal geometry resulting from the conformational changes of amino acid side chains after docking, but molecular dynamics simulation is required to improve rotamer distributions. This energy minimized structure was used as the starting structure for the MD simulation. All MD simulations were performed with the AMBER 7.0 molecular simulation package [45]. A 3 ns simulation was performed under constant pressure conditions using the parm99 force field to describe the interactions between the protein atoms [46]. As in the minimization, the dielectric constant was set to 1.0 was used and an 8 Å residue charge group based cutoff for nonbonded interactions was used, with the nonbonded pairlist updated every 15 steps. This cutoff was chosen to reduce the computational time involved due to the large size of the system. The bonds involving hydrogen atoms were constrained using SHAKE algorithm [47]. The simulation used a 2 fs time step for a total simulation time of 3 ns. The final conformation obtained at the end of the MD simulation was used for identifying specific interactions at the interface, computing inter-residue distances and other calculations.

Results and discussion
To be plausible, the complex structure of VEGF-SPARC derived from protein-protein docking simulation must be consistent with the available experimental information and also should reflect the nature of protein-protein interactions in general [48], [49]. Predicting protein-protein interactions is inherently challenging owing to the difficulty in modeling the many forces that contribute to these interactions [50]. This leaves the burden of excluding false positives from the docking results and ascertaining whether the model obtained is reliable by using accurate scoring and filtering techniques. Apart from using surface complementarity and electrostatic filter, residue pair potentials and biochemical data were also included to score the docking orientations, as it has been shown to produce more accurate results than using geometric fit and electrostatic energy alone [40]. The most favorable solution obtained by this method was then refined through molecular dynamics to get the final docked model (Fig. 3) which was used to analyze the interactions at the protein interface.

Scoring and filtering the docked conformations
In order to upgrade these models to reliable predictions, which could be used with confidence for further experimental and computational work, refinement using biological data is done. Any docking program attempting to find a complex structure for two given molecules based on surface complementarity and geometric fitting would invariably return several docking poses between the two molecules [51]. However, a large percentage of the docking orientations would be biologically irrelevant and can be easily eliminated through simple visual inspection, if some experimental information regarding their interaction is known. In this study, the presence of specific experimental information regarding the binding of SPARC to VEGF proved very useful as a means of filtering the docking results. To narrow down the list of possible orientations to include only biologically relevant structures, we paid attention to the experimental information that the binding of SPARC to VEGF is mediated specifically through the peptide 4.2 region of SPARC [32], [52], [53]. Additionally, we also took into account the hypothesis that the binding site of SPARC on VEGF should overlap, at least partially, the binding site of VEGFR-1. A distant constraint was therefore applied to filter the structures to include only the orientations where the peptide 4.2, comprising of amino acid residues 255-274, was within 4 Å distance from the VEGFR-1 binding site. Only the complex structures that correlated well with this experimental information were retained and the top 100 orientations were analyzed and compared with one another to select the final docked conformation. Apart from this biochemical data, the selection of the final model for the docked complex was also based on factors like the area of surface contact, extent of interactions present and stability of the model. This led to the model structure of the VEGF-SPARC complex (Fig. 3) in which there are several intermolecular hydrogen bonds between the two molecules, apart from hydrophobic and other long-range electrostatic interactions.

Identification of side-chain interactions
Long-range non-specific electrostatic interactions as well as short-range electrostatics like hydrogen bonds and salt bridges are crucial for complex stability [54]. To determine the mode of action, information about which amino acid residues come into contact when the complex is formed is important to supplement the data obtained from mutagenesis and other experimental methods. Also as mentioned above, the information on amino acid residues is also important to validate our working hypothesis of an overlapping binding site for SPARC and VEGFR-1, in order for SPARC to inhibit the VEGF-stimulated phosphorylation of VEGFR-1. Results from previous mutagenesis studies have shown specific regions in VEGF responsible for receptor binding interactions with VEGFR-1 and VEGFR-2 [43]. The region composed of predominantly acidic residues (aa 63-67) mediates binding to VEGFR-1 and another region composed of basic residues (aa 82-86) mediates binding to VEGFR-2. These oppositely charged surface regions are present at distal ends of the monomer, but since VEGF molecules exist as disulfide-linked homodimers these two regions are in close proximity in the dimeric form creating a cluster of receptor binding determinants at each end of the VEGF molecule (Fig. 4). Apart from this cluster of residues, the VEGFR-2 binding determinants form two other hot spots: Phe 17′ and Gln 79; and Ile 46, Glu 64′ and Ile 83′, which are present on the same face that is responsible for receptor binding in VEGF [55]. Fig. 4. VEGF receptor binding interactions; red solid representation corresponds to the acidic residues that mediate VEGFR-1 binding (aa 63-67) and blue solid representation corresponds to the basic residues that mediate VEGFR-2 binding (aa 82-86).  In our efforts to identify the key residues that drive the interaction between VEGF and SPARC and stabilize their complex, the side-chain interactions between the amino acids listed in Table  1, Table 2, were identified as crucial for binding activity and these binding hot spots will be used to guide site-directed mutagenesis studies. The side-chain interactions involving these residues (Fig. 5) at the interface of VEGF and SPARC contribute a large fraction of binding free energy, highlighting their importance in stabilizing the protein complex.
The hydrogen bonding interactions involving several key residues in the binding hot spots are consistent with our hypothesis that SPARC binds to VEGF on the same receptor binding face that mediates its interaction with VEGFR-1. The prevalence of several non-polar amino acids is a characteristic trait of docking interfaces [49].

Computational alanine scanning
The role of individual amino acid side chains in stabilizing the complexes was further probed by computational alanine scanning studies, which identifies residues that are important for the stabilization of the complex, by determining the change in the free energy of binding when various residues in the wild type protein was mutated to alanine [56], [57]. The results from the alanine scanning experiments ( Table 3, Table 4) correlated well with the docking simulation results.
The experimental studies carried out previously showed that the peptide 4.2 region of SPARC inhibited VEGF-induced cell proliferation and the model of the VEGF-SPARC complex obtained from our docking studies showed that the residues in this region play a significant role in binding stability. The results from computational alanine scanning confirm that these residues are important for the stability of the complex. Positive values of ΔΔG means that the alanine mutation is predicted to destabilize the complex and negative values indicate a stabilizing effect. Computational mutation of Asp 63, Gln 79, His 86 and Gln 89 in VEGF as well as several residues in the region from 255 to 274 in SPARC, to alanine had unfavorable effects on the stability of the complex. This study indicates that the enthalpic contribution from the desolvation of amino acids, formation of novel H-bonds, van der Waals and electrostatic interactions involving these residues contribute to a favorable free energy of interaction between VEGF and SPARC, and offset the decrease in entropy from the loss of translational and rotational degrees of freedom upon binding.  (1) or not interacting directly but is buried upon binding (0); (column 4) ΔΔG (bind), predicted change in binding free energy upon alanine mutation; (column 5) ΔG (partner), predicted change in protein stability of the mutated partner upon alanine mutation.  (1) or not interacting directly but is buried upon binding (0); (column 4) ΔΔG (bind), predicted change in binding free energy upon alanine mutation; (column 5) ΔG (partner), predicted change in protein stability of the mutated partner upon alanine mutation.
It is also interesting to note that several important hydrogen bonding and hydrophobic interactions observed in our proposed model for VEGF-SPARC complex, not only validate our hypothesis of an overlapping binding with VEGFR-1, but also indicate the possibility of an overlapping binding site with VEGFR-2, as several of the critical residues responsible for VEGFR-2 binding are involved in SPARC recognition as well. The hydrogen bonding interactions involving residues Gln 79 and His 86, and hydrophobic interactions involving residues Phe 17, Arg 82 and Ile 83 which are also crucial for VEGFR-2 binding strongly support the involvement of the same hot spot residues on the VEGF surface, in mediating its interaction with the receptors as well as with molecules such as SPARC. The current model is consistent with known evidence of SPARC inhibition of VEGFR-1 activation [58] and agrees well with the experimental data that the region of residues 255-274 in SPARC possess anti-angiogenic functions through its binding with VEGF [32]. While the present docking model supports a nonselective inhibition of the activity of VEGF receptors, it does not explain the results from a previous study which found that SPARC did not inhibit VEGF-mediated activation of VEGFR-2 [32]. In any case, further studies using site-directed mutagenesis and other experimental methods will be required to verify the binding sites identified by our docking model.

Conclusions
In the current study, we have demonstrated the application of protein-protein docking simulation to build a complex structure of VEGF-SPARC starting from unbound proteins using the program FTDock. This work is based on the premises that, (1) starting from unbound structures, computer docking simulation can be used to build a set of atomic models of complexes, one of which will be close to the native complex structure, and (2) by applying proper filtering and scoring methods, it is possible to select the "correct" structure from the docking results. We have used electrostatics, residue pair potentials and biochemical information to filter and rank the docked models and build a reliable model of the complex structure. After filtering, the final model of the complex was selected that agreed best with the biological data and it was refined using molecular dynamics, to analyze the interactions and identify hot spot residues. These hot spots at the protein-protein interface, which are small regions that are crucial to binding, can be targeted by small molecules to mimic the protein-protein interactions. Thus, by combining biological information with computational docking, we have been able to put forward a model in which SPARC binds to VEGF near its VEGFR-1 binding domain. This model can be used for future experimental and computational studies to draw biological and functional conclusions. Using site-directed mutagenesis, to be carried out based on the interactions identified in this study, it should be possible to confirm the predicted mode of SPARC binding to VEGF. Knowledge of the interactions and important binding residues at the protein interface will enable new strategies to target VEGF and inhibit angiogenesis.