Conformational Analysis of Rat Seminal Vesicle Secretory Protein 4, an Intrinsically Disordered Protein Having Interesting Pharmacological Properties

Rat seminal vesicle protein 4 (RSV4), 90 residues long, is a member of the seminal vesicle protein family present in rats. Once secreted, it is involved in many reproductive functions, ranging from semen coagulation to sperm capacitation; its fragments have also shown in vitro pharmacological properties, such as anti-inflammatory and pro-coagulant activity, that are relevant to cancer development. However, no three-dimensional model of this protein is yet available, probably because of intrinsic disorder in its structure. In this article we report structural studies of RSV4 in solution by SEC (size exclusion chromatography) and CD (circular dichroism). In solution the monomer is highly flexible and poorly organized, with highly fluctuating helical segments, and, as also suggested by SEC, is classifiable as NU-PMG (natively unfolded pre-molten globule). The lack of a cooperative sigmoidal structural transition upon thermal and GdmCl perturbation of the monomer supports the poor structural organization expected for a NU-PMG conformational model. The structure of the RSV4 monomer was modeled computationally and subjected to molecular dynamics simulations to study its conformational changes and energetic stability.


Introduction
RSV4 is a basic protein secreted from the rat seminal vesicle epithelium (Ostrowski et al., 1979; Pan and Li, 1982; Abrescia et al., 1986; Mansson et al., 1981). Its gene has been expressed in E. coli (Mansson et al., 1979) and exists in two forms that differ in the number of 20-base pair (bp) tandem repeats in an intron (Harris et al., 1983). RSV4 is a substrate for transglutaminase (Folk et al., 1980; Paonessa et al., 1984), a Ca²⁺-activated enzyme present in numerous cell types and biological fluids (Porta et al., 1988). RSV4 is an important member of the SV (seminal vesicle) protein family present in rats, and is responsible for many functions in reproduction, ranging from semen coagulation and sperm hyaluronidase activity to sperm capacitation (Abrescia et al., 1985). Within the SV protein family, RSV4 also has procoagulant, anti-inflammatory, anti-apoptotic and immunosuppressive properties (Galdiero et al., 1989; McDonald et al., 1983; Metafora et al., 1987), which are potentially useful in the treatment of several human diseases, including cancer. Its sequence is 90 residues long (Pan and Li, 1982) and some of its structural properties in solution have been studied (McDonald et al., 1983; Romano-Carratelli et al., 2002). Some studies have demonstrated that this protein shows a monomer-trimer self-association equilibrium that is important for the modulation of its activity (Stiuso et al., 1999; Caporale et al., 2004). In fact, the trimeric form is responsible for its immunomodulatory activity, whereas the monomeric form is responsible for its anti-inflammatory and pro-coagulant activity (Romano-Carratelli et al., 2002).
Many studies on RSV4 (Ferranti et al., 1997; Metafora et al., 2001; Vilasi and Ragone, 2008; Abrescia et al., 1986; Porta et al., 1990; Di Micco et al., 2000; Stiuso et al., 1999; Porta et al., 1994; Metafora et al., 1987; Suskiewicz et al., 2011) resulted from many years of collaboration among several groups working in Naples (Italy). The protein has so far resisted attempts to determine its three-dimensional structure: crystallization has been attempted many times without success. In addition, two sequence-based crystallizability evaluators, SECRET (Smialowski et al., 2006) and CRYSTALP2 (Kurgan et al., 2009), gave a negative result with a high probability (0.925). The most likely explanation comes from a study by Vilasi and Ragone (2008), who predicted, using numerous predictors, the presence of low-complexity segments in the monomer. This observation led them to doubt that a three-dimensional model could be obtained, owing to the extensive structural disorder. Their reasoning was also supported by the peculiar amino acid composition of the protein, quite similar to that of disordered proteins (Vilasi and Ragone, 2008; Tompa, 2002), and by analyses with various secondary structure predictors that suggested the presence of only short organized segments (Vilasi and Ragone, 2008).
It is clear that RSV4 has intrinsically disordered structural segments. However, to attempt to model the protein, it is important to know where the organized residues are located, how many they are, and whether the protein is a cooperatively organized globular structure in solution. Indeed, it is well known that in some disordered proteins a globular core coexists with disordered segments, whereas other proteins are highly disorganized but acquire structure through binding, and still others are unable to do so (Tompa, 2002). These extreme situations are usually inferred from experiments that evaluate function, without any explanation of the physical grounds that control the "being disordered" of these proteins in solution. Given the still limited knowledge of the physical basis regulating the folding, stability and interactions of these proteins, it is our opinion that a structural framework as broad as possible is needed to explain their functional behavior (Vincenzi et al., 2015; Bergantino et al., 2015).
The aim of this work is to gain more structural information on the RSV4 monomer, which has anti-inflammatory activity, by experimental techniques in solution such as SEC and CD, and by computational approaches such as molecular modeling and molecular dynamics (MD) simulations.

Chemicals
All chemicals, including solvents and reagents, were purchased from BDH, Sigma-Aldrich or Carlo Erba (Milan, Italy).

Purification of RSV4
RSV4 was obtained from the seminal vesicle secretion of adult rats (Fisher-Wistar) according to our previous protocols and further purified by ion exchange chromatography on an FPLC system, to obtain protein samples enriched in the monomer with a low time-dependent contamination by dimeric/trimeric forms (Romano-Carratelli et al., 2002). The samples were used immediately for far-UV CD measurements, to obtain far-UV spectra with better resolution than those obtained in the past (Caporale et al., 2004). For this study we obtained approval from the ethics committee at our institution (Second University of Naples). The preparations showed a single band on SDS-PAGE, and their purity was also evaluated as reported in previous papers (Metafora et al., 1987; Abrescia et al., 1986; Porta et al., 1990; Sambrook et al., 1989; Chang et al., 1978). Notably, the native sequence of 111 residues contains a signal peptide of 21 aa, and processing yields a mature form of 90 residues (Pan et al., 1980). The concentration of purified RSV4 samples was measured by molar absorption at 276 nm (Di Micco et al., 2000; Stiuso et al., 1999).

Circular Dichroism
CD measurements were performed with a Jasco J715 spectropolarimeter equipped with a Peltier thermostatic cell holder, in the spectral region between 185 and 250 nm (Chang et al., 1978), over a temperature range from 20 to 80 °C in steps of 5-6 °C. The spectra were analysed using the CDtool software. The thermal transition was reversible upon cooling of the solution. To calculate the secondary structure content we used the Dichroweb software (Whitmore and Wallace, 2004; 2008) and the CAPITO (CD Analysis and Plotting Tool) server, which is specific for intrinsically disordered proteins (Uversky et al., 2002; Wiedemann et al., 2013).

Molecular Modeling and Dynamics
We searched for similar proteins in sequence and structure databases using BLAST. The search showed that no protein has a sequence identity high enough to be used as a template for building a theoretical model by comparative modeling methods. Therefore, we used a fold recognition strategy with the Phyre2 program (Kelley and Sternberg, 2009). This approach can be run in two modes: (i) Normal and (ii) Intensive. The "Intensive" mode performs additional steps for more complex targets, such as disordered proteins or proteins with very low sequence identity to the available protein structures. The energetic quality of the RSV4 model was evaluated by the ProSA score (Wiederstein and Sippl, 2007). To evaluate whether the obtained model showed globularity, we used the globularity score based on a method devised by our group (Costantini et al., 2007). The energetic stability and the possible conformational changes of the best model obtained were analyzed by molecular dynamics (MD) simulations for 20 ns at room temperature with the GROMACS software package (v3.3.1) and the GROMOS43a1 force field (Van Der Spoel et al., 2005). The monomer was placed in a cubic box filled with SPC216 water molecules. The simulations were conducted at neutral pH, where Arg and Lys are positively charged, Asp and Glu negatively charged and His neutral. GROMACS routines (RMSD, RMSF, gyration radius, secondary structure analysis, cluster analysis) were used to study the evolution of the obtained trajectory.

Hydrodynamic Behavior of RSV4
Previous results provided partial and conflicting information about the real structural organization of RSV4 in solution. In fact, this protein has been suggested to be a natively disordered protein (Vilasi and Ragone, 2008). The purpose of this article is to define more precisely the structural behavior of RSV4, in order to establish to which of the protein classes proposed by Uversky (2002) it belongs. SEC shows (Fig. 1; the inset refers to the elution of RSV4) that RSV4 elutes with an apparent molecular weight ($M_r$) of 16580. This value is larger than the theoretical $M_r$ (from the amino acid sequence) of 9758. However, the apparent value is obtained on the assumption that RSV4 is a globular protein, which is unlikely given the bioinformatic predictions of disorder. From the plot of the elution fraction of each marker versus its Stokes radius (not shown), we determined graphically an $R_s$ for RSV4 of 1.73 nm. Using the equation $V_H = \tfrac{4}{3}\pi R_s^3$, we calculated a hydrodynamic volume ($V_H$) for RSV4 of 21.67 nm³. Uversky (2002) showed that the $\log(R_s)$ versus $\log(M)$ dependencies for different conformations of globular proteins can be used to describe eight different reference classes. The value of 1.34 found for $\log(V_H)$ fits well the straight line corresponding to the NU-PMG class, that is, the more compact natively unfolded proteins, which resemble pre-molten globules in their hydrodynamic characteristics. Hence, the data suggest that RSV4 is hydrodynamically similar to a natively unfolded pre-molten globule. The proteins of this group are natively unfolded and are also called extended intrinsically disordered proteins, being extremely flexible and extended, with little secondary structure under physiological conditions (Uversky, 2002).
Fig. 1. SEC of RSV4. The inset shows the elution profile of RSV4. $V_e$ is the elution volume, $V_o$ the initial volume, $V_i$ the total volume and $M_r$ the molecular weight.
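The arithmetic behind the hydrodynamic volume can be checked with a short script. This is only a sketch: it takes the Stokes radius and the two molecular weights quoted in the text, and it assumes that $V_H$ is the volume of a sphere of radius $R_s$ and that the logarithm is base 10. With $R_s$ in nm the volume comes out in nm³; the small difference from the reported 21.67 presumably reflects rounding of $R_s$.

```python
import math

def hydrodynamic_volume(stokes_radius_nm: float) -> float:
    """Volume (nm^3) of a sphere whose radius is the Stokes radius (nm)."""
    return (4.0 / 3.0) * math.pi * stokes_radius_nm ** 3

# Values quoted in the text
R_S_NM = 1.73        # Stokes radius of RSV4 from the SEC calibration
M_APPARENT = 16580   # apparent M_r from SEC (globular calibration)
M_SEQUENCE = 9758    # theoretical M_r from the amino acid sequence

v_h = hydrodynamic_volume(R_S_NM)
log_v_h = math.log10(v_h)          # assumption: base-10 logarithm
ratio = M_APPARENT / M_SEQUENCE    # how much larger RSV4 looks than expected

print(f"V_H = {v_h:.2f} nm^3")
print(f"log10(V_H) = {log_v_h:.2f}")
print(f"apparent / theoretical M_r = {ratio:.2f}")
```

The ratio of about 1.7 between apparent and theoretical $M_r$ summarizes in one number why a compact globular fold is unlikely.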

Behavior in Solution: Effect of Temperature on far UV CD
The structural properties of RSV4 were studied using CD spectroscopy (Fig. 2). The spectrum at 25 °C shows a deep minimum at 200 nm and a negative shoulder at 220 to 230 nm. The typical signatures of secondary structures are missing, indicating mainly fluctuating disordered structure or random coil. A maximum of low intensity should be located between 180 and 185 nm. This is a characteristic pattern for disordered proteins. The DichroWeb server (Whitmore and Wallace, 2004; 2008) indicated a high content of disordered structure, equal to more than 60% (helix 5%, strand 31%, turns 17%). In addition, the CAPITO web server (Wiedemann et al., 2013) placed RSV4 in the region specific to "unfolded proteins" on the basis of its ellipticity value (Fig. 3). Moreover, the signal at 222 nm showed a linear increase of ellipticity from 20 °C to 80 °C (Fig. 2). A possible explanation is that the increase in temperature increases the hydrophobicity of residues, favoring transient structural organization that might result from the formation of fluctuating polyproline II helices, also considering that the increase of ellipticity is centered around 225 nm (Bhatnagar and Gough, 1996). This was also supported by DichroWeb (Whitmore and Wallace, 2004; 2008), which showed a decrease in the fraction of disordered structure as the temperature increased. What appears evident, however, is a non-sigmoidal transition that proceeds monotonically, supporting a low, or absent, cooperativity. This indicates a non-globular structure, in agreement with the NU-PMG conformational model from SEC.

Behavior in Solution: Effect of Guanidinium Hydrochloride
To probe the whole structure and assess the presence of globular organization, which should be missing or strongly reduced in a largely disordered protein, we decided to perturb its structure with a strong denaturant.
Under perturbation, the presence of clusters/domains of organized structure should be expressed through cooperative sigmoidal transition(s), a characteristic of globularity. Increasing concentrations of denaturant produce a continuous change in the shape of the spectrum, with a sharp decrease in its intensity (Fig. 4). This means that the protein does lose some structure, whether organized or not. GdmCl is a strong perturbing agent of protein structure. In small globular proteins its effect often appears as a single steep sigmoidal (cooperative) two-state transition, where the nucleation center is unique; it can also reveal organized clusters, structural domains or multiple nucleation centers through distinct (multi-state) transitions (Colonna et al., 1982), always sigmoidal, that add up according to the structural stability of each. RSV4 is a small protein, so if it were globular the denaturant should generate a very steep and cooperative transition, reasonably assumed to be two-state. An answer can be obtained from the type of transition observed when denaturation is followed at 222 nm, a wavelength sensitive to the presence of helix (as suggested by the structural predictions). The inset in Fig. 4 shows a non-cooperative monotonic transition, typically seen in native pre-molten globules because of their low content of residual structure (Uversky, 2009). In conclusion, all data suggest that RSV4 is an intrinsically disordered protein with a conformation in solution similar to that of a NU-PMG but with some residual structure, reasonably alpha-helix.
Fig. 4. Far-UV CD spectra of RSV4 at increasing GdmCl concentrations (... and 6 M). The inset shows the effect of GdmCl followed at 222 nm.
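For comparison with the monotonic behaviour described above, the sigmoidal curve expected for a cooperative two-state protein can be sketched with the standard linear extrapolation model, in which the unfolding free energy decreases linearly with denaturant concentration. This model is not taken from the article, and the parameter values below are hypothetical, chosen only to place the midpoint at 3 M GdmCl.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def fraction_unfolded(denaturant_m, dg_water, m_value, temp_k=298.15):
    """Two-state unfolded fraction with dG = dg_water - m_value * [denaturant]."""
    dg = dg_water - m_value * denaturant_m        # unfolding free energy, kcal/mol
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))  # equilibrium constant U/N
    return k_unfold / (1.0 + k_unfold)

# Hypothetical parameters for a small cooperative globular protein
DG_WATER = 6.0  # kcal/mol, stability in the absence of denaturant
M_VALUE = 2.0   # kcal/mol/M, so the midpoint is DG_WATER / M_VALUE = 3 M

for conc in range(0, 7):
    f_u = fraction_unfolded(conc, DG_WATER, M_VALUE)
    print(f"{conc} M GdmCl: fraction unfolded = {f_u:.3f}")
```

In this model nearly all of the change is confined to a narrow window around the midpoint, which is the signature that the 222 nm trace of RSV4 lacks.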

Modelling of RSV4 Monomer
We modeled the RSV4 monomer (NCBI Reference Sequence: NP_036794) with the Phyre2 server (Kelley and Sternberg, 2009), which found four templates able to cover the whole query sequence of RSV4. They were: (i) 1EJ5, (ii) 3R45, (iii) 4DNC, and (iv) 1NAU, which shared 38%, 23%, 17% and 57% sequence identity with different regions of RSV4, respectively. In detail, these templates collectively covered the whole query sequence, and the percentage covered by each template was 34%, 38%, 24% and 14%, respectively (Table 1). On the basis of the multiple alignment between RSV4 and the four templates, Phyre2 (Kelley and Sternberg, 2009) created the complete model of RSV4 using the Poing mechanism, which implements a hybrid approach combining fold recognition and ab initio methodology. The resulting model is composed of two alpha helices and long unordered regions (Fig. 5A). It shows an overall ProSA Z-score of -4.02, indicating a model of good quality if we take into account the degree of disorder of this protein. In detail, the percentage of disordered regions obtained by Dichroweb, 60%, is very close to that of our model, i.e. 68%. We also evaluated the globularity score, using a strategy developed previously (Costantini et al., 2007). The score was 8.5, confirming that RSV4 is not a globular protein and hence is disordered.
Table 1 reports the PDB codes of the four templates, information about the templates, the number of RSV4 residues aligned with each template together with the specific residues involved, the percentage of the RSV4 sequence covered by each template and the percentage of sequence identity between RSV4 and each template.
Fig. 5. RSV4 model before (A) and after (B) molecular dynamics. The α-helices, β-strands and turns are shown in blue, red and green, respectively, whereas the rest of the protein chain is in cyan.

Molecular Dynamics of Monomer Model
During the MD simulations the model (Fig. 5B) reached a stable equilibrated state after 4 ns, with a final RMSD value lower than 0.8 nm, computed over all atoms by superposing the various structures (Fig. 6A).
The plot of the gyration radius shows that the monomer becomes more compact during the simulation (Fig. 6B). We also counted the H-bonds: there were 35 before and 68 after the dynamics, again suggesting a more compact structure for the final model. However, during the whole simulation the architecture of the structure remains stable, the irregular regions fluctuate more than the two helices (Fig. 6C) and a short β-sheet is formed (Fig. 5B).
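The gyration radius used above as a measure of compactness is the mass-weighted root-mean-square distance of the atoms from their centre of mass. The following is a minimal sketch of that definition, not the GROMACS routine used in the study; the coordinates and masses are toy values invented for illustration.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of (x, y, z) points."""
    total_mass = sum(masses)
    # Centre of mass, one component at a time
    com = [
        sum(m * xyz[axis] for xyz, m in zip(coords, masses)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted mean squared distance from the centre of mass
    weighted_sq = sum(
        m * sum((xyz[axis] - com[axis]) ** 2 for axis in range(3))
        for xyz, m in zip(coords, masses)
    )
    return math.sqrt(weighted_sq / total_mass)

# Toy "extended" and "compact" conformations (hypothetical, in nm)
extended = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
compact = [(0.5 * x, 0.5 * y, 0.5 * z) for x, y, z in extended]
masses = [12.0, 12.0, 12.0, 12.0]

rg_extended = radius_of_gyration(extended, masses)
rg_compact = radius_of_gyration(compact, masses)
print(rg_extended, rg_compact)
```

A decrease of this quantity along the trajectory is what is read as compaction of the monomer.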
To better understand the collective fluctuations of RSV4 during the MD simulation, we performed a cluster analysis as reported in our recent papers (Vincenzi et al., 2015; Bergantino et al., 2015). It was based on the RMSD values between the different conformations, grouping the structures that share similar conformational features during the MD simulation. A total of 23 clusters were obtained (Fig. 6D), seven of which were the most populated (Fig. 7). Hence, RSV4 is a typical example of a conformational ensemble in which many residues lie outside regular secondary structure elements (helices and β-strands). Indeed, it does not explore the full conformational space, as shown by the increase in compactness, but its conformers are flexible and move continuously from one cluster to another.
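The idea of grouping conformations by pairwise RMSD can be illustrated with a small greedy scheme: the frame with the most neighbours within a cutoff becomes a cluster centre, its neighbours are removed, and the procedure repeats. The text does not state which clustering algorithm or cutoff was used, so both the neighbour-count scheme (similar in spirit to the "gromos" method offered by the GROMACS clustering tool) and the toy RMSD matrix below are assumptions made for illustration.

```python
def cluster_by_rmsd(rmsd, cutoff):
    """Greedy neighbour-count clustering on a symmetric RMSD matrix.

    Returns a list of (centre_index, member_indices) tuples,
    largest neighbourhood first at each step.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        best_centre, best_members = None, []
        for i in sorted(remaining):
            # Frames still unassigned that lie within the cutoff of frame i
            members = [j for j in sorted(remaining) if rmsd[i][j] <= cutoff]
            if len(members) > len(best_members):
                best_centre, best_members = i, members
        clusters.append((best_centre, best_members))
        remaining -= set(best_members)
    return clusters

# Hypothetical pairwise RMSD values (nm) for five frames
toy_rmsd = [
    [0.00, 0.10, 0.15, 0.60, 0.70],
    [0.10, 0.00, 0.10, 0.65, 0.70],
    [0.15, 0.10, 0.00, 0.60, 0.65],
    [0.60, 0.65, 0.60, 0.00, 0.10],
    [0.70, 0.70, 0.65, 0.10, 0.00],
]

clusters = cluster_by_rmsd(toy_rmsd, cutoff=0.2)
for centre, members in clusters:
    print(f"centre frame {centre}: members {members}")
```

With this toy matrix and a 0.2 nm cutoff, frames 0 to 2 form one cluster and frames 3 and 4 another; the number and population of clusters depend strongly on the chosen cutoff.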

Conclusions
The behavior of rat RSV4 in solution has already been studied (Stiuso et al., 1999; Caporale et al., 2004), but no information has been available on its three-dimensional architecture. Considering its peculiar biological function in rat fertilization, as well as its interesting pharmacological properties for humans, knowledge of the RSV4 structure is critical because it can improve the understanding of its biological function and be useful for drug design. Moreover, since RSV4 is an intrinsically disordered protein, it is also important to understand its structural details in order to gain insight into its molecular mechanisms of action, which are still poorly understood. Our studies on RSV4 showed that i) in solution the protein is highly flexible and poorly organized, with highly fluctuating helical segments, ii) its poor structural organization is supported by monotonic transitions lacking cooperativity when the structure is perturbed by heat or GdmCl, iii) SEC and far-UV CD experiments indicate that the protein reasonably belongs to the conformational class NU-PMG [39], and iv) its model presents few regular secondary structure elements (α-helix and β-strand) and long irregular segments. Hence, the 3D model of RSV4 represents an interesting template for studying the biological properties of disordered proteins and a useful tool for exploring its pharmacological potential against human diseases.

Fig. 7. Superimposition of the clusters obtained for RSV4 during the molecular dynamics. The α-helices and β-strands are shown in red and yellow, respectively, whereas the rest of the protein chain is in green.

Table 1. Details related to the four templates used in the modeling procedure.