Bayesian Perspective in the Selection of Bean Genotypes

Changes in the relative performance of genotypes across environments call for more in-depth investigation through reliable analyses of adaptability and stability. The present study was conducted to compare the efficiency of different informative priors in the Bayesian version of Eberhart & Russell's method with that of the frequentist method. Fifteen black-bean genotypes were evaluated in the municipalities of Belém do São Francisco and Petrolina (PE, Brazil) in 2011 and 2012 in a randomized-block design with three replicates. Eberhart & Russell's methodology was applied using the GENES software, and the Bayesian procedure was run in the R software through the MCMCregress function of the MCMCpack package. The quality of the Bayesian analysis differed according to the a priori information entered in the model. The Bayesian approach using a priori information from the frequentist analysis was more accurate in estimating adaptability and stability, and the model using this a priori information was the most suitable for obtaining reliable estimates according to the BayesFactor function. Inference using information from previous studies proved imprecise and equivalent to the linear-model methodology. In addition, the input of a priori information proved important because it improves the quality of fit of the model.


Introduction
The common bean (Phaseolus vulgaris L.), a staple food in Brazil, is one of the most important sources of protein in human nutrition, especially for the low-income population (Rocha, Moda-Cirino, Destro, Fonseca Junior, & Prete, 2010). In recent years, Brazil has stood out in the international agricultural scenario as one of the largest producers and consumers of this member of the Fabaceae family.
The genotype × environment interaction is one of the major challenges in the selection and recommendation of superior genotypes, as it changes their relative performance across environments. Studying this interaction therefore makes it possible to identify the ideal genotypes for planting in each environment, which maximizes grain yield potential and reduces production costs.
Of all methodologies available, only those based on Bayesian inference allow the use of a priori information about the parameters of interest in the process of their estimation. In this approach, the parameter is considered a random variable and all uncertainty about it can be represented by a probability distribution. Therefore, under the Bayesian approach, all information is useful and should be taken into account, unlike classical statistical analysis, which uses only information from the observed data and discards subjective information (Gamerman & Migon, 1993).
Many researchers have shown that Bayesian inference in adaptability studies is a robust and efficient statistical procedure that allows for greater accuracy in the selection and recommendation of genotypes (Couto, Nascimento, Amaral Junior, Viana, & Vivas, 2015; Nascimento et al., 2011; Teodoro, Nascimento, Torres, Barroso, & Sagrilo, 2015), making it possible to identify genotypes with high productivity, good adaptability, and low sensitivity to adverse conditions. However, Resende, Silva, and Azevedo (2014) asserted that, depending on the a priori information entered in the model, Bayesian inference can yield results equal to or even inferior to those of the 'classical' (frequentist) approach.
The present study was thus conducted to evaluate the influence of the a priori distribution on the estimates of adaptability and phenotypic stability parameters obtained under the Bayesian approach to Eberhart & Russell's method. For this purpose, we considered informative priors, whose information originated from different sources, and weakly informative priors.

Genetic Material and Experiment Conduction
The data used in this study originated from experiments undertaken in the 2011 and 2012 crop years at the Experimental Stations of the Agronomic Institute of Pernambuco, in the municipalities of Belém do São Francisco and Petrolina (Table 1). This study involved 12 lines developed by EMBRAPA (Brazilian Agricultural Research Corporation) and three black-bean cultivars (BRS Esplendor, IPR Uirapuru, and BRS Campeiro). Trials were implemented in a randomized-block design with three replicates. Each experimental unit consisted of four 4-m rows with 50 × 20 cm spacing. Seeds were sown manually at the rate of three seeds per furrow. In the harvest period, the two center rows were harvested to determine grain yield per hectare.
Fertilization was performed based on the result of the soil analysis of each experimental area. The crop was irrigated by conventional spraying, and the control of weeds and pests was performed according to the need of the crop in each region.

Statistical Analyses
Grain-yield data were subjected to analysis of variance. After the homogeneity of residual variances was checked using Hartley's (1950) maximum F test, a combined analysis of variance was performed in the GENES software (Cruz, 2006), adopting the following model:
$$Y_{ijk} = \mu + R/E_{k(j)} + G_i + E_j + GE_{ij} + \varepsilon_{ijk},$$
where $Y_{ijk}$ is the mean phenotypic value of the plot; $\mu$ is the overall constant; $R/E_{k(j)}$ is the effect of replicate $k$ in environment $j$; $G_i$ is the fixed effect of genotype $i$; $E_j$ is the effect of environment $j$, NID $(0, \sigma_E^2)$; $GE_{ij}$ is the effect of the interaction between genotype $i$ and environment $j$, NID $(0, \sigma_{GE}^2)$; and $\varepsilon_{ijk}$ is the experimental error, NID $(0, \sigma^2)$.
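The homogeneity check described above can be illustrated with a minimal Python sketch. It computes only the maximum F ratio (largest over smallest residual mean square); the function name `hartley_fmax` and the residual mean squares are hypothetical, and the critical value against which the ratio is compared is not shown.

```python
import numpy as np

def hartley_fmax(residual_mean_squares):
    """Hartley's maximum F ratio: largest / smallest residual mean square."""
    rms = np.asarray(residual_mean_squares, dtype=float)
    return rms.max() / rms.min()

# Hypothetical residual mean squares from four individual trials
rms_by_environment = [0.12, 0.20, 0.15, 0.31]
print(hartley_fmax(rms_by_environment))
```

A small ratio suggests that the residual variances are similar enough for the trials to be pooled in a combined analysis.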
When a G×E interaction was detected, the grain-yield data were subjected to analyses of adaptability and stability by the methodology of Eberhart and Russell (1966) and by the Bayesian approach (Couto et al., 2015; Nascimento et al., 2011).
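As an illustration of the frequentist step, the sketch below fits the Eberhart and Russell (1966) regression of genotype means on an environmental index. It is a minimal Python sketch, not the GENES implementation; the function name, the yield matrix, and the values of `rms` and `r` are hypothetical.

```python
import numpy as np

def eberhart_russell(means, rms, r):
    """means: genotypes x environments array of mean yields.
    rms: residual mean square of the combined ANOVA; r: number of replicates."""
    means = np.asarray(means, dtype=float)
    n_env = means.shape[1]
    # Environmental index: environment mean minus grand mean (sums to zero)
    index = means.mean(axis=0) - means.mean()
    beta0 = means.mean(axis=1)                    # genotype mean
    beta1 = means @ index / np.sum(index ** 2)    # response to the index
    fitted = beta0[:, None] + beta1[:, None] * index[None, :]
    # Variance of the regression deviations, computed on genotype means
    sigma2 = np.sum((means - fitted) ** 2, axis=1) / (n_env - 2)
    sigma2_d = sigma2 - rms / r                   # stability parameter
    return index, beta0, beta1, sigma2_d

# Hypothetical mean yields: 3 genotypes in 4 environments
yields = [[1.0, 2.0, 3.0, 4.0],
          [2.0, 2.0, 4.0, 6.0],
          [0.0, 2.0, 2.0, 5.0]]
index, beta0, beta1, sigma2_d = eberhart_russell(yields, rms=0.3, r=3)
print(beta1, sigma2_d)
```

By construction, the regression coefficients average to one across genotypes, which is why values of the coefficient are read relative to one.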
Assuming independence between the parameters of these distributions, the joint a priori distribution for each genotype is given by:
$$P(\beta_{0i}, \beta_{1i}, \sigma_i^2) = P(\beta_{0i})\,P(\beta_{1i})\,P(\sigma_i^2).$$
To draw inferences about the parameters of interest, one must obtain their marginal a posteriori distributions. Denoting the vector of parameters for each genotype $i$ by $\theta_i = (\beta_{0i}, \beta_{1i}, \sigma_i^2)$, with components $\theta_{pi}$, $p = 1, 2, 3$, the marginal a posteriori distribution of parameter $\theta_{pi}$ is obtained by the following integral:
$$P(\theta_{pi} \mid x) = \int P(\theta_i \mid x)\, d\theta_{-p,i},$$
which corresponds to the integral over all parameters of the vector except the $p$-th component.
In the first analysis under the Bayesian approach, Model 1 (M1), estimates of the adaptability and stability parameters obtained in previous studies in the literature were used as a priori information: Bertoldo et al. (2009), Rocha et al. (2010), Oliveira et al. (2011), and Barili et al. (2015) (Table 1). In the second analysis, termed Model 2 (M2), informative a priori distributions were also used, but the information originated from the frequentist analysis by the methodology of Eberhart & Russell (1966).
For the adjustment of both models (M1 and M2), the information was entered through the values assumed for the parameters of the a priori distributions, termed 'hyperparameters'. These values were obtained from the mean and the variance of the sample composed of the parameter estimates obtained in the frequentist analysis, which resulted in the following distributions:
$$\beta_{0i} \sim N\left(\bar{\beta}_{0i}, \mathrm{Var}(\beta_{0i})\right), \quad \beta_{1i} \sim N\left(\bar{\beta}_{1i}, \mathrm{Var}(\beta_{1i})\right), \quad \sigma_i^2 \sim \mathrm{GamaInv}(\alpha_i, \beta_i),$$
where $\bar{\beta}_{0i}$ = mean of the estimates of $\beta_{0i}$; $\bar{\beta}_{1i}$ = mean of the estimates of $\beta_{1i}$; $\mathrm{Var}(\beta_{0i})$ = variance of the $\beta_{0i}$ values; $\mathrm{Var}(\beta_{1i})$ = variance of the $\beta_{1i}$ values; and $\alpha_i$ and $\beta_i$ = values obtained from the corresponding ratios.
The third model, M3, is characterized by the use of weakly informative a priori distributions, i.e., distributions with very large variance. The following distributions were adopted: $\beta_{0i} \sim N(\mu_{0i} = 0, \sigma_{0i}^2 = 1{,}000{,}000)$, $\beta_{1i} \sim N(\mu_{1i} = 0, \sigma_{1i}^2 = 1{,}000{,}000)$, and $\sigma_i^2 \sim \mathrm{GamaInv}(\alpha_i = 0.0001;\ \beta_i = 5{,}000)$.
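One way to turn a sample of earlier estimates into hyperparameters is moment matching, sketched below in Python. This is an assumption for illustration only: the inverse-gamma matching uses the standard relations mean = β/(α − 1) and variance = β²/((α − 1)²(α − 2)), which may differ from the ratios the authors used, and the function names and input values are hypothetical.

```python
import numpy as np

def normal_hyperparameters(estimates):
    """Mean and sample variance of earlier estimates (normal prior)."""
    est = np.asarray(estimates, dtype=float)
    return est.mean(), est.var(ddof=1)

def inverse_gamma_hyperparameters(variance_estimates):
    """Moment matching for an inverse-gamma prior (assumed, see lead-in)."""
    est = np.asarray(variance_estimates, dtype=float)
    m, v = est.mean(), est.var(ddof=1)
    alpha = m ** 2 / v + 2.0      # shape
    beta = m * (alpha - 1.0)      # scale
    return alpha, beta

# Hypothetical earlier estimates for one genotype
print(normal_hyperparameters([0.9, 1.1, 1.3]))
print(inverse_gamma_hyperparameters([1.0, 2.0, 3.0]))
```

Note that software packages may parameterize the inverse-gamma prior differently, so matched values need to be converted to the convention of the function that receives them.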
The comparison between M1, M2, and M3 was based on the Bayes Factor (BF) (Kass & Raftery, 1995). According to Jeffreys (1961), BF can be interpreted as follows: $BF_{ij} < 1$ shows evidence in favor of model $j$; $1 \le BF_{ij} < 3$ shows moderate evidence in favor of model $i$; $3 \le BF_{ij} < 10$ shows substantial evidence in favor of model $i$; $10 \le BF_{ij} < 30$ shows strong evidence in favor of model $i$; $30 \le BF_{ij} < 100$ shows very strong evidence in favor of model $i$; and $BF_{ij} \ge 100$ shows decisive evidence in favor of model $i$.
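The scale above can be written as a small lookup, sketched here in Python. The Bayes Factor is taken as the ratio of the marginal likelihoods of models i and j, computed from their logarithms; the function names and the log marginal likelihoods are hypothetical.

```python
import math

def bayes_factor(log_marglik_i, log_marglik_j):
    """BF_ij from the log marginal likelihoods of models i and j."""
    return math.exp(log_marglik_i - log_marglik_j)

def interpret_bayes_factor(bf):
    """Evidence category for model i on the scale given in the text."""
    if bf < 1:
        return "favors model j"
    if bf < 3:
        return "moderate"
    if bf < 10:
        return "substantial"
    if bf < 30:
        return "strong"
    if bf < 100:
        return "very strong"
    return "decisive"

# Hypothetical log marginal likelihoods for two models
bf = bayes_factor(-120.4, -123.0)
print(bf, interpret_bayes_factor(bf))
```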
In the present study, the methodology was implemented in the R software (R Foundation, 2017), and the sample of the joint a posteriori distribution was obtained with the MCMCregress function of the MCMCpack package (Martin et al., 2011), which uses a Gibbs sampler to obtain a sample of the marginal distribution of interest. The Bayes Factor, in turn, was calculated with the BayesFactor function of the MCMCpack package.
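To show what a Gibbs sampler does for this kind of regression, here is a minimal Python sketch with independent normal priors on the coefficients and an inverse-gamma prior on the residual variance. It is not the MCMCpack code and does not reproduce its prior parameterization; the function name `gibbs_regression`, the direct shape and scale arguments `a` and `b`, and the simulated data are assumptions.

```python
import numpy as np

def gibbs_regression(x, y, prior_mean, prior_var, a, b,
                     n_iter=5000, burn_in=1000, thin=5, seed=0):
    """Draws (beta0, beta1, sigma2) for y = beta0 + beta1 * x + error."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(len(x)), np.asarray(x, dtype=float)])
    y = np.asarray(y, dtype=float)
    n = len(y)
    b0 = np.asarray(prior_mean, dtype=float)
    B0 = np.diag(1.0 / np.asarray(prior_var, dtype=float))  # prior precision
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = 1.0  # arbitrary starting value
    draws = []
    for it in range(burn_in + n_iter):
        # Full conditional of the coefficients: multivariate normal
        V = np.linalg.inv(B0 + XtX / sigma2)
        m = V @ (B0 @ b0 + Xty / sigma2)
        beta = rng.multivariate_normal(m, V)
        # Full conditional of the variance: inverse gamma
        resid = y - X @ beta
        shape = a + n / 2.0
        scale = b + resid @ resid / 2.0
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / scale)
        # Keep every `thin`-th draw after the burn-in period
        if it >= burn_in and (it - burn_in) % thin == 0:
            draws.append([beta[0], beta[1], sigma2])
    return np.array(draws)

# Simulated data standing in for genotype means regressed on an index
data_rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 20)
y = 2.0 + 1.5 * x + 0.1 * data_rng.standard_normal(20)
sample = gibbs_regression(x, y, [0.0, 0.0], [1e6, 1e6], a=2.0, b=0.1)
print(sample.mean(axis=0))
```

Each column of the returned array is a sample from the marginal a posteriori distribution of one parameter, so marginal summaries need no explicit integration.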
With respect to the stability parameter ($\sigma_{di}^2$), the samples of its marginal distribution were obtained indirectly, since this parameter is a function of $\sigma_i^2$. Once a value of $\sigma_i^2$ is drawn in each iteration, the corresponding value of $\sigma_{di}^2$ is obtained by the expression $\sigma_{di}^2 = \sigma_i^2 - \mathrm{RMS}/r$, where RMS = residual mean square provided by the analysis of variance, and $r$ = number of replicates in the experiment.
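The indirect sampling step amounts to shifting every draw of the regression variance by a constant, as in this short Python sketch. The function name and the numerical values are hypothetical.

```python
import numpy as np

def stability_draws(sigma2_draws, rms, r):
    """Transform draws of sigma2_i into draws of sigma2_di = sigma2_i - RMS/r."""
    return np.asarray(sigma2_draws, dtype=float) - rms / r

# Hypothetical posterior draws of sigma2_i, with RMS = 0.3 and r = 3 replicates
print(stability_draws([0.25, 0.40, 0.05], rms=0.3, r=3))
```

Because the shift is constant, transformed draws can be negative when a draw of the regression variance falls below RMS/r.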
The hypotheses of interest were tested by creating credibility intervals for the parameters. The intervals were obtained directly from the a posteriori marginal distribution of the parameters.
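A minimal Python sketch of this step follows. It assumes equal-tailed intervals taken from the quantiles of the posterior draws, and decision rules in which the adaptability coefficient is compared with one and the stability parameter with zero; the function names and class labels are hypothetical.

```python
import numpy as np

def credible_interval(draws, level=0.95):
    """Equal-tailed credibility interval from posterior draws."""
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(np.asarray(draws, dtype=float), [tail, 1.0 - tail])
    return lo, hi

def classify_adaptability(beta1_draws, level=0.95):
    """General adaptability if the interval for beta1 contains 1."""
    lo, hi = credible_interval(beta1_draws, level)
    if lo > 1.0:
        return "specific to favorable environments"
    if hi < 1.0:
        return "specific to unfavorable environments"
    return "general"

def classify_stability(sigma2_d_draws, level=0.95):
    """Low predictability if the whole interval lies above zero."""
    lo, _ = credible_interval(sigma2_d_draws, level)
    return "low predictability" if lo > 0.0 else "high predictability"

# Hypothetical posterior draws
print(classify_adaptability(np.linspace(1.2, 1.6, 101)))
print(classify_stability(np.linspace(-0.2, 0.3, 11)))
```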
Because the Gibbs sampler is an iterative algorithm, its convergence must be verified. In this study, this step was performed by applying the criteria of Heidelberger and Welch (1983), Geweke (1991), and Raftery and Lewis (1992), implemented in the Bayesian Output Analysis (BOA) package of the R software (R Foundation, 2017).
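The idea behind the Geweke criterion is to compare the mean of the early part of a chain with the mean of its late part. The Python sketch below is a simplified version: it uses plain sample variances instead of spectral density estimates, so it assumes nearly uncorrelated draws and is not the BOA implementation. The function name and the simulated chains are hypothetical.

```python
import numpy as np

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke-style z-score: early segment vs. late segment."""
    chain = np.asarray(chain, dtype=float)
    n = len(chain)
    a = chain[: int(first * n)]          # first 10% of the chain
    b = chain[n - int(last * n):]        # last 50% of the chain
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se

rng = np.random.default_rng(0)
stationary = rng.standard_normal(5000)
drifting = 0.1 * stationary + np.linspace(0.0, 5.0, 5000)
print(geweke_z(stationary), geweke_z(drifting))
```

A z-score far from zero, as for the drifting chain, suggests that the chain has not yet settled around a stable mean.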
In the Bayesian analysis of adaptability and stability, the Gibbs sampler was run for 250,000 iterations for each parameter of the adopted regression model, with a burn-in period of 10,000 iterations. To reduce autocorrelation in the sample, we kept one draw every five iterations ('thinning'); inference on each parameter was drawn from the resulting samples of the marginal a posteriori distributions.
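The number of draws left after burn-in and thinning follows from simple arithmetic, sketched below in Python. The text does not say whether the 250,000 iterations include the burn-in period, so both readings are shown; the function name and the flag are hypothetical.

```python
def retained_draws(total_iterations, burn_in, thin, burn_in_included=True):
    """Number of draws kept after discarding burn-in and thinning."""
    kept = total_iterations - burn_in if burn_in_included else total_iterations
    return len(range(0, kept, thin))

# Burn-in counted within the 250,000 iterations
print(retained_draws(250_000, 10_000, 5))                          # 48000
# Burn-in run in addition to the 250,000 iterations
print(retained_draws(250_000, 10_000, 5, burn_in_included=False))  # 50000
```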

Results
The analysis of variance for grain yield showed significant effects of genotype, environment, and the genotype × environment interaction (Table 2). This differential behavior indicates the need for an in-depth study of these lines across the different environments through an analysis of adaptability and stability, which makes it possible to predict the behavior of each genotype in each environment in greater detail.
Note. * and ** correspond to p < 0.01 and p < 0.05, respectively.
The adaptability and stability parameters for the Bayesian analysis with a priori information from previous studies in the literature (Bertoldo et al., 2009; Rocha et al., 2010; Oliveira et al., 2011; Barili et al., 2015) were estimated by the a posteriori mean, with 95% credibility intervals (Table 3).
Under Model 2 (M2), the genotypes BRS Esplendor, BRS Campeiro, IPR Uirapuru, CNFP15193, CNFP15194, CNFP15198, CNFP15207 and CNFP15208 were classified as having specific adaptability to unfavorable environments ($\beta_{1i} < 1$) (Table 3). Only five genotypes (CNFP10794, CNFP15171, CNFP15177, CNFP15178 and CNFP15188) were classified as having specific adaptability to favorable environments ($\beta_{1i} > 1$), and two lines (CNFP10104 and CNFP15174) were classified as having general adaptability.
The estimates of the adaptability and stability parameters ($\beta_{1i}$ and $\sigma_{di}^2$) obtained with weakly informative priors (M3) were equivalent to those found under model M2.
With respect to the Bayes Factor, which compares two models in terms of quality of fit, the values obtained for the comparisons of M1 with M3 and of M2 with M3 indicated that the entry of a priori information improves the quality of fit of the model (Table 4). Bayes Factor values ranged from 9.20 to 16.35, indicating substantial ($3 \le BF < 10$) to strong ($10 \le BF < 30$) evidence in favor of the models with a priori information, specifically M2, i.e., the model with a priori information originating from the frequentist approach.

Discussion
The significant differences between the sources of variation show that genotypes and environments differ and that genotypes respond differently to environmental changes (Table 2). These results corroborate many studies evaluating bean genotypes in different regions of Brazil (Barili et al., 2015; Torres et al., 2016; Torres Filho et al., 2017).
According to the estimations of the adaptability and stability parameters, the BRS Esplendor, BRS Campeiro and IPR Uirapuru cultivars presented general adaptability and low predictability, considering the Bayesian analysis whose a priori information was obtained from previous studies in the literature (Table 3).
Results found for cultivar BRS Esplendor agreed with those of Rocha et al. (2010) and Barili et al. (2015) for adaptability and with those reported by Oliveira et al. (2011) for stability (Table 5). BRS Campeiro showed the same adaptability classification as in Bertoldo et al. (2009) and agreed with Oliveira et al. (2011) for stability. With respect to cultivar IPR Uirapuru, only its adaptability agreed with the classification found by Barili et al. (2015), and only its stability corroborated the results found by Oliveira et al. (2011). The classifications reported by Bertoldo et al. (2009), in turn, agreed for both adaptability and stability.
Considering the results obtained using Model 2 (M2), all genotypes except BRS Esplendor showed stability ($\sigma_{di}^2$) values greater than zero, indicating low predictability within the limits of the 95% credibility interval (Table 3). Compared with the frequentist analysis, these results differed for six genotypes in adaptability and for 13 genotypes in stability.
For cultivar BRS Esplendor, the estimate of the stability parameter obtained with weakly informative priors (M3) disagreed with that found under model M2. Similar results were also found by Nascimento et al. (2011), Couto et al. (2015), and Oliveira et al. (2018).
In the comparison considering M2, the BF was lowest for genotype CNFP10794 (14.82) and highest for cultivar BRS Esplendor (16.35) (Table 4). In the comparison considering M1, however, the BF values were lower than those obtained considering M2. This finding indicates that the a priori information obtained from previous studies in the literature contributed less to the estimation process than the information from the frequentist approach. This result is corroborated by Resende et al. (2014), who stated that, depending on the a priori information entered in the model, Bayesian inference can yield equal or even inferior results when compared with those provided by the 'classical' approach.
In studies of adaptability and stability, a priori information has a great impact because the information used in the estimation process, which is given by the number of environments assessed, is limited. Additionally, because the environments in which genotypes are evaluated differ among the studies used to obtain prior information, it is extremely important to evaluate the a priori information. Thus, the Bayesian approach provides greater precision in the estimates, allowing greater security in the indication of genotypes, which can result in increased yields and reduced economic losses for producers.
Table 5. Estimates of the stability and adaptability found through the methodology by Eberhart and Russell (1966)
Note. Negative σ² values were considered to be equal to zero; dashes indicate estimates that were not found in the literature consulted.