Relationship between Event Prevalence Rate and Gini Coefficient of Predictive Model

Predictive models are currently used for early intervention to help identify patients with a high risk of adverse events. Assessing the accuracy of such models is a crucial part of the development process. To measure the predictive performance of a scoring model, quantitative indices such as the K-S statistic and the C-statistic are used. This paper examines the relationship between the Gini coefficient of a predictive model and the event prevalence rate. Its main contribution is a theoretical proof of that relationship.


Introduction
Risk prediction models are currently being used in the health care sector to help care providers identify high-risk patients in order to implement diversionary interventions (Morgan et al., 2019; Henderson et al., 2021). Prediction model evaluation is a crucial step in the development of such models and is normally focused on discrimination and calibration (Steyerberg et al., 2010). Model discrimination refers to the ability of the model to discriminate between patients with and without the event of interest. Commonly used measures for evaluating model discriminative ability for binary events are receiver operating characteristic curves, the concordance statistic (C-statistic), and precision-recall plots. Geometrically, the C-statistic is equal to the area under the receiver operating characteristic curve. The C-statistic can also be interpreted as the probability that a randomly selected patient who had the event will have a higher predicted probability of having the event than a randomly selected patient who did not have the event. In ideal discrimination, in which the predicted probabilities of the patients with the event are all higher than the predicted probabilities of the patients without the event, the C-statistic is equal to 1. Calibration, on the other hand, refers to the agreement between observed events and predictions. K-S statistics and Net Benefit can be used to evaluate model calibration but may not be appropriate in some circumstances (Morgan et al., 2019). As more and more machine learning algorithms are used in predictive models, especially in ensemble models, calibration may not be necessary for models that are only used for ranking, or when the predicted score cannot be interpreted as a probability (Morgan et al., 2019).
Concentration curves and associated Gini coefficients are widely used tools for analyzing economic inequality (see Cowell, 2011, and Jackson, 1992). Concentration curves are also appropriate for measuring predictive model performance (Morgan et al., 2021). A concentration curve displays the relationship between the cumulative true positive rate and the cumulative population proportion when patients are ranked in descending order by predicted risk score. Concentration curves provide more insight than receiver operating characteristic curves, especially in the case of low-prevalence events when interventions are prioritized to only the highest-risk individuals (Keya et al., 2020). For example, from Figure 1, one can easily see that the top 10% riskiest patients include around 50% of the patients who have the outcome. Thus, if health care professionals provide outreach to the top 10% riskiest patients, then half of all patients who will experience the event will have been contacted. The Gini coefficient is defined as two times the area between the concentration curve and the diagonal line and gives a summary measure of model discrimination. A larger Gini coefficient means better model discrimination. Like the C-statistic, the Gini coefficient is a rank-order statistic; that is, if the risk score values change while the relative ranking of individuals within the population remains unchanged, then both the C-statistic and the Gini coefficient will remain unchanged.
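The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `concentration_curve` and `gini_coefficient`, the toy data, and the arbitrary handling of tied scores are all assumptions made for the example.

```python
import numpy as np

def concentration_curve(scores, events):
    """Return (population proportion, cumulative true positive rate) points."""
    scores = np.asarray(scores, dtype=float)
    events = np.asarray(events, dtype=float)
    # Rank patients in descending order of predicted risk score.
    # Assumption: tied scores are kept in their input order.
    order = np.argsort(-scores, kind="stable")
    captured = np.cumsum(events[order])
    # Prepend the origin so the curve starts at (0, 0).
    x = np.arange(len(scores) + 1) / len(scores)
    y = np.concatenate(([0.0], captured / captured[-1]))
    return x, y

def gini_coefficient(scores, events):
    """Two times the area between the concentration curve and the diagonal."""
    x, y = concentration_curve(scores, events)
    # Trapezoidal area under the piecewise-linear curve.
    area_under_curve = np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0)
    return 2.0 * (area_under_curve - 0.5)

# Hypothetical toy data: 10 patients, 2 of whom have the event.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
events = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

x, y = concentration_curve(scores, events)
print(y[2])                                        # 0.5: the top 20% capture half of the events
print(round(gini_coefficient(scores, events), 3))  # 0.7
```

Because only the ordering of the scores enters the computation, any monotone increasing transformation of the scores yields the same curve and the same Gini coefficient.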
Although concentration curves and Gini coefficients are valuable in model discrimination evaluation, they are affected by the prevalence rate of the event of interest. As far as the authors know, this paper is the first to rigorously prove the relationship between Gini coefficients and event prevalence rates. The main contribution is a theoretical proof of this relationship, obtained by introducing a parametric equation. The resulting formula provides an upper bound on the Gini coefficient for evaluating predictive model performance.

Main Results
Assume that we have $n$ patients, and every patient is associated with a tuple $(s_i, y_i)$, where $i = 1, 2, \dots, n$, $s_i$ is the predicted risk score for patient $i$, $y_i$ is the event status for patient $i$, and $y_i \in \{0, 1\}$. We assume that $s_1 \ge s_2 \ge \dots \ge s_n$ since patients with high risk scores are of interest in most situations. To facilitate the proof, we introduce some mathematical notation. We denote $p_0$ as the event prevalence rate, $p_0 = m/n$, where $m$ is the number of patients who have the event of interest. The empirical distribution function of the scores of the event is the cumulative percentage of patients with the event and with scores at least $t$. It is denoted as

$$F_1(t) = \frac{1}{m} \sum_{i=1}^{n} y_i \, I(s_i \ge t),$$

where $I(\cdot)$ is an indicator function with the definition

$$I(s_i \ge t) = \begin{cases} 1, & s_i \ge t, \\ 0, & \text{otherwise,} \end{cases}$$

and $t$ is a parameter such that $t \in [t_1, t_2]$, $t_1 = \min(s_i,\ i = 1, 2, \dots, n)$ and $t_2 = \max(s_i,\ i = 1, 2, \dots, n)$. The empirical distribution function for the scores of all patients is denoted as

$$F(t) = \frac{1}{n} \sum_{i=1}^{n} I(s_i \ge t).$$

We first prove a mathematical formula relating the Gini coefficient and the event prevalence rate for ideal discrimination. Ideal discrimination is defined as a set of tuples $(s_i, y_i)$, $i = 1, 2, \dots, n$, such that $s_1 \ge s_2 \ge \dots \ge s_n$ and $\min(s_i \mid y_i = 1) > \max(s_i \mid y_i = 0)$.
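The two empirical distribution functions defined above, one over patients with the event and one over all patients, can be evaluated directly. The sketch below is illustrative only: the function names, the threshold argument `t`, and the five-patient data set are assumptions introduced for the example.

```python
import numpy as np

def event_score_distribution(scores, events, t):
    """Share of patients with the event whose score is at least t."""
    scores = np.asarray(scores, dtype=float)
    events = np.asarray(events, dtype=float)
    return np.sum(events * (scores >= t)) / np.sum(events)

def all_score_distribution(scores, t):
    """Share of all patients whose score is at least t."""
    scores = np.asarray(scores, dtype=float)
    return np.mean(scores >= t)

# Hypothetical data: 5 patients, 2 with the event, so the prevalence is 0.4.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
events = [1, 0, 1, 0, 0]

t = 0.7
population_share = all_score_distribution(scores, t)        # 3 of 5 patients
event_share = event_score_distribution(scores, events, t)   # 2 of 2 events
print(population_share, event_share)
```

As the threshold `t` sweeps from the largest score down to the smallest, the pairs (share of all patients, share of patients with the event) trace out the concentration curve parametrically.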
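The Gini coefficient ($G$) is defined as two times the area between the concentration curve and the diagonal line. Hence, we see that $0 \le G$. Under ideal discrimination, the concentration curve rises linearly from $(0, 0)$ to $(p_0, 1)$ and stays at 1 thereafter, so the area under it is $p_0/2 + (1 - p_0)$, while the area under the diagonal is $1/2$. By geometry, from Figure 2, we have that

$$G = 2\left[\frac{p_0}{2} + (1 - p_0) - \frac{1}{2}\right].$$

Hence, we have $G = 1 - p_0$ for $p_0 \in (0, 1)$. Thus, we have the following result.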
Theorem 1. For predicted risk scores with ideal discrimination, the relationship between the Gini coefficient and the event prevalence rate is $G = 1 - p_0$, $p_0 \in (0, 1)$, where $G$ is the Gini coefficient and $p_0$ is the event prevalence rate.

The equation in Theorem 1 holds for ideal discrimination. However, in reality, not all predicted probabilities of the patients with the event are higher than the predicted probabilities of patients without the event. Thus, we have the following corollary.
Corollary 1. For any predicted risk scores, the relationship between the Gini coefficient and the event prevalence rate is $0 \le G \le 1 - p_0$, $p_0 \in (0, 1)$. The inequality $G \ge 0$ follows from the geometric meaning of $G$. The inequality $G \le 1 - p_0$ holds because, whenever discrimination is not ideal, at least one subject without the event has a predicted score larger than that of some subject with the event, which places the concentration curve below the ideal one. Corollary 1 gives an upper bound on the Gini coefficient for any predicted risk scores.
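Theorem 1 and the upper bound in Corollary 1 can be checked numerically. The following sketch is an illustration under stated assumptions rather than part of the proof: the sample size, the prevalence of 10%, the random seed, and the way the ideal and noisy scores are simulated are all hypothetical choices.

```python
import numpy as np

def gini_coefficient(scores, events):
    """Two times the area between the concentration curve and the diagonal."""
    scores = np.asarray(scores, dtype=float)
    events = np.asarray(events, dtype=float)
    order = np.argsort(-scores, kind="stable")  # descending risk score
    n = len(scores)
    y = np.concatenate(([0.0], np.cumsum(events[order]) / events.sum()))
    # Trapezoidal rule on an evenly spaced grid of width 1 / n.
    area_under_curve = np.sum((y[1:] + y[:-1]) / 2.0) / n
    return 2.0 * (area_under_curve - 0.5)

rng = np.random.default_rng(0)
n, m = 1000, 100                 # hypothetical sample: 100 events among 1000 patients
events = np.zeros(n)
events[:m] = 1.0
p0 = m / n                       # event prevalence rate

# Ideal discrimination: every event scores in [1.0, 1.5), every non-event in [0.0, 0.5).
ideal_scores = events + rng.uniform(0.0, 0.5, size=n)
# Imperfect discrimination: the same signal buried in Gaussian noise.
noisy_scores = events + rng.normal(0.0, 1.0, size=n)

g_ideal = gini_coefficient(ideal_scores, events)
g_noisy = gini_coefficient(noisy_scores, events)

print(f"upper bound 1 - p0: {1 - p0:.3f}")
print(f"Gini, ideal scores: {g_ideal:.3f}")
print(f"Gini, noisy scores: {g_noisy:.3f}")
```

The ideal scores reproduce the bound up to floating-point error, while the noisy scores fall below it.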

Discussion
In order to determine the true accuracy of a predictive model, one must know how well a "perfect" model would behave. While it is theoretically possible for any model to achieve a C-statistic of 1, the authors have shown that the same cannot be said for a model's Gini coefficient. As shown in Theorem 1 and Corollary 1, the Gini coefficient of a predictive model with ideal discrimination increases as prevalence rates decrease. For example, if the event prevalence rate is 40%, then the maximum Gini coefficient the predicted risk score could attain is 0.6. However, if the event prevalence rate is 1%, then the maximum Gini coefficient the predicted risk score could reach is 0.99.
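The arithmetic in these examples follows directly from the bound in Corollary 1. A minimal sketch, in which the helper name `max_gini` is an assumption made for illustration:

```python
def max_gini(prevalence):
    """Largest attainable Gini coefficient, 1 - prevalence, per Corollary 1."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return 1.0 - prevalence

for p0 in (0.40, 0.10, 0.01):
    print(f"prevalence {p0:.0%}: maximum Gini {max_gini(p0):.2f}")
# prevalence 40%: maximum Gini 0.60
# prevalence 10%: maximum Gini 0.90
# prevalence 1%: maximum Gini 0.99
```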
These findings suggest two important strategies for determining and comparing the performance of predictive models. First, no model should be benchmarked against a Gini coefficient of 1 because that is only possible for events that do not occur. Second, the relative performance of models trained on events with different prevalence cannot be determined by comparing their Gini coefficients alone. For example, it may not be true that a model predicting an event with 10% prevalence with a 0.6 Gini coefficient performs better than a model predicting an event with 50% prevalence with a 0.4 Gini coefficient, since the upper limit on the Gini coefficient is different in each scenario. Both of these strategies have practical applications in applied predictive modeling and provide insights into the relationship between the Gini coefficient and event prevalence for model evaluation.
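To make the comparison in the example concrete, one possible illustration is to express each Gini coefficient as a fraction of its own upper bound. This ratio is not a method proposed in the paper; it is an assumption introduced here only to show why the raw coefficients can mislead, and the helper name `share_of_upper_bound` is hypothetical.

```python
def share_of_upper_bound(gini, prevalence):
    """Gini coefficient as a fraction of its upper bound 1 - prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return gini / (1.0 - prevalence)

# The two models from the example above.
model_a = share_of_upper_bound(0.6, 0.10)  # 0.6 out of a possible 0.9
model_b = share_of_upper_bound(0.4, 0.50)  # 0.4 out of a possible 0.5

print(f"{model_a:.3f} {model_b:.3f}")  # 0.667 0.800
```

Under this illustrative measure, the model with the lower raw Gini coefficient sits closer to its attainable maximum, which is consistent with the caution that raw Gini coefficients alone do not rank models across different prevalence rates.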