Estimating Explained Variation of a Latent Scale Dependent Variable Underlying a Binary Indicator of Event Occurrence


  •  Dinesh Sharma    
  •  Amanda Miller    
  •  Caroline Hollingsworth    

Abstract

The coefficient of determination, also known as the R^2 statistic, is widely used as a measure of the proportion of explained variation in the context of a linear regression model. In many real-life settings, interest may lie in measuring the proportion of explained variation, rho^2, of a latent scale dependent variable U which follows a multiple regression model. In practice, however, U may not be observable and is represented by its binary proxy. In such situations, logistic regression analysis is a popular choice. Many analogues of the R^2 statistic have been proposed to measure explained variation in the context of logistic regression. McFadden's R^2 measure stands out from the others because of its intuitive interpretation and its independence from the proportion of successes in the sample. It, however, severely underestimates the proportion of explained variation of the underlying linear model. In this research we present a method for estimating the explained variation of the underlying linear model using McFadden's R^2 statistic. When applied to a real-life dataset, our method estimated rho^2 of the underlying model within an acceptable margin of error.
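
The gap between McFadden's R^2 and rho^2 described above can be illustrated with a small simulation. The sketch below is not the estimation method proposed in this paper; it only generates a latent U from a one-predictor linear model, keeps its binary proxy, fits a logistic regression, and compares McFadden's R^2 with the known rho^2. The logistic error distribution, the slope of 2.0, the sample size, the seed, and the helper names (fit_logistic, log_lik, simulate) are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        grad = X.T @ (y - p)                    # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def log_lik(X, y, beta):
    """Bernoulli log-likelihood with a logit link."""
    eta = X @ beta
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def simulate(seed=0, n=5000, slope=2.0):
    """Return (population rho^2 of the latent model, McFadden's R^2)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)            # predictor with variance 1 (assumed)
    e = rng.logistic(size=n)          # logistic error, variance pi^2 / 3 (assumed)
    u = slope * x + e                 # latent scale variable U (unobserved in practice)
    y = (u > 0).astype(float)         # binary proxy of U

    # Population rho^2 of the latent linear model: explained / total variance.
    rho2 = slope**2 / (slope**2 + np.pi**2 / 3.0)

    X_full = np.column_stack([np.ones(n), x])   # intercept + predictor
    X_null = np.ones((n, 1))                    # intercept only
    ll_full = log_lik(X_full, y, fit_logistic(X_full, y))
    ll_null = log_lik(X_null, y, fit_logistic(X_null, y))

    r2_mcfadden = 1.0 - ll_full / ll_null
    return rho2, r2_mcfadden

if __name__ == "__main__":
    rho2, r2 = simulate()
    print(f"rho^2 of latent model: {rho2:.3f}")
    print(f"McFadden's R^2:        {r2:.3f}")
```

Under these assumed settings, McFadden's R^2 should come out noticeably smaller than rho^2, which is the underestimation that motivates the correction studied here.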


This work is licensed under a Creative Commons Attribution 4.0 License.