Modeling Corporate Default Rates

In this paper, I propose a model for predicting annual high yield default rates. The model builds on the earlier work of Hampden-Turner (2009). It forecasts monthly default rates using four predictors, each with various lags: Libor 3-month/10-year Slope, U.S. Lending Survey, U.S. Funding Gap, and Gross Domestic Product Quarter-over-Quarter Growth. Forecasts of future corporate default rates are useful for evaluating the attractiveness of credit market investments and for estimating value-at-risk on credit portfolios. I present results of out-of-sample predictions of annual default rates. I also address some imperfections of the Hampden-Turner formulation through more rigorous selection of variable lags and a logistic transformation of predicted default rates. I demonstrate that estimates of future default probabilities are useful for predicting changes in high yield credit spreads. Keywords: default, financial crisis


Introduction
In this paper I present a model for generating twelve consecutive monthly predictions of high yield default rates and compare out-of-sample model predictions with observed historical default rates. In addition, I demonstrate in historical testing that predicted default rates, but not trailing default rates, can predict directional moves in high yield credit spreads.
The paper begins by providing some perspective on historical default rates and presents a brief description of previous attempts to predict corporate default rates. I next describe in detail the original Hampden-Turner (HT) model. The HT model takes market prices and economic indicators as inputs and generates monthly default rate predictions for the subsequent twelve months. However, the model, in its original form, has some limitations. These include the possibility of generating negative default rates and relatively poor out-of-sample performance. I address these and other shortcomings of the HT model by developing a new model, called HT-2.0. This enhanced model includes a logistic transformation of predicted default rates to avoid negative values, imposes a penalty for large regression coefficients, and uses "pre-whitening" to improve selection of lags for the input variables. I then show the improvement in out-of-sample predictions of historical default rates using the HT-2.0 model.
To demonstrate the usefulness of models for estimating future default rates, I show how forecasts of default probabilities, but not current default rates, are useful for predicting subsequent changes in high yield credit spreads. I also use the model for my simulations of expected losses from default on credit portfolios.
Rating agencies typically calculate the current annual default rate as the percentage of high yield firms that have defaulted over the past twelve months. For example, Moody's Investors Service publishes monthly trailing high yield default rates calculated as the ratio of the number of firms rated below Baa/BBB that have defaulted during the trailing 12-month period to the total number of non-defaulted firms rated below Baa at the beginning of that period (Note 1). Since the universe of rated firms differs among the rating agencies, it is not surprising that the agencies typically report different trailing annual default rates. For example, Figure 2 displays annual default rates from Standard & Poor's (Vazza & Kraemer, 2013) and Moody's (Ou, Chu, In, & Metz, 2013) for all firms and for high yield firms only. Although default rates reported by Standard & Poor's appear to be slightly higher than those from Moody's, the two series typically rise and fall in tandem. This is illustrated graphically in Figure 3, which displays the speculative-grade annual default rates reported by S&P and Moody's. Although trailing default rates are of some interest to investors, projections of future default rates are even more relevant for the performance of corporate credit markets, particularly the high yield market. Moreover, expected default rates are of interest to lenders, risk managers, and other counterparties to credit-based transactions. The goal is to generate accurate forecasts of monthly default rates for the next twelve months and to demonstrate their usefulness for making investment decisions.
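The trailing default rate described above is a simple ratio, and a minimal Python sketch may make the definition concrete. The function name `trailing_default_rate` and all numbers are hypothetical illustrations, not agency data, and the sketch ignores refinements (such as adjustments for withdrawn ratings) that an agency may apply.

```python
def trailing_default_rate(defaults_by_month, cohort_size):
    """Trailing 12-month default rate.

    Numerator: defaults over the most recent 12 months.
    Denominator: non-defaulted high yield firms at the start of that window.
    """
    if len(defaults_by_month) < 12:
        raise ValueError("need at least 12 months of default counts")
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return sum(defaults_by_month[-12:]) / cohort_size


# Hypothetical monthly default counts and cohort size, for illustration only.
monthly_defaults = [2, 1, 0, 3, 2, 1, 4, 2, 1, 0, 2, 2]
rate = trailing_default_rate(monthly_defaults, cohort_size=1000)
print(f"trailing 12-month default rate: {rate:.1%}")
```

Because the denominator is fixed at the start of the window, two agencies with different rated universes will generally report different rates for the same period even when they count the same defaults.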
There have been several approaches to modeling future default rates. Most begin with the observation of the current 12-month trailing default rate. One approach to projecting the default rate for the next 12 months is to generate stochastic future default rates using a mean-reverting model calibrated to the properties of historical default rates. The key assumption of this type of model is that default rates follow a certain stochastic process and therefore that the time-series record of actual default rates is a sample path generated by that process. Although this approach can be useful for simulation purposes, it constrains us to use mean default rates as expected levels of future default rates. To derive a statistical model with predictive power, others have adopted an alternative approach that incorporates economic factors that lead default rates. Examples include the linear models detailed in Fons (1991), Helwege and Kleiman (1996), and Jonsson and Fridson (1996). These authors identified macroeconomic variables with explanatory power, whose effectiveness is evaluated by calculating root-mean-squared errors between predicted and observed default rates. An alternative model proposed by Keenan, Sobehart and Hamilton (1999) incorporates the effect on default rates of changes in the universe of issuers, both in terms of their credit ratings and the time since they first came to market (the "aging effect"). Their model also captures macroeconomic conditions as measured by the industrial production index and interest rate variables. Finally, Hampden-Turner (2009) has developed statistical models to predict future default rates from one to twelve months ahead using least-squares regression and vector autoregressive models.
Unfortunately, most existing models that show good performance, including the Hampden-Turner (HT) model, are validated only in-sample. Most studies do not evaluate out-of-sample forecasts over a suitably long period (e.g., one covering an entire credit cycle), so it remains unclear how well these models perform, especially in periods of high default rates. In this paper, I first describe the Hampden-Turner model, pointing out its advantages and limitations. Then, building upon the HT framework, I apply statistical approaches to address that model's limitations and report out-of-sample validation of the enhanced model. Finally, I show how an accurate model of default prediction can provide useful information regarding the attractiveness of investment in high yield corporate debt.

The Hampden-Turner Default Model
Since my model takes its starting point from the Hampden-Turner formulation, I present that model briefly in this section. The HT default model fits and predicts monthly default rates using lagged versions of the following four predictors:
Libor Slope (LIB): the yield spread between 10-year and 3-month LIBOR rates divided by the term difference between 10 years and 3 months (i.e., 10.0 - 0.25) (Note 2).
U.S. Lending Survey (LS): the U.S. Federal Reserve sends lending surveys quarterly to gather opinions from banks' senior loan officers on bank lending practices. The survey estimates the net percentage of domestic banks tightening standards for commercial and industrial loans to large and middle-market firms. Banks tighten loan standards when financial conditions are deteriorating or are expected to worsen, leading to a tougher environment for high yield credits and higher default rates.
U.S. Funding Gap (FG): FG is "the macroeconomic equivalent of final free cash flow. It is the net cash flow a company receives (or requires) after capital expenditures, dividend payments, mergers and acquisitions, and net equity issuance." FG is typically negative in a bull market, indicating that corporations need to increase financing, and positive in a bear market, as consolidation occurs and spending is reduced.
GDP Quarter-over-Quarter Growth (GDP): GDP is the market value of all officially recognized final goods and services produced within a country in a year. GDP growth is indicative of strong economic conditions, portending good corporate performance, and vice versa.
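As a small illustration of the LIB definition, the following Python sketch computes the slope from two rates. The function name `libor_slope` and the input rates are hypothetical; only the divisor (10.0 - 0.25) comes from the definition above.

```python
def libor_slope(yield_10y, yield_3m):
    """LIB: 10-year minus 3-month rate, per year of term difference."""
    term_difference = 10.0 - 0.25  # years, as in the definition of LIB
    return (yield_10y - yield_3m) / term_difference


# Hypothetical rates in percent, for illustration only.
lib = libor_slope(yield_10y=3.00, yield_3m=0.25)
print(f"LIB = {lib:.4f} percentage points per year of term")
```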
Of the four input variables, only LIB is collected monthly; the other inputs are available quarterly. The HT model uses simple linear interpolation to generate monthly values of LS, FG, and GDP between their quarterly updates, so that they can be paired with monthly values of LIB. For example, to interpolate the value of GDP for a given prediction month $t$, one month after the last GDP update at $t-1$, I use $\mathrm{GDP}_{t-1}$ and the value $\mathrm{GDP}_{t+2}$ that will be reported next. That is,
$$\mathrm{GDP}_t = \mathrm{GDP}_{t-1} + \tfrac{1}{3}\left(\mathrm{GDP}_{t+2} - \mathrm{GDP}_{t-1}\right) \qquad (2)$$
Because of the linear interpolation, calculating monthly data at time $t$ requires data at time $t+2$. As a result, the predictor values at time $t$ can only be used to predict default rates later than $t+1$ or $t+2$ unless it is a ...
Since the longest lag is for default rates 49 months prior, the first month of prediction requires 49 months of previous default rates (i.e., $t \ge 49$). Unlike the lagged linear regression, the minimum lag in the VAR formula is four months. Therefore, in order to make 12 out-of-sample predictions of the default rate $r_t$, I need predictions of LIB, LS, FG, and GDP at least for months five through twelve. HT proposes to fit a VAR model to each of these time series.
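The interpolation step can be sketched in a few lines of Python. This is a minimal illustration under the assumption that quarterly observations sit exactly three months apart; the function name `interpolate_quarterly` and the sample values are hypothetical.

```python
def interpolate_quarterly(quarterly_values):
    """Linearly interpolate quarterly observations to a monthly series.

    Each quarterly value is kept in its own month; the two months between
    consecutive updates are filled on a straight line between them.
    """
    if not quarterly_values:
        return []
    monthly = []
    for prev, nxt in zip(quarterly_values, quarterly_values[1:]):
        for k in range(3):  # k = 0 is the update month itself
            monthly.append(prev + (nxt - prev) * k / 3)
    monthly.append(quarterly_values[-1])
    return monthly


# Hypothetical quarterly GDP growth figures, for illustration only.
gdp_quarterly = [1.2, 2.1, 0.6]
gdp_monthly = interpolate_quarterly(gdp_quarterly)
print([round(v, 2) for v in gdp_monthly])
```

Note that the months after the final quarterly update cannot be filled this way, which is the look-ahead constraint described above: the interpolated value for a month needs the next quarterly release.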
For example, $\mathrm{LIB}_{t+i}$ ($1 \le i \le 4$) can be predicted by fitting the VAR in Equation 6 up to time $t$. Then, to predict $\mathrm{LIB}_{t+j}$ ($5 \le j \le 12$), HT treats $\mathrm{LIB}_{t+i}$ ($1 \le i \le 4$) as already observed data and applies the VAR formula in Equation 6 iteratively to generate estimates of LIB for months five through twelve. Otherwise, the general training procedure for the VAR model is similar to that for the OLS predictions in Figure 4: the model is trained up to the end of each year and used to predict trailing 12-month default rates for each month in the following year. The resulting estimated trailing 12-month default rates from the VAR model and actual default rates appear in Figure 6.
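The idea of applying an autoregressive formula iteratively, feeding each forecast back in as if it were observed, can be illustrated with a deliberately simplified Python sketch. It assumes a one-lag, single-series autoregression fitted by ordinary least squares; this is a stand-in for illustration and not the specification of Equation 6, whose lag structure is not reproduced here. The names `fit_ar1` and `iterate_forecast` and the sample series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + a * x[t-1] (simplified one-lag model)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series is constant; slope is not identified")
    a = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    c = mean_y - a * mean_x
    return c, a


def iterate_forecast(last_value, c, a, steps):
    """Roll the fitted equation forward, treating each forecast as observed."""
    path, value = [], last_value
    for _ in range(steps):
        value = c + a * value
        path.append(value)
    return path


# Hypothetical history generated exactly by x[t] = 0.1 + 0.5 * x[t-1].
history = [1.0]
for _ in range(10):
    history.append(0.1 + 0.5 * history[-1])

c, a = fit_ar1(history)
forecast = iterate_forecast(history[-1], c, a, steps=12)
print(f"c = {c:.3f}, a = {a:.3f}, 12-month-ahead value = {forecast[-1]:.4f}")
```

Because forecast errors compound at each step, iterated forecasts of this kind tend to drift toward the fitted long-run mean as the horizon lengthens.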

Issues with the Hampden-Turner OLS and VAR Models
There are two statistical issues with the linear and VAR regression models as implemented in the Hampden-Turner model. First, both models may give rise to negative default rate predictions. Second, although the VAR model fits default data well in sample, it exhibits poor out-of-sample prediction performance. For example, Figure 7 shows actual default rates from 1996 to 2013 and out-of-sample predictions from the VAR model (left panel) and the OLS regression (right panel). Neither the VAR nor the OLS model captures the actual annual rates well, and both models predict negative annual default rates in 2006. Also, comparison of out-of-sample performance in Figure 7 shows that the simple lagged regression performs better than the VAR model at predicting annual default rates.
Despite the advantage of predicted default rates over trailing default rates for estimating changes in high yield spreads over the next year, neither predictor is satisfactory. A more relevant question for potential investors in high yield bonds is, "How much yield will I receive for taking on a given level of default risk?" For example, even if default rates are relatively high, an investor may be well compensated by outsized yield spreads to Treasuries (e.g., think 2009). Conversely, even if spreads are tight, investors may still earn attractive returns if defaults are few. Consider the historical series of average high yield corporate bond spreads in the top panel of Figure 13. Clearly, spread levels vary widely over the cycle. However, the absolute level of spreads neither indicates whether investment in high yield is attractive nor provides reliable signals regarding the future direction of spread moves. A large determinant of those returns is the expected default rate over the investment horizon.
To determine if the ratio of high yield spreads to default rates provides useful information to investors, I first plotted ratios of average high yield spreads to predicted default rates from the HT-2.0 model since 1995. These appear in the lower panel of Figure 13 (Note 6). The assumption is that the ratio of the current high yield spread to the predicted default rate is indicative of the attractiveness of high yield returns. Consider first the left-hand panels of Figure 14. The upper panel shows one-year changes in high yield spreads as a function of ratios of the current average high yield spread to the trailing 12-month default rate. In that plot, the green circles represent changes in spreads when the spread-to-default ratios are below average, and the orange squares represent changes when the ratios are above average. Points are determined monthly, with the vertical blue line showing the average spread-to-default ratio over the period from 1995 to 2013. The scatterplot reveals that ratios using the current trailing 12-month default rate have little ability to forecast high yield spreads one year later. This is confirmed in the bar chart in the lower left-hand panel, which presents probabilities of spreads widening or tightening when the ratio of high yield spreads to trailing default rates is above or below average. That is, probabilities of spreads widening or tightening are independent of the ratio of high yield spread to current default rate (i.e., probabilities are roughly 50% for all ratios), falling at or near the dashed chance-performance line. In contrast to the results using ratios of spreads to trailing default rates, the panels on the right in Figure 14 demonstrate that ratios of spreads to predicted 12-month default rates are highly related to one-year changes in high yield spreads. That is, when ratios of high yield spreads to predicted default rates are above average, high yield spreads one year later are tighter 85% of the time. When ratios are below average, spreads are wider 65% of the time. The histogram of percentages of spreads widening or tightening as a function of the spread-to-default ratio in the lower right panel of Figure 14 confirms the above-chance performance over the entire range of ratios. The figure also reveals that the size of the ratio has little effect on directional accuracy, except if the ratio is above or below zero. In particular, notice how the directional changes in spreads reverse from widening to tightening on either side of the average spread-to-predicted-default ratio. Finally, note that when there are "errors" in the signal from the spread-to-default ratio (i.e., spreads tightening when ratios are below average and vice versa), average "losses" are smaller than average gains when "correct." For example, when the spread-to-default ratio is above average, the average spread tightening is 203bp, whereas when spreads rise, they rise only 114bp on average. In fact, given the spread-to-predicted-default ratio in December 2013 indicated by the circle in the upper right panel of Figure 14, the expected spread tightening by December 2014 is:
85% × (203bp) - 15% × (114bp) ≈ 155bp
Similarly, if the ratio of high yield spreads to predicted default is less than average, then historical analysis suggests that the average high yield spread will widen by:
65% × (247bp) ...
The results presented in Figure 13 and Figure 14 are intended to demonstrate the usefulness of modeling predicted default rates. I will continue to update my twelve-month default predictions on a monthly basis and use the model for my simulations of expected losses from default on credit portfolios.
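The expected spread change for the above-average case is a probability-weighted average of the two conditional outcomes quoted in the text (tighter 85% of the time by 203bp on average, otherwise wider by 114bp on average). The following Python sketch works through that arithmetic. It assumes the probability of widening is the complement of the 85% tightening frequency; the function name `expected_spread_change` is hypothetical.

```python
def expected_spread_change(p_tighten, avg_tighten_bp, avg_widen_bp):
    """Probability-weighted one-year spread change in basis points.

    Negative values mean spreads are expected to tighten.
    Assumes widening occurs with probability 1 - p_tighten.
    """
    if not 0.0 <= p_tighten <= 1.0:
        raise ValueError("p_tighten must be a probability")
    return -p_tighten * avg_tighten_bp + (1.0 - p_tighten) * avg_widen_bp


# Figures quoted in the text for an above-average spread-to-default ratio.
above_average = expected_spread_change(
    p_tighten=0.85, avg_tighten_bp=203.0, avg_widen_bp=114.0
)
print(f"expected change: {above_average:.0f}bp")
```

The result is an expected tightening of roughly 155bp. The below-average case would be computed the same way once the average tightening in its "error" outcomes is supplied.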

Summary
I described a model for predicting 12-month default rates and examined its performance in out-of-sample testing since 1994. The default forecasting model is called HT-2.0, as it is an extension of the model first described by Hampden-Turner (2009). HT-2.0 takes market prices and economic indicators as inputs and generates monthly default rate predictions for the next twelve months. The paper began by providing some perspective on historical default rates and included a brief discussion of previous attempts to forecast corporate default rates. I next described in detail the original Hampden-Turner model and addressed some limitations of that model. These include the possibility of negative predicted default rates and relatively poor out-of-sample performance. I overcame the shortcomings of the original HT model in HT-2.0 by adopting a logistic transformation of predicted default rates to avoid negative values, by imposing a penalty for large regression coefficients, and by using statistical "pre-whitening" to improve estimates of optimal lags for the input variables. I then showed the improvement in out-of-sample predictions of historical default rates using the HT-2.0 model.
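To illustrate why a logistic transformation rules out negative predictions and what a coefficient penalty does, here is a minimal Python sketch. It is not the HT-2.0 specification: it assumes a single predictor, a ridge (L2) penalty as one possible form of "penalty for large regression coefficients", and made-up data. The names `logit`, `inv_logit`, and `fit_ridge_1d` are hypothetical.

```python
import math


def logit(p):
    """Map a rate in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Map any real number back to a rate strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_ridge_1d(x, y, penalty):
    """Fit y = b0 + b1 * x with an L2 penalty on the slope b1 only."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / (sxx + penalty)  # penalty > 0 shrinks the slope toward zero
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical predictor values and default rates, for illustration only.
x = [-1.0, 0.0, 1.0, 2.0, 3.0]
rates = [0.010, 0.015, 0.030, 0.060, 0.110]

# Regress the logit of the default rate, then transform predictions back.
b0, b1 = fit_ridge_1d(x, [logit(r) for r in rates], penalty=1.0)
_, b1_unpenalized = fit_ridge_1d(x, [logit(r) for r in rates], penalty=0.0)

# Even far outside the sample, the back-transformed prediction stays positive.
extreme_prediction = inv_logit(b0 + b1 * (-10.0))
print(f"slope {b1:.3f} (unpenalized {b1_unpenalized:.3f}), "
      f"extreme prediction {extreme_prediction:.6f}")
```

A linear model fitted directly to the rates could extrapolate below zero for the same extreme input, whereas the back-transformed prediction cannot.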
To demonstrate the usefulness of estimating future default rates, I showed how forecasts of future default probabilities, but not current default rates, are useful for predicting subsequent changes in high yield credit spreads.

Figure 1. Annual default rates from Standard & Poor's and Moody's. Source: Standard & Poor's and Moody's Investors Service.

Figure 5. Actual default rates (dark blue) along with fitted and predicted monthly default rates via Hampden-Turner's vector autoregressive model.

Figure 13. Comparison of predicted changes in high yield spreads based on current default rates (left panels) or predicted default rates (right panels). Percentages falling in each cell and average spread changes are also shown.