Testing Equality of Nonparametric Quantile Regression Functions

This article proposes a new approach for testing the equality of nonparametric quantile regression functions based on marked empirical processes. We develop test statistics that posses better Type I and power properties in comparison to all available procedures in the literature. Simulation results also indicate that our tests have superior local power properties over existing tests. A data analysis is given which highlights the usefulness of the proposed methodology.


Introduction
Comparing groups based on independent samples has been a fundamental problem in statistics.In the context of regression analysis, this is typically done by comparing the conditional means of the groups.In parametric regression, comparison of model parameters of a predefined parametric model provides information about differences among the groups.When a parametric specification is not appropriate, one compares nonparametric mean regression functions to detect any differences between groups (Hall & Hart, 1990;King, Hart, & Wehrly, 1991;Kulasekera, 1995;Neumeyer & Dette, 2003).The above mentioned procedures rely on the assumption that the errors of the assumed model have finite variance.In practice this assumption may not hold, especially with heavy tailed error distributions.Furthermore, these procedures are highly sensitive to extreme observations which can lead to spurious results.To overcome these difficulties in mean regression procedures, Koenker and Basset (1978) in their pioneering work proposed the quantile regression framework where, one regresses the conditional quantiles of a response on covariates rather than regressing the conditional mean.In this article we develop a flexible and robust testing procedure to compare groups within the quantile regression framework.Our method uses nonparametric quantile regression functions to construct test statistics that can detect differences at targeted quantiles in two or more conditional distributions.
Suppose we observe data from k independent groups in the form (X i j , Y i j ), i = 1, . . ., k, j = 1, . . ., n i , where X is a continuous covariate supported on [0, 1] and Y is a continuous response.For τ ∈ (0, 1), let g τ (X) denote the τ th conditional quantile function of Y given X (i.e., P[Y ≤ g τ (X)|X] = τ).We model g τ (•) nonparametrically with the following nonparametric quantile regression model: Y i j = g τ,i (X i j ) + i j , i = 1, . . ., k, j = 1, . . ., n i (1) where g τ,i is the τ th conditional quantile function of Y given X for the i th group and i j is a sequence of independent random variables assumed to be identically distributed within each group.We further assume that the τ th conditional quantile of i j given X i j is zero.We are interested in testing whether the conditional quantile curves are the same across the k groups or not for a fixed τ ∈ (0, 1) .That is, testing H 0 : g τ,i , . . ., g τ,k = g τ vs H 1 : g τ,i g τ,i for some i i over the support of the covariate X.Sun (2006) and Dette, Wagner, and Volgushev (2011) have compared nonparametric quantile regression functions in the form of the above hypothesis.The former propose a test based on an orthogonal moment condition of residuals which holds under the null hypothesis and the latter uses a test derived from the L 2 distance of the estimated quantile regression curves.Both methods have non-trivial power against local alternatives that converge to the null at a rate ( √ nh 1/4 ) −1 with a scalar covariate.Here n = k i=1 n i is the total sample size and h is the bandwidth (converging to 0) that is used to estimate the nonparametric quantile regression functions assuming the null hypothesis is true.In contrast, we propose a new test procedure that can detect alternatives converging to the null at n −1/2 rate.Our method is based on a marked empirical process of residuals.We note that marked empirical processes based on residuals have been used in the context of testing the equality of mean regression functions by Delgado (1993), Kulasekera (1995), andNeumeyer andDette (2003).
The remainder of this article is organized as follows.Section 2 describes the test procedure and presents the bootstrap method that we use to determine the critical values of our test statistics.Section 3 presents the results of the simulation study that investigates the finite sample performance of our method along with a real data example.Concluding remarks are given in Section 4.

Construction of Test Statistics
For fixed τ ∈ (0, 1) consider the nonparametric quantile regression model in (1).Let η i j = I Y i j < g τ,i (X i j ) − τ and Here I(•) denotes the indicator function and g τ is a weighted average of g τ,i s defined as with λ i (t) ∈ (0, 1) and k i=1 λ i (t) = 1 (Sun, 2006).Under H 0 , we see that U i j = η i j and are independent mean zero random variables.Now consider the process for t ∈ (0, 1).Under H 0 , R i (•) is a mean zero random process.If the null hypothesis is false, R i (•) will have a non-zero mean function which can be used as a basis to detect departures from H 0 .To construct test statistics based on process (3), we need an efficient estimator of R i (t) under H 0 .Estimation of R i (t) under H 0 solely depends on estimating the quantile regression functions g τ,i (•).If H 0 is true, all g τ,i (•)'s are the same across the k groups.Therefore, an efficient nonparametric estimator of the common quantile regression function g τ (•) can be constructed by pooling the data from all groups.For independent data, several nonparametric quantile regression function estimators are available in the literature (Yu & Jones, 1998;Takeuchi, Le, Sears, & Smola, 2006;Dette & Volgushev, 2008;Bondell, Reich, & Wang, 2010).In this paper we use the local linear nonparametric quantile regression function estimator proposed by Yu and Jones (1998).Local linear estimators are popular in practice due to good finite sample performance at boundaries of the support (Fan & Gijbels, 1996).The later estimators are proposed to eliminate the 'quantile crossing' problem in estimating quantile regression functions at multiple quantiles.Since our testing problem targets a fixed τ ∈ (0, 1), this problem does not affect our method.
Let ĝτ (•) be the local linear nonparametric quantile regression function estimator of the common quantile regression function g τ (•).We define ĝτ (x) = â where â and b minimize Here ρ τ (u) = u{τ − I(u ≤ 0)} is called the "check function" and h is a smoothing parameter and K(•) is a symmetric density function on [−1, 1].For i = 1, . . ., k, define the marked empirical process where Ûij = I[Y i j ≤ ĝτ (X i j )] − τ.The process Ri (•) in ( 5) is the building block of our test statistics.Based on these processes, many test statistics can be constructed to test the equality of quantile regression curves.For example, we can define for i, j = 1, . . ., k, For a size α test, we can reject H 0 for values of the above statistics exceeding the (1 − α) quantiles of their respective null distribution.To improve finite sample performance of our test statistics we will use a bootstrap method to compute critical values for our tests.

Bandwidth Selection
Bandwidth selection is an essential part of any nonparametric smoothing procedure.In particular, any testing method that involves nonparametric estimation of functions requires careful attention on bandwidth selection.For local linear nonparametric quantile regression function estimation (the one we have used in our tests), Yu and Jones (1998) have developed a useful rule-of-thumb bandwidth by minimizing the asymptotic mean squared error (AMSE).This bandwidth, h τ , is calculated as follows: where φ(•) and Φ(•) are the probability density function and the distribution function of the standard normal distribution and h mean is the optimal bandwidth for local linear nonparametric mean regression estimation.This is a computationally easy way to calculate AMSE optimal bandwidths.For example, at τ = 0.25, h τ = h mean (1.13) which can be easily obtained using the statistical package R (R Development Core Team, 2010) by h τ = d pill(x, y)(1.13).However, we note that this bandwidth selection method is based on several key assumptions such as the conditional density function of the dependent variable can be approximated by normal densities (Yu & Jones, 1998).Bandwidth selection using other criteria such as power optimization is still an open problem in the quantile regression setting.

Computing Critical Values
Since the exact null distribution is difficult to find analytically, we used a bootstrap method to compute critical values for our test statistics.For quantile regression, the classical bootstrap method do not work since the assumption of zero conditional mean is not true in quantile regression models.To overcome this difficulty Sun (2006) proposed a bootstrap method using a two point wild bootstrap.This method is a modified version of the wild bootstrap of Härdle and Mammen (1993).Since we are comparing our test procedure with that od Sun (2006) we used the bootstrap procedure outlined in Sun (2006) in our simulation study.This method ensures that the τ th conditional quantile of the bootstrap residuals given the data is zero, while the second and the third moments are matched with that of the true residuals.
Let ˆ i j = Y i j − ĝτ (X i j ) denotes the estimated residuals.Here ĝτ (•) is the estimated common quantile regression function using (4).Let b i j denotes the bootstrap resamples of ˆ i j constructed according to the procedure in Sun (2006).Then we create the bootstrap samples {X i j , Y b i j }, where Using these bootstrap samples we then construct B bootstrap replicates of our test statistics.For example, {T Max,b } B b=1 .Then these bootstrap replicates were used to obtain (1 − α) quantiles of the null distributions of the respective test statistics.

Simulation Results
In this section we discuss the results of our simulation study.We generated data from the following heteroscedastic error model with X ∼ U[0, 1] and ∼ N(0, 1).We conducted our simulations for the case of comparing three groups (k = 3).All simulations were repeated 1000 times and we used equal sample sizes for each group.Three sample sizes n i = 25, 50, 100 were considered and five quantiles τ = .05,.25,.5, .75,.95were examined.It is easy to see that the τ th conditional quantile function for model ( 7) is of the form g τ,i (•) = g(•) + σ(•)Z τ where Z τ is the τ th conditional quantile of given X.By choosing variety of functions for g i (•) and σ(•) one can create quantile curves having various degrees of smoothness.For example, with g(x) = x and σ(x) = cos(2πx), we get the median being linear and more curvature at extreme quantiles as shown in Figure 1.We have focused mainly on the test statistic T Max (defined in section 2) in our simulation study as it is the natural choice when comparing more than two groups.However, for comparison purposes we present results for two other statistics, namely, T 1 and T 12 .The simulation study consists of three parts: 1) Investigate the Type 1 error and power properties of our test statistics in comparison to Sun (2006).
2) Investigate how our test procedure can be used as an alternative to nonparametric mean regression procedures in comparing groups, especially in the presence of extreme observations.We compared our approach at τ = 0.5(conditional median) with the nonparametric mean regression test proposed by Neumeyer and Dette (2003).
Both Type 1 error and power of the two methods were studied with 100% normal errors and with a mixture of 80% normal and 20% Cauchy errors.
3) Examine local power properties of our test statistics in comparison to Sun (2006).

Analysis of Type I Error and Power
In this section we compare the Type 1 error of our test statistics with Sun (2006).We present two cases: Case 1 represents a situation of heteroscedastic error variation where the variability increases linearly Next we examine the power of our test statistics and compare it with the test statistic proposed in Sun (2006).
To minimize the complexity of the experiment we used a constant variance function for this example.Table 2 summarizes the results.Both our test statistic and Sun's method have reasonable power even with small sample sizes as shown in Table 2.As sample size grows both methods approach 100% power in for mid quantiles.At extreme quantiles, our T Max statistics performs better compared to Sun's procedure.For example, at the 95 th percentile, the difference in power is about 14% in favor of our T Max statistics.

Comparison With Nonparametric Mean Regression
As described in section 1, the mean regression approach, although quite popular in comparing groups, will have severe drawbacks in the presence of extreme observations.Therefore, we examine how our test procedure can be used as an alternative for mean regression testing methods when the data is contaminated with extreme values.
To this end, we compare our testing procedure at τ = 0.5 (conditional median) with the non parametric mean regression test proposed by Neumeyer and Dette (2003) under two error structures.We used model ( 7) with ∼ N(0, 1) and ∼ 0.8N(0, 1) + 0.2Cauchy.The first error structure mimics the classical regression setup and the second error structure creates outliers by having 20% of the errors come from the standard Cauchy distribution.The choice of the contamination error distribution as Cauchy was motivated by its well-known heavy tail structure.Although any other heavy tail distribution could also be used, we feel that the Cauchy case will serve as a "lower bound" for the power properties of our tests.The weights of the mixture (0.8 and 0.2) was chosen to provide a reasonable balance between the level of contamination and the data.In order to be compatible with the simulation study of Neumeyer and Dette(2003) we used the same functions in their study and focused on comparing two groups (k = 2 case) with σ(t) = 0.5.Table 3 summarizes the results of the experiment with normal errors.Both quantile regression approach (T Max and S UN) and mean regression approach (ND) performed quite well under the normal error structure.The Type 1 error of ND is closer to the 0.05 nominal level than the other two procedures.With linear alternatives (3 rd and 4 th examples in Table 3), T Max performed as good as the mean regression method in large samples.The power of the quantile regression methods is slightly lower than the mean regression method when dealing with high frequency alternatives (last two examples in Table 3).This is not surprising because the mean regression approach is designed under the normal error assumption with its power properties optimized based on moment conditions of normal errors.
Next we examine the Type 1 error and power properties of these methods under a mixture error structure ( ∼ 0.8N(0, 1) + 0.2Cauchy.).We used the same examples as in Table 3.The results are summarized in Table 4.In contrast, the values in Table 4 clearly shows that the mean regression method breaks down, especially the power of detecting the difference in the two regression functions.For example, in testing g 1 (t) = sin(2πt), g 2 (t) = sin(2πt) + t, with a sample size n = 100, the power of the mean regression method drops dramatically from 100% (with normal errors) to 28% (with a mixture of normal and Cauchy errors).In comparison, the power of the proposed method T Max in particular, remains very stable (97% to 93%).

Local Power Properties
The goal of this experiment is to provide evidence that our proposed method has a faster rate of convergence than the test statistic proposed by Sun(2006).We examine the power properties of our test procedure with local alternatives converging to the null at ( √ n) −1 rate.We compare our results to that of Sun (2006) .We restricted out simulation to the k = 2 case to save computational time.We used model (7) with g 1 (t) = e t , g 2 (t) = g 1 (t) + δ/ √ n and σ(t) = 0.1 (homosedastic errors).Here δ is a constant that controls the separation of the two functions and we chose δ = 0.3 in our simulation.The sample sizes for each group were chosen large enough to make the testing problem difficult.Three sample sizes n 1 = n 2 = 500, 2500, 5000 were considered and three quantiles τ = .05,.50,.75 were examined.All simulations were repeated 1000 times and the results are given in Table 5.Sun (2006).Our tests remain well above the nominal α = 5% level whereas the power of Sun's test converge down to α.As suggested by one of the referees, in order to get a more general conclusion we repeated the experiment under another setting.We used different functions: g 1 (t) = sin(2πt), g 2 (t) = g 1 (t) + δ/ √ n and σ(t) = 0.1 and used α = 0.01.As above, all simulations were repeated 1000 times and the results are given in Table 6.As expected, when the α value is low (0.01), the power goes down in both methods, but our method exhibits higher power than the nominal α level.Since the alternative hypotheses that we used in these two experiments are converging at ( √ n) −1 rate, these results clearly demonstrate that our test statistics have a faster convergence rate than Sun (2006).

Data Analyses
We now present two applications of our test procedure to real data: 1) Analysis of maximum wind speed of tropical cyclones in the Atlantic basin.
2) Analysis of managerial expenses of Japanese chemical firms.

Example 1: Tropical Cyclones in the Atlantic Basin
As an application of our proposed methodology we consider a data analysis on Atlantic basin tropical cyclones in the HURDAT data set, which is available on the National Oceanic Atmospheric Administration's website (www.aoml.noaa.gov/hrd/hurdat).The current data contain information on storms between 1851 and 2010.A changepoint analysis of the data through 2008 conducted in Robbins, Lund, Gallagher, and Lu (2011) found changepoints in several measured variables in the early to mid 1960's.Satellites were fully implemented for cyclone surveying in the mid 1960 s, and their introduction also likely improved the accuracy of measured wind speeds.For the above reasons we consider only data from 1968-2010.A changepoint test in Robbins, Lund, Gallagher, and Lu (2011) indicated a significant change in the frequency of storms around 1995.We therefore divide storms into two groups 1968-1994 (n 1 = 258) and 1995-2010 (n 2 = 236).
There has been much interest in whether storms have become stronger in recent years.There are many measures of storm strength considered in the literature (Emanuel, 2005;Briggs, 2008).Here we simply measure storm intensity as the maximum observed wind speed of the cyclone.We investigate the behavior of the 0.9 quantile of maximum windspeed as a function of position within the hurricane season.In particular let X = (storm number in i th year)/(total number of storms in year i), e.g., the fourth storm in a year with 11 storms would have an associated X value of 4/11. Figure 2 contains a plot of the data.The overlay of the nonparametric estimates of the quantile curves indicates that the basic shape of the quantile curve is the same for the two groups, with the strongest storms tending to occur near the middle of the season and relatively fewer strong storms occurring at the beginning of the season.Applying the test statistic T Max to the data indicates a significant difference between the two curves at the 0.05 significance level (test stat = 6.813, cutoff = 6.416).Since the curve for the more recent storms (1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010) is systematically above that of the storms from 1968-1994, this provides some evidence that the .9quantile of maximum wind speed has increased in recent years.We note that the procedure of Sun (2006) did not detect a significant difference between the two curves (test stat = 0.095, cutoff = 0.166).Furthermore, the nonparametric mean regression test of Neumeyer and Dette (2003) also did not detect a difference between the two groups.Our test provides different conclusions than either of those techniques.

Example 2: Managerial Expenses of Japanese Chemical Firms
We analyzed the Japanese chemical industry data (Yafeh & Yosha, 2003) which is publicly available at the Econometric Journal website at http://www.res.org.uk.Data consists of various economic factors collected on 185 chemical firms listed in the Japanese stock market.The dependent variable is a measure of expenses on managerial private benefits.The study was focused on investigating the hypothesis that "concentrated shareholding is associated with lower expenditure on managerial private benefits".Ownership concentration of the firm is measured by the variable TOPTEN, which gives the percentage of ownership that belongs to the top 10 shareholders.Our investigation was to check whether this association is different among firms that has high/low leverage (the ratio of debt to debt plus equity).We divided our data into two groups based on the leverage measurements of the firms.
Firms with leverage values above the sample median are categorized in to 'high' and the rest are into 'low' groups.As shown in Figure 3, there is a negative association between expenditure on managerial private benefits and the ownership concentration in both groups.The nonparametric mean regression estimates (solid and dotted lines) suggest that the two groups are different.A hypothesis test (Neumeyer & Dette, 2003) revealed that the two mean regression curves are in fact significantly different at .05 level.We then used our test procedure and the one proposed by Sun (2006) at τ = 0.5 to check whether the conditional medians of the two groups are the same or not.All of our test statistics were able to capture a difference between the two conditional medians.However, the test proposed in Sun (2006) did not reject the null hypothesis of equality of conditional median curves.We used the bandwidth selection methods which we adopted in our simulations.For computing bootstrap critical values, we used B = 1000 repetitions to increase accuracy.Table 7 summarizes the results of our testing at several conditional quantiles.This analysis suggests that the differences between two conditional distributions exists at mid to upper quantiles.No differences was observed at quantiles below the median.It provides a more elaborate picture of the actual differences in the two groups compared to the nonparametric mean regression approach of Neumeyer and

Figure 3
Figure 3 depicts how managerial expenses (Y) change with ownership concentration (TOPTEN) in the two groups.
Sun (2006)epresents a situation where the variability increases non-linearly.The critical values are computed for a α-level test with α = 0.05.As shown in Table1, the Type I errors of our test statistic T Max underestimate the level of the test for small sample sizes but gets to the nominal 5% level as the sample size increases.In comparison, the test statistic proposed inSun (2006)(denoted by S UN), underestimate the level at upper quantiles even for large sample sizes.This pattern is more prominent in Case 2, where the error variance is nonlinear.

Table 2 .
Empirical Power based on 1000 simulated samples

Table 3 .
Type I error and power with 100% normal errors

Table 4 .
Type I error and power with 80% normal and 20% Cauchy errors

Table 5 .
Comparison of power with local alternatives with α = 0.05 As shown in Table 5, our test statistics have much better local power properties compared to

Table 6 .
Comparison of power with local alternatives with α = 0.01