M-Estimators in Regression Models

Regression analysis plays a vital role in many areas of science. Almost all regression analyses rely on the method of least squares to estimate the parameters of the model. However, this method rests on specific assumptions, such as normality of the error distribution. When outliers are present in the data, least squares yields parameter estimates that do not provide useful information about the majority of the data. Robust regression analysis has been developed as an improvement on least squares estimation in the presence of outliers; its main purpose is to fit a model that represents the information in the majority of the data. Many researchers have worked in this field and developed methods for these problems. The most commonly used robust estimators include Huber's M-estimator, the Hampel estimator and Tukey's bisquare estimator. In this paper, such estimators are reviewed and a simulation study of them in regression models is carried out. R code has been written for the purpose and illustrations are provided.


Introduction
The theory of robustness developed by Huber and Hampel (1960) laid the foundation for finding practical solutions to many problems where existing statistical concepts were too vague to serve the purpose. Robust regression analysis has been developed as an improvement on least squares estimation in the presence of outliers, and it provides information about what a valid observation is and whether an observation should be discarded. Its primary purpose is to fit a model that represents the information in the majority of the data. Robust regression is an important tool for analyzing data that are contaminated with outliers: it can be used both to detect outliers and to provide resistant results in their presence. Many researchers have worked in this field and have developed and described methods of robust estimation. The class of robust estimators includes M-, L- and R-estimators. The M-estimators are the most flexible ones, and they generalize straightforwardly to multiparameter problems, even though they are not automatically scale invariant and have to be supplemented for practical applications by an auxiliary estimate of scale. In this paper, an attempt has been made to study some of the M-estimators in detail. Section 2 describes the M-estimators. The redescending M-estimators are presented in Section 3. A simulation study of these estimators, with numerical illustrations obtained using R software, is presented in the last section.

M-estimator
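The class of M-estimators was introduced by P. J. Huber in 1964; subsequently, such estimators have been discussed extensively by several authors, including Andrews et al. (1972), Bunke and Bunke (1986), Hampel et al. (1986), Lecoutre and Tassi (1987), Rousseeuw and Leroy (1987), Staudte and Sheather (1990), Rieder (1994), Jureckova and Sen (1996), Antoch et al. (1998), Dodge and Jureckova (2000), Jureckova and Picek (2006) and others. The M-estimator $T_n$ is defined as a solution of the minimization problem

$$\sum_{i=1}^{n} \rho(X_i, \theta) := \min \tag{1}$$

with respect to $\theta \in \Theta$, where $\rho(\cdot, \cdot)$ is a properly chosen function. The class of M-estimators also covers the maximum likelihood estimator of the parameter $\theta$ in the parametric model $\{P_\theta, \theta \in \Theta\}$: if $f(x, \theta)$ is the density of $P_\theta$, the maximum likelihood estimator corresponds to $\rho(x, \theta) = -\log f(x, \theta)$. If $\rho$ is differentiable in $\theta$, with $\psi(x, \theta)$ proportional to $\partial \rho(x, \theta) / \partial \theta$, then $T_n$ is a root of the equation

$$\sum_{i=1}^{n} \psi(X_i, \theta) = 0. \tag{2}$$

From (1) and (2) it follows that the M-functional $T(P)$ corresponding to $T_n$ is defined as a solution of the minimization

$$\int \rho(x, T(P)) \, dP(x) := \min \tag{3}$$

or as a solution of the equation

$$\int \psi(x, T(P)) \, dP(x) = 0. \tag{4}$$

The functional $T(P)$ is Fisher consistent if the solutions of (3) and (4) are uniquely determined.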
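As a brief illustration of how the choice of $\rho$ determines the estimator, the following standard location-type special cases may help. They are offered as a sketch and are not taken from the original derivation; the Gaussian and Laplace densities are assumed with unit scale, and additive constants in $\rho$ are dropped.

```latex
\begin{align*}
\text{Gaussian: } & \rho(x,\theta) = \tfrac{1}{2}(x-\theta)^2,
  & \sum_{i=1}^{n} (X_i - T_n) &= 0
  & \Longrightarrow\quad T_n &= \bar{X}_n, \\
\text{Laplace: } & \rho(x,\theta) = |x-\theta|,
  & \sum_{i=1}^{n} \operatorname{sign}(X_i - T_n) &= 0
  & \Longrightarrow\quad T_n &= \text{a sample median}.
\end{align*}
```

In both cases $\rho(x, \theta) = -\log f(x, \theta)$ up to a constant, so the sample mean and the sample median are the maximum likelihood estimators under the respective models and are also M-estimators in the sense defined above.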

M-estimator of Location parameter
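An important special case is the model with the shift parameter $\theta$, where $X_1, X_2, \ldots, X_n$ are independent observations with the same distribution function $F(x - \theta)$, $\theta \in \mathbb{R}$; the distribution function $F$ is generally unknown. The M-estimator of the location parameter, $T_n$, is defined as a solution of the minimization

$$\sum_{i=1}^{n} \rho(X_i - \theta) := \min,$$

and if $\rho(\cdot)$ is differentiable with absolutely continuous derivative $\psi(\cdot)$, then $T_n$ solves the equation

$$\sum_{i=1}^{n} \psi(X_i - \theta) = 0.$$

The corresponding M-functional $T(F)$ is Fisher consistent provided the minimization

$$\int \rho(x - \theta) \, dF(x) := \min$$

has the unique solution $\theta = 0$, i.e., provided $\theta = 0$ is the unique solution of the equation

$$\int \psi(x - \theta) \, dF(x) = 0.$$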
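A minimal sketch of how a location M-estimate can be computed in practice is given below. It is written in Python with the standard library only (not the R code used for the paper), and it makes several assumptions that are not in the text: Huber's $\psi$ with the conventional tuning constant k = 1.345, a normalised median absolute deviation as the auxiliary scale, and iterative reweighting as the solver. The function names are hypothetical.

```python
import statistics

def huber_psi(u, k=1.345):
    """Huber's psi: identity near zero, clipped at +/-k (k is an assumed default)."""
    return max(-k, min(k, u))

def m_location(xs, k=1.345, tol=1e-10, max_iter=200):
    """Approximately solve sum(psi((x_i - t) / s)) = 0 for t by iterative reweighting."""
    t = statistics.median(xs)  # robust starting value
    # Auxiliary scale: normalised MAD, held fixed during the iteration (an assumption).
    s = statistics.median(abs(x - t) for x in xs) / 0.6745
    if s == 0:
        return t
    for _ in range(max_iter):
        weights = []
        for x in xs:
            u = (x - t) / s
            # weight w(u) = psi(u) / u, with the limit w(0) = 1
            weights.append(1.0 if u == 0 else huber_psi(u, k) / u)
        t_new = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t
```

For a sample such as `[9.8, 10.1, 10.0, 9.9, 10.2, 50.0]`, the sample mean is pulled toward the value 50.0, whereas `m_location` stays close to the bulk of the observations near 10.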

Asymptotic properties of M-estimator
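A fairly simple and straightforward theory is possible if $\psi(x, \theta)$ is monotone in $\theta$. Assume that $\psi(x, \theta)$ is measurable in $x$ and decreasing in $\theta$, from strictly positive to strictly negative values. Put

$$T_n^{*} = \sup\Big\{ t : \sum_{i=1}^{n} \psi(X_i, t) > 0 \Big\}, \qquad T_n^{**} = \inf\Big\{ t : \sum_{i=1}^{n} \psi(X_i, t) < 0 \Big\}.$$

Clearly, $-\infty < T_n^{*} \le T_n^{**} < \infty$, and any value $T_n$ satisfying $T_n^{*} \le T_n \le T_n^{**}$ can serve as our estimate. Note that

$$P\{T_n^{*} < t\} = P\Big\{ \sum_{i=1}^{n} \psi(X_i, t) \le 0 \Big\}, \qquad P\{T_n^{**} < t\} = P\Big\{ \sum_{i=1}^{n} \psi(X_i, t) < 0 \Big\}$$

at the continuity points $t$ of the left-hand side. The distribution of the customary midpoint estimate $\tfrac{1}{2}(T_n^{*} + T_n^{**})$ is somewhat difficult to work out, but the randomized estimate $T_n$, which selects one of $T_n^{*}$ or $T_n^{**}$ at random with equal probability, has an explicitly expressible distribution function

$$P\{T_n < t\} = \tfrac{1}{2} P\Big\{ \sum_{i=1}^{n} \psi(X_i, t) \le 0 \Big\} + \tfrac{1}{2} P\Big\{ \sum_{i=1}^{n} \psi(X_i, t) < 0 \Big\}.$$

It follows that the exact distributions of $T_n^{*}$, $T_n^{**}$, and $T_n$ can be calculated from the convolution powers of the distribution of $\psi(X, t)$. Let $\lambda(t) = \lambda(t, F) = E_F\, \psi(X, t)$. If $\lambda$ exists and is finite for at least one value of $t$, then it exists and is monotone for all $t$. Assume that there is a $t_0$ such that $\lambda(t) > 0$ for $t < t_0$ and $\lambda(t) < 0$ for $t > t_0$. Then both $T_n^{*}$ and $T_n^{**}$ converge in probability and almost surely to $t_0$. Consider the following condition:

(C1) $\psi(x, t)$ is measurable in $x$ and monotone decreasing in $t$.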
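The bounds $T_n^{*}$ and $T_n^{**}$ can be illustrated numerically. The sketch below assumes the usual definitions $T_n^{*} = \sup\{t : \sum_i \psi(X_i, t) > 0\}$ and $T_n^{**} = \inf\{t : \sum_i \psi(X_i, t) < 0\}$, restricts attention to the location case $\psi(x, t) = \psi(x - t)$ with $\psi(u) > 0$ for $u > 0$ and $\psi(u) < 0$ for $u < 0$, and locates both bounds by bisection. The function names and the bisection approach are illustrative choices, not part of the original text.

```python
def sign(u):
    """psi for the sample median: +1, 0 or -1."""
    return (u > 0) - (u < 0)

def bracket_m_estimate(xs, psi, tol=1e-9):
    """Return approximations of (T*, T**) for a psi that is monotone in t."""
    def g(t):
        # g is non-increasing in t because psi(x - t) is.
        return sum(psi(x - t) for x in xs)

    def boundary(stays_left):
        # g > 0 below min(xs) and g < 0 above max(xs), so the bracket is valid.
        lo, hi = min(xs) - 1.0, max(xs) + 1.0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if stays_left(g(mid)):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    t_star = boundary(lambda v: v > 0)        # sup{t : g(t) > 0}
    t_two_star = boundary(lambda v: v >= 0)   # inf{t : g(t) < 0}
    return t_star, t_two_star
```

With `psi = sign` and the sample `[1, 2, 3, 4]`, the two bounds are approximately 2 and 3, and the customary midpoint 2.5 is the usual sample median; for an odd sample size such as `[1, 2, 3]` both bounds coincide at the middle observation.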

Redescending M-estimator
Redescending M-estimators are very popular $\psi$-type M-estimators whose $\psi$ functions are non-decreasing near the origin but decrease toward 0 far from the origin. Their $\psi$ functions can be chosen to redescend smoothly to zero, so that they usually satisfy $\psi(x) = 0$ for all $x$ with $|x| > r$, where $r$ is referred to as the minimum rejection point. When choosing a redescending $\psi$ function, we must take care that it does not descend too steeply, which may have a very bad influence on the denominator in the expression for the asymptotic variance

$$V(\psi, F) = \frac{\int \psi^2(x) \, dF(x)}{\left( \int \psi'(x) \, dF(x) \right)^2},$$

where $F$ is the mixture model distribution. This effect is particularly harmful when a large negative value of $\psi'(x)$ combines with a large positive value of $\psi^2(x)$ and there is a cluster of outliers near $x$.

First we introduce Hampel's three-part M-estimator. Its $\psi$ function is odd and, for constants $0 < a \le b < r$, is defined for any $x$ by

$$\psi(x) = \begin{cases} x, & 0 \le |x| \le a, \\ a \, \operatorname{sign}(x), & a \le |x| \le b, \\ a \, \dfrac{r - |x|}{r - b} \, \operatorname{sign}(x), & b \le |x| \le r, \\ 0, & |x| \ge r. \end{cases}$$

Tukey's biweight or bisquare M-estimator has a $\psi$ function which, for any positive $k$, is defined by

$$\psi(x) = \begin{cases} x \left( 1 - (x/k)^2 \right)^2, & |x| \le k, \\ 0, & |x| > k. \end{cases}$$

The function proposed by Huber in 1964, which is monotone rather than redescending, is

$$\psi(x) = \begin{cases} x, & |x| \le k, \\ k \, \operatorname{sign}(x), & |x| > k. \end{cases}$$

For regression analysis, some of the redescending M-estimators can attain the maximum breakdown point. Moreover, some of them are the solutions of the problem of maximizing the efficiency under a bounded influence function when the regression coefficient and the scale parameter are estimated simultaneously. Hence redescending M-estimators satisfy several outlier robustness properties.
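The three $\psi$ functions discussed in this section can be written down directly. The sketch below uses Python rather than R and takes the standard textbook forms of the Huber, Hampel three-part and Tukey bisquare functions. The default tuning constants (k = 1.345, a = 2, b = 4, r = 8, k = 4.685) are commonly used values assumed here for illustration; the text does not state which constants were used.

```python
def huber_psi(x, k=1.345):
    """Huber: linear for |x| <= k, constant k*sign(x) beyond (monotone, not redescending)."""
    return max(-k, min(k, x))

def hampel_psi(x, a=2.0, b=4.0, r=8.0):
    """Hampel three-part redescending psi with 0 < a <= b < r."""
    ax = abs(x)
    sgn = 1.0 if x >= 0 else -1.0
    if ax <= a:
        return x                                  # linear part
    if ax <= b:
        return a * sgn                            # flat part
    if ax <= r:
        return a * sgn * (r - ax) / (r - b)       # linear descent to zero
    return 0.0                                    # rejected beyond r

def tukey_psi(x, k=4.685):
    """Tukey bisquare: smooth redescent, exactly zero for |x| > k."""
    if abs(x) > k:
        return 0.0
    return x * (1.0 - (x / k) ** 2) ** 2
```

Evaluating these functions on a grid shows the qualitative difference: `huber_psi` stays at k for large arguments, so gross outliers keep a bounded but non-zero influence, while `hampel_psi` and `tukey_psi` return exactly zero beyond their rejection points.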
Simulation Study
This section presents simulation results that check the performance of the Huber M-estimator against other well-known redescending M-estimators. The simulation study is carried out in three stages. The first stage is the normal situation: consider the linear regression model

$$y_i = \beta_0 + \beta_1 x_i + u_i, \qquad i = 1, \ldots, n,$$

in which $u_i \sim N(0, 1)$ and the explanatory variables are generated as $x_i \sim N(100, 2)$ using R software; the values of $y_i$ are then computed for the specified values $\beta_0 = 2$ and $\beta_1 = 2$. The values of $\beta_0$ and $\beta_1$ are then estimated by the various methods using R software. In the second stage, 10% of the $y_i$ observations are replaced by values generated from $N(10, 5)$, which are referred to as outliers in the y-direction, and $\beta_0$ and $\beta_1$ are estimated as before. In the last stage, 20% of the $x_i$ observations are replaced by values generated from $N(10, 5)$, which are referred to as outliers in the x-direction, using the same $y_i$ observations as in the second stage.
The estimated values of $\beta_0$ and $\beta_1$ in the different stages are summarized in Table 1 for n = 50. The same procedure is repeated for n = 100 and n = 500, and the results obtained using R software are presented in Tables 2 and 3. From these tables it is clear that the results of the redescending M-estimators are very similar to those of the ordinary least squares estimator in the normal situation. The redescending estimators are not affected by the outliers in either the second or the third situation, while the ordinary least squares estimator is affected in these situations.
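The first two stages of the design can be sketched as follows. This is a Python illustration using only the standard library, not the R code used to produce the tables. It assumes that the second parameter of each normal distribution is a standard deviation, that M-estimates are computed by iteratively reweighted least squares with a rescaled median absolute residual as the scale, and that the tuning constants are 1.345 (Huber) and 4.685 (Tukey). All function names are hypothetical, and the output is not expected to reproduce the numbers in the tables.

```python
import random
import statistics

def wls(xs, ys, ws):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def huber_w(u, k=1.345):
    """Huber weight psi(u) / u."""
    return 1.0 if abs(u) <= k else k / abs(u)

def tukey_w(u, k=4.685):
    """Tukey bisquare weight psi(u) / u."""
    return (1.0 - (u / k) ** 2) ** 2 if abs(u) <= k else 0.0

def m_fit(xs, ys, weight, n_iter=50):
    """M-estimate of (b0, b1) by iteratively reweighted least squares."""
    b0, b1 = wls(xs, ys, [1.0] * len(xs))  # start from ordinary least squares
    for _ in range(n_iter):
        res = [y - b0 - b1 * x for x, y in zip(xs, ys)]
        s = statistics.median(abs(r) for r in res) / 0.6745  # robust scale
        if s == 0:
            break
        b0, b1 = wls(xs, ys, [weight(r / s) for r in res])
    return b0, b1

def simulate(n=50, frac=0.10, seed=1):
    """Stage 1 (clean data) and stage 2 (y-direction outliers)."""
    rng = random.Random(seed)
    xs = [rng.gauss(100, 2) for _ in range(n)]
    ys = [2 + 2 * x + rng.gauss(0, 1) for x in xs]
    ys_out = list(ys)
    for i in rng.sample(range(n), round(frac * n)):
        ys_out[i] = rng.gauss(10, 5)  # replace by an outlying value
    fits = {}
    for stage, y in (("clean", ys), ("y-outliers", ys_out)):
        fits[stage] = {
            "ols": wls(xs, y, [1.0] * n),
            "huber": m_fit(xs, y, huber_w),
            "tukey": m_fit(xs, y, tukey_w),
        }
    return fits
```

Because the $x_i$ are concentrated around 100, the intercept is poorly determined even on clean data, so a convenient way to compare the fits from `simulate()` is through the fitted value at x = 100, whose true value is 202. The third stage could be added in the same way by replacing a fraction of the `xs` values before fitting.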

Conclusion
The performance of robust estimators has been assessed in the regression model. The estimators and results were obtained using R software. It is interesting to note that the class of M-estimators is found to yield essentially the same results as the least squares estimator in the normal situation. When outliers are present in the data, the least squares estimator does not provide useful information about the majority of the data, whereas the robust estimators still do; that is, the M-estimators are observed not to be affected by the outliers. The study thus indicates that the performance of the M-estimators is almost the same as that of least squares in the normal situation and remains stable in the presence of outliers. Hence it is concluded that robust statistical procedures can be considered as modifications of the classical procedures that do not fail when there are small deviations from the assumed conditions.

Table 1. Simulation Results of Regression with Intercept, for n=50

Table 3. Simulation Results of Regression with Intercept, for n=500