A Monte Carlo Study to Assess the Impact of Kurtosis on Statistical Power of Wald-Wolfowitz Test

This study examines the impact of kurtosis on statistical power. For this purpose, a total of twenty distributions are considered. The Wald-Wolfowitz test is used to examine the impact of kurtosis on statistical power, and the significance level α is set to 0.05. The sample sizes used are equal and small, ranging from (5, 5) to (20, 20); the means are taken as μ1 = 0, μ2 = 0.5; μ1 = 0, μ2 = 1; and μ1 = 0, μ2 = 1.5. The results show that, when the skewness coefficient is held fixed at 0 for a given sample size, the statistical power generally decreases as the kurtosis coefficient decreases. In addition, for all sample sizes, the statistical power increases as the mean of the second sample increases, provided that the mean of the first sample is fixed.


Introduction
Parametric tests are based on parametric models. Parametric tests used to test the equality of two independent population means are valid only when the assumption of normality is satisfied (Rayner & Best, 2000). When the condition of normality is not satisfied, the nonparametric equivalent of the test for the difference between two independent population means is a test for the difference between two population medians (Gibbons, 1971). However, when the assumption of normality is satisfied and both parametric and nonparametric tests are applied to the same data set, the parametric tests yield higher power and the nonparametric tests show a higher type II error rate. Thus, nonparametric tests provide significant advantages when the assumption of normality is not satisfied; however, they do not use all of the information provided by the sample, are less efficient than their parametric equivalents, and require a larger sample size to provide the same power (Harwell, 1988; Wright, 2005; Walpole, R. H. Myers, S. L. Myers, & Ye, 2007; Jiang, 2010).
Deviations from normality can be measured by the kurtosis of the relevant distribution. Kurtosis is a statistic that measures the extent to which a frequency distribution is concentrated about its mean; it is a measure of the peakedness or flatness of a distribution. Kurtosis has a negative impact on the power of a test, and very small kurtosis has an impact on type I errors (Wilcox, 1995).
When data exhibit kurtosis or skewness, researchers want to know the statistical power of the test they will be using, what sample size is needed to achieve a certain power, and what the ratio of the two distribution means should be to yield the desired power.
In this paper, Section 2 discusses the concept of statistical power. Section 3 describes the impact of kurtosis and skewness on statistical power in nonparametric tests. Section 4 presents a Monte Carlo simulation study using the Wald-Wolfowitz (WW) test when the kurtosis of the two distributions differs but the skewness is fixed. Simulation results are given in Section 5, and Section 6 gives some concluding remarks.

Statistical Power
Two types of errors are encountered in hypothesis tests, namely type I error and type II error. The power of a test is defined as 1 − β; thus, the smaller the probability of a type II error, the greater the power of the test (Boslaugh & Watters, 2008; Brink, 2010). The power of a statistical test is the probability that the test yields a statistically significant result (Cohen, 1988).
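The power function of a test is generally defined by (Geyer, 2001):

$$\pi(\theta) = P_{\theta}(\text{reject } H_0), \quad \theta \in \Theta_A \tag{1}$$

where Θ_A is the parameter space of the alternative hypothesis.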
The statistical power of a test is a function of the sample size, which is also used in establishing statistical significance, and of several other criteria (Murphy & Myors, 2004).
To determine the probability of a type II error, β, a significance level α, an effect size and a sample size are required. A power of 0.8 or greater is regarded as sufficient to detect an effect that may be present. If the power is smaller than 0.8, the sample size should be increased (Wright, 2005). As the sample size increases, the probability of a type II error decreases, and thus the power increases. As the significance level α decreases, the probability of a type II error increases, and thus the power decreases (Wilcox, 2009).
The relationships between type I and type II errors are given below:
• Type I error and type II error are related: for a fixed sample size, a decrease in the probability of one generally results in an increase in the probability of the other.
• The probability of a type I error can always be reduced by adjusting the size of the critical region, that is, the critical values.
• An increase in the sample size decreases both α and β at the same time. This means that both the type I error and the type II error decrease, and consequently the power increases.
• When the null hypothesis is false, β reaches its maximum when the true value of the parameter is close to the hypothesized value. The greater the distance between the hypothesized value and the true value, the smaller β will be (Walpole, R. H. Myers, S. L. Myers, & Ye, 2007).
As can be seen, both types of error can be reduced by increasing the sample size (Spiegel & Stephens, 1999).
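The relationships among α, β, sample size and effect size described in this section can be checked numerically. The following is a minimal sketch and is not part of the original study: it assumes a hypothetical one-sided, one-sample z-test of H0: μ = 0 with known σ, for which the power has the closed form 1 − Φ(z_{1−α} − δ√n/σ). The function name `z_test_power` and its parameters are illustrative assumptions.

```python
from statistics import NormalDist


def z_test_power(effect, n, alpha=0.05, sigma=1.0):
    """Power of a one-sided z-test of H0: mu = 0 against H1: mu = effect > 0.

    Illustrative only: the study itself uses the Wald-Wolfowitz test,
    whose power has no comparable closed form.
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)   # rejection threshold under H0
    shift = effect * n ** 0.5 / sigma            # standardized true mean
    return 1 - std_normal.cdf(critical_z - shift)  # power = 1 - beta


if __name__ == "__main__":
    # Power grows with the sample size for a fixed effect and alpha.
    for n in (5, 10, 20):
        print(n, round(z_test_power(0.5, n), 3))
```

In this simplified setting, increasing `n`, `effect` or `alpha` each raises the returned power, which mirrors the qualitative statements above.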

Normal Distribution, Skewness and Kurtosis
The normal distribution is defined by its mean μ and standard deviation σ. A special case of the normal distribution, called the "standard normal distribution", has μ = 0 and σ = 1 (J. G. Ramirez & B. S. Ramirez, 2009). The probability density function of a standard normal random variable X is given below (Handcock & Morris, 1999; Balakrishnan & Nevzorov, 2003; Eaton, 2007; Freedman, 2009):

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}, \quad -\infty < x < \infty \tag{2}$$

A distribution deviates from normality when its skewness differs from zero or its kurtosis differs from 3 (Wright, 2005). Vogt (2005) defined skewness as a measure of the degree to which a distribution of scores is asymmetric. Skewness is thus a measure of the lack of symmetry in a probability distribution. It is measured by the following equation (Balakrishnan & Nevzorov, 2003):

$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \tag{3}$$

where μ2 and μ3 are the second and third moments about the mean, respectively. γ1 (Equation 3) is zero for symmetric distributions, positive for positively skewed distributions, and negative for negatively skewed distributions (Wright, 2005; Everitt, 2006).
Kurtosis indicates the degree to which a distribution of scores is peaked (Vogt, 2005). Bai and Ng (2005) denote the kurtosis coefficient by γ2, defined as follows:

$$\gamma_2 = \frac{\mu_4}{\mu_2^{2}} \tag{4}$$

where μ4 is the fourth moment about the mean. For a normal distribution γ2 = 3, so that γ2 − 3 is zero (Bai & Ng, 2005); that is, the kurtosis coefficient measured relative to the normal distribution takes the value 0. Measured in this way, the coefficient (Equation 4) is positive for a distribution with a high level of kurtosis and negative for a distribution with less kurtosis (Everitt, 2006). Distributions with a negative kurtosis coefficient are called platykurtic, and distributions with a positive kurtosis coefficient are called leptokurtic (Balakrishnan & Nevzorov, 2003).
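As an illustration of the two coefficients, the sketch below computes sample analogues of the moment ratios μ3/μ2^(3/2) and μ4/μ2² from raw data. It is a minimal example, not the study's SAS code; the function names are assumptions, and the population (divide-by-n) form of the central moments is used.

```python
def central_moment(data, order):
    """Sample central moment of the given order (divide-by-n form)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)


def skewness(data):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Moment coefficient of kurtosis: m4 / m2^2 (equals 3 for a normal law)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5]          # symmetric and flat
    print(skewness(sample))           # zero, because the data are symmetric
    print(kurtosis(sample) - 3)       # negative, i.e. platykurtic
```

Subtracting 3 from `kurtosis` gives the excess form, whose sign distinguishes platykurtic from leptokurtic samples.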

Simulation Study
In this study, a Monte Carlo simulation is conducted using the SAS 9.00 software. Random numbers are generated from the standard normal distribution N(0,1) using the RANNOR function (Fan, Felsovalyi, Sivo, & Keenan, 2003). Fleishman's power function was employed to produce random numbers having zero mean and unit standard deviation. The NPAR1WAY procedure is used for the power simulations. The data are generated using Fleishman's power method, which is summarized by the following equation:

$$X = a + bZ + cZ^{2} + dZ^{3} \tag{5}$$

where Z has a standard normal distribution, and a, b, c and d are constants chosen in such a way that X has the desired coefficients of skewness and kurtosis. Fleishman (1978, cited in Lee, 2007) showed that a = −c and that the constants b, c and d can be determined by simultaneously solving the Fleishman equations for the specified values of skewness, γ1, and kurtosis, γ2. The equations are solved using a modified Powell hybrid algorithm with a finite-difference approximation to the Jacobian. The values of a, b, c and d are then substituted into Equation 5 to transform the standard normal variable Z into X (Stewart, 2009). As shown in Table 1, the constants a, b, c and d may be chosen such that X has a distribution with specified moments of the first four orders, i.e., the mean, variance, skewness, and kurtosis (Luo, 2011).
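The coefficient search can be sketched in Python for the zero-skewness case used in this study. This is a hedged illustration, not the study's SAS program: it assumes that γ1 = 0 implies c = 0 (and hence a = 0), that the target kurtosis is expressed in excess form (0 for the normal distribution), which is the convention of Fleishman's equations, and it substitutes a plain Newton iteration with an analytic Jacobian for the modified Powell hybrid algorithm mentioned above. All function names are hypothetical.

```python
import random


def fleishman_residuals(b, d, excess_kurtosis):
    """Fleishman's equations with c = 0: unit variance and target kurtosis."""
    f_var = b * b + 6 * b * d + 15 * d * d - 1
    f_kur = 24 * (b * d + d * d * (12 + 48 * b * d + 225 * d * d)) - excess_kurtosis
    return f_var, f_kur


def solve_fleishman_symmetric(excess_kurtosis, tol=1e-12, max_iter=100):
    """Newton iteration for (b, d), started at the normal case b = 1, d = 0."""
    b, d = 1.0, 0.0
    for _ in range(max_iter):
        f1, f2 = fleishman_residuals(b, d, excess_kurtosis)
        if abs(f1) < tol and abs(f2) < tol:
            return b, d
        # Analytic partial derivatives of the two residuals.
        j11 = 2 * b + 6 * d
        j12 = 6 * b + 30 * d
        j21 = 24 * (d + 48 * d ** 3)
        j22 = 24 * (b + 24 * d + 144 * b * d * d + 900 * d ** 3)
        det = j11 * j22 - j12 * j21
        # Solve the 2x2 linear system J * step = -f by Cramer's rule.
        b -= (f1 * j22 - f2 * j12) / det
        d -= (j11 * f2 - j21 * f1) / det
    raise RuntimeError("Newton iteration did not converge")


def fleishman_sample(n, b, d, rng):
    """Draw n values X = b*Z + d*Z^3 (a = c = 0) from standard normal Z."""
    return [b * z + d * z ** 3 for z in (rng.gauss(0.0, 1.0) for _ in range(n))]


if __name__ == "__main__":
    b, d = solve_fleishman_symmetric(3.75)
    print(b, d, fleishman_residuals(b, d, 3.75))
    print(fleishman_sample(5, b, d, random.Random(0)))
```

The iteration is only expected to succeed for kurtosis values that the power method can reach; for targets outside that range it raises an error instead of returning coefficients.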
The Wald-Wolfowitz runs test is used to examine whether two random samples come from populations having the same distribution. This test can detect differences in averages, in spread, or in any other important aspect between the two populations.
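The run-counting idea behind the test can be sketched in a few lines of Python (the study itself uses SAS; Python is used here only for illustration). The function name `count_runs` is an assumption, and ties between the two samples, which do not occur with continuous data, are broken arbitrarily by sample label.

```python
def count_runs(sample1, sample2):
    """Number of runs of sample labels in the pooled, sorted data."""
    # Tag each observation with its sample of origin, then sort by value.
    pooled = sorted([(x, 1) for x in sample1] + [(y, 2) for y in sample2])
    labels = [label for _, label in pooled]
    # A new run starts wherever two neighbouring labels differ.
    return 1 + sum(a != b for a, b in zip(labels, labels[1:]))


if __name__ == "__main__":
    # Fully separated samples give the minimum of 2 runs.
    print(count_runs([1, 2, 3], [4, 5, 6]))
    # Perfectly interleaved samples give the maximum, here 6 runs.
    print(count_runs([1.2, 3.4, 5.6], [2.3, 4.5, 6.7]))
```

Few runs indicate that the samples are separated, for example by a shift in location, while many runs indicate thorough mixing.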
In the study, three different pairs of means are used. With 20 distributions, 16 sample sizes and 3 pairs of means, a total of 20 × 16 × 3 = 960 programs were written, and the number of iterations for each program was set to 30,000.
The following steps were followed in the simulation:
• Twenty population distributions with different skewness and kurtosis values were generated using the SAS RANNOR function.
• The significance level was set at α = 0.05 for this study.
• The null and alternative hypotheses for the WW test simulations were as follows: H0: The two population distributions are similar. H1: The two population distributions are not similar.
• The form of the WW test statistic for small samples was used. Here, we define small samples as samples of sizes less than 20. First, the n1 + n2 observations are arranged in ascending order. Next, the runs are identified: the observations from the first sample are underlined and the observations from the second sample are crossed out, and the total number of runs is then determined (Kartal, 2006).
• Two independent samples were drawn at random from each of the 20 population distributions, at 16 different sample sizes from (5, 5) to (20, 20).
• The WW test statistic was calculated for the corresponding samples.
• These test statistics were compared with the table of critical values of the WW test to determine whether the null hypothesis (H0), which states that the two population distributions are similar, is rejected.
• This procedure was repeated 30,000 times for each condition, with the samples generated by the SAS RANNOR function, and the number of rejections of the null hypothesis for the WW test was recorded.
• The percentage of rejections was computed and compared with the preset significance level α.
The resulting rejection rate gives the researcher the estimated statistical power.
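The simulation steps can be condensed into the following Python sketch. It is an illustration under stated assumptions rather than the study's SAS code: critical values are derived from the exact null distribution of the number of runs with α/2 placed in each tail (instead of being read from printed tables), data are drawn as b·Z + d·Z³ with user-supplied coefficients (the defaults b = 1, d = 0 give normal data), and the number of replications defaults to far fewer than the 30,000 used in the study. All function names are hypothetical.

```python
import random
from math import comb


def count_runs(sample1, sample2):
    """Number of runs of sample labels in the pooled, sorted data."""
    pooled = sorted([(x, 1) for x in sample1] + [(y, 2) for y in sample2])
    labels = [label for _, label in pooled]
    return 1 + sum(a != b for a, b in zip(labels, labels[1:]))


def runs_pmf(n1, n2):
    """Exact null distribution of the number of runs R for sizes (n1, n2)."""
    total = comb(n1 + n2, n1)
    pmf = {}
    for r in range(2, n1 + n2 + 1):
        k = r // 2
        if r % 2 == 0:   # even number of runs, r = 2k
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:            # odd number of runs, r = 2k + 1
            ways = (comb(n1 - 1, k - 1) * comb(n2 - 1, k)
                    + comb(n1 - 1, k) * comb(n2 - 1, k - 1))
        pmf[r] = ways / total
    return pmf


def critical_values(n1, n2, alpha=0.05):
    """Largest lower and smallest upper cut-offs with tail mass <= alpha/2."""
    pmf = runs_pmf(n1, n2)
    cutoffs = []
    for order in (sorted(pmf), sorted(pmf, reverse=True)):
        cutoff, tail = None, 0.0
        for r in order:
            tail += pmf[r]
            if tail > alpha / 2:
                break
            cutoff = r
        cutoffs.append(cutoff)
    return tuple(cutoffs)  # (lower, upper); None means no cut-off exists


def estimate_power(n, shift, b=1.0, d=0.0, reps=2000, alpha=0.05, seed=0):
    """Monte Carlo rejection rate of the runs test for a location shift."""
    rng = random.Random(seed)
    lower, upper = critical_values(n, n, alpha)

    def draw():
        z = rng.gauss(0.0, 1.0)
        return b * z + d * z ** 3  # symmetric power-method transform

    rejections = 0
    for _ in range(reps):
        s1 = [draw() for _ in range(n)]
        s2 = [shift + draw() for _ in range(n)]
        r = count_runs(s1, s2)
        if (lower is not None and r <= lower) or (upper is not None and r >= upper):
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for shift in (0.5, 1.0, 1.5):
        print(shift, estimate_power(10, shift))
```

With `shift=0.0` the returned rate estimates the actual size of the test, which serves as a quick check that the critical region is not too liberal.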

Simulation Results
The simulation results show that, in all twenty distributions, an increase in the mean of the second sample has a positive impact on the statistical power, provided that the mean of the first sample is fixed.
Simulation results are given in Table 2, Table 3 and Table 4. Table 2 gives the results for μ1 = 0 and μ2 = 0.5, Table 3 for μ1 = 0 and μ2 = 1, and Table 4 for μ1 = 0 and μ2 = 1.5.
The simulation results show that, if the skewness coefficient (γ1) is held fixed, the statistical power of the WW test decreases as the kurtosis coefficient (γ2) decreases in almost all of the sample sizes. Table 2 shows that the decrease is not very apparent when μ1 = 0 and μ2 = 0.5; moreover, the statistical power remains constant for some sample sizes and decreases for others. By contrast, when μ1 = 0 and μ2 = 1 (Table 3) and when μ1 = 0 and μ2 = 1.5 (Table 4), there is an apparent decrease in the statistical power of the WW test as γ2 decreases with γ1 held fixed.
The simulation results also show that, with the mean of the first sample fixed at 0, the statistical power of the WW test increases as the mean of the second sample increases. The gain in statistical power is larger when the mean of the second sample increases from 0.5 to 1 than when it increases from 1 to 1.5.

Concluding Remarks
Considering the distributions analyzed and the means of the first and second samples, the statistical power increased in all distributions and for all sample sizes as the mean of the second sample increased, with the mean of the first sample fixed at 0. In all of the distributions, the greatest increase in power occurred when the mean of the second sample increased from 0.5 to 1, with the mean of the first sample fixed at 0.
With a few exceptions, the statistical power increased as the sample sizes increased. In all of the distributions with μ1 = 0 and μ2 = 0.5, the statistical power of the WW test decreased in passing from sample size (6, 6) to (7, 7), from (10, 10) to (11, 11), from (11, 11) to (12, 12), and from (16, 16) to (17, 17); for the other sample sizes, the statistical power of the WW test increased as the sample size increased. Similarly, the statistical power of the WW test decreased for all sample pairs, from (5, 5) to (19, 19), when the means were μ1 = 0 and μ2 = 1, and μ1 = 0 and μ2 = 1.5. Across all of the distributions, the greatest statistical power values for the WW test were observed for sample size (20, 20) and the smallest for sample size (7, 7).
The highest statistical power value observed is 0.814, for samples of size (20, 20) with μ1 = 0 and μ2 = 1.5, γ1 = 0 and γ2 = 3.75. The smallest statistical power value observed in the study is 0.008, for samples of size (7, 7) with μ1 = 0 and μ2 = 0.5, γ1 = 0 and γ2 = 1.00. Thus, it is concluded that the statistical power of the WW test decreases as the kurtosis coefficient decreases, given that the skewness coefficient is fixed at zero.
Simulation results show that, for zero skewness and a fixed sample size, the statistical power decreases as the coefficient of kurtosis (Equation 4) decreases. No special pattern is observed for fixed γ2 and increasing sample sizes. However, the statistical power is very low when the mean of the second sample increased from 0.5 to 1.

The WW test statistic, r, is equal to the total number of runs in the combined data set. At the α = 0.05 significance level, the tables of lower and upper critical values of r in the runs test, prepared for the WW test, are consulted. If r lies between the lower and upper critical values for sample sizes (n1, n2), H0 is not rejected. If r is lower than the lower critical value or higher than the upper critical value for sample sizes (n1, n2), H0 is rejected.