A Class of Alternative Estimators in Probability Proportional to Size Sampling with Replacement for Multiple Characteristics

In this paper, we develop an alternative class of estimators for the probability proportional to size (pps) with replacement sampling scheme, intended for multicharacter surveys in which the study variables are poorly correlated with the selection probabilities. This is achieved by redefining the selection probabilities. Some existing estimators are shown to be special cases of the proposed class. Numerical illustrations show that some transformations from the proposed class are more efficient than existing estimators under a super-population model.


Introduction
It is well known that large-scale sample surveys are usually concerned with estimating parameters of several population characteristics. However, only one measure of size is usually used in selecting primary sampling units in a pps scheme. It may happen that some of the study variables are poorly but positively correlated with the selection probabilities, which renders the existing estimators inadequate. Rao (1966) proposed some alternative estimators and showed them to be more efficient than the usual estimators. For the purpose of comparing his estimator with others, he assumed the correlation to be equal to zero. Bansal and Singh (1985), Amahia et al. (1989), Grewal (1999) and others have proposed estimators for characteristics that are poorly correlated with selection probabilities. Their estimators take into account the correlation coefficient between the variable of interest and the selection probabilities, even though this correlation may be very small.

For easy reference, we define these existing estimators as follows. For a sample of size n selected using a pps with replacement sampling scheme, the conventional estimator may be defined as

$$\hat{Y}_j = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{p_{ij}}, \qquad j = 1, 2, 3, 4, \tag{1.1}$$

where

$$p_{i1} = p_i, \tag{1.2}$$
$$p_{i2} = \frac{1}{N}, \tag{1.3}$$
$$p_{i3} = \left(1 + \frac{1}{N}\right)^{1-\rho}(1 + p_i)^{\rho} - 1, \tag{1.4}$$
$$p_{i4} = (1-\rho)\frac{1}{N} + \rho\, p_i \tag{1.5}$$

are respectively the selection probabilities substituted in (1.1) to give the conventional estimator and those proposed by Rao (1966), Bansal and Singh (1985) and Amahia et al. (1989), with $\rho$ the correlation coefficient between the study variable and the selection probabilities. A major feature of the Amahia et al. (1989) estimator that made it more attractive than Bansal and Singh's (1985) estimator is that, while $\sum_{i=1}^{N} p_{i3} \neq 1$, the selection probabilities defined in (1.5) sum to 1, which qualifies them as selection probabilities.
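The estimators reviewed above differ only in the probability that divides each sampled value. The following Python sketch illustrates this under stated assumptions: the four-unit population, the value `rho = 0.3`, and all function names are hypothetical, and the Bansal and Singh (1985) transformation is written in the form usually attributed to those authors.

```python
import numpy as np

def rao_probs(p):
    """Rao (1966): every unit gets probability 1/N."""
    p = np.asarray(p, dtype=float)
    return np.full_like(p, 1.0 / p.size)

def bansal_singh_probs(p, rho):
    """Bansal and Singh (1985) transformation (assumed form)."""
    p = np.asarray(p, dtype=float)
    return (1.0 + 1.0 / p.size) ** (1.0 - rho) * (1.0 + p) ** rho - 1.0

def amahia_probs(p, rho):
    """Amahia et al. (1989): convex combination of 1/N and p_i."""
    p = np.asarray(p, dtype=float)
    return (1.0 - rho) / p.size + rho * p

def pps_total_estimate(y_sample, q_sample):
    """(1/n) * sum(y_i / q_i) over the n sampled units."""
    y_sample = np.asarray(y_sample, dtype=float)
    q_sample = np.asarray(q_sample, dtype=float)
    return float(np.mean(y_sample / q_sample))

# Hypothetical population: size measure x, study variable y.
x = np.array([10.0, 20.0, 30.0, 40.0])
p = x / x.sum()                      # original selection probabilities
y = np.array([2.0, 4.0, 5.0, 9.0])
rho = 0.3                            # assumed correlation value

# Draw n = 3 units with replacement, with probabilities p.
rng = np.random.default_rng(0)
sample = rng.choice(p.size, size=3, replace=True, p=p)

candidates = {
    "conventional": p,
    "Rao": rao_probs(p),
    "Bansal-Singh": bansal_singh_probs(p, rho),
    "Amahia et al.": amahia_probs(p, rho),
}
for name, q in candidates.items():
    print(name, "sum of probabilities:", q.sum(),
          "estimate:", pps_total_estimate(y[sample], q[sample]))
```

In this toy population the Amahia et al. probabilities sum to 1 while the Bansal and Singh probabilities do not, which is the distinction drawn in the text.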
Singh, Grewal and Joarder (2004) proposed a general class of estimators for the estimation of the population total in multicharacter surveys. Their class is defined through a function $H(p_i)$ of $p_i$ that satisfies certain regularity conditions: the first and second partial derivatives of $H$ with respect to $p_i$ exist and are assumed to be known constants at $p_i = 1/N$. They showed that all the estimators defined above are special cases of their general class, even where the selection probabilities do not sum to unity. They did not, however, consider the behavior of the expected variance of this class under a super-population model.
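In this paper, we develop a class of alternative estimators of the form

$$\hat{Y}_{p^{\alpha}} = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{p_i^{\alpha}}$$

with $p_i^{\alpha}$ satisfying the following boundary conditions: (i) $\sum_{i=1}^{N} p_i^{\alpha} = 1$; (ii) $\hat{Y}_{p^{\alpha}}$ reduces to Rao's (1966) estimator $\hat{Y}_R$ for $p_i^{\alpha} = 1/N$ and to the conventional estimator $\hat{Y}_C$ for $p_i^{\alpha} = p_i$.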

The Proposed Class of Alternative Estimators
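Let $\{p_i\}_{i=1}^{N}$ be probabilities and let $p_i^{*} = f(p_i, \rho)$, taking values in $[0, 1]$, be a continuous function that satisfies the following boundary conditions:
(i) $p_i^{*} = 1/N$ for $\rho = 0$;
(ii) $p_i^{*} = p_i$ for $\rho = 1$;
(iii) $\sum_{i=1}^{N} p_i^{*} = 1$;
with a continuous one-to-one function $g : [0, 1] \to [0, 1]$ fulfilling $g(0) = 0$ and $g(1) = 1$.
Remark. If g takes the same value twice, a state of estimation is repeated unnecessarily. Hence a continuous one-to-one function $g : [0, 1] \to [0, 1]$ with $g(0) = 0$ and $g(1) = 1$ is in fact strictly increasing.
From condition (iii), the transformed probabilities for $\{p_k\}_{k=1}^{N}$ and for a perturbed set $\{p_{k,\varepsilon}\}_{k=1}^{N}$ both sum to 1 (2.2), so the difference of the two sums is 0 (2.3). Dividing (2.3) through by $\varepsilon$ and taking the limit as $\varepsilon$ tends to 0 gives a relation among the partial derivatives of $f$ (2.4). Fixing $p_i$ and varying all other $p_j$ in (2.4) shows that the derivative of $f$ with respect to $p_i$ is a constant $g(N, \rho)$ (2.5). Integrating (2.5) shows that $f$ is linear in $p_i$ (2.6), and summing (2.6) over $i$ and applying condition (iii) gives

$$p_i^{*} = \bigl(1 - g(N, \rho)\bigr)\frac{1}{N} + g(N, \rho)\, p_i. \tag{2.8}$$

Since N is the number of elementary events and does not change in a fixed system, we may take $g(N, \rho) := g(\rho)$, a constant for each fixed $\rho$ in $[0, 1]$, with $0 \le g(\rho) \le 1$. From (i), $g(0) = 0$, and from (ii), $g(1) = 1$. It follows that $p_i^{*} = (1-\rho)\frac{1}{N} + \rho\, p_i$ falls in this category and will always have $\sum_{i=1}^{N} p_i^{*} = 1$. It is therefore of interest to investigate the behavior of estimators of the form

$$\hat{Y}_{p^{\alpha}} = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{p_i^{\alpha}}, \tag{2.10}$$

where the selection probability belongs to the class

$$p_i^{\alpha} = (1-\rho^{\alpha})\frac{1}{N} + \rho^{\alpha} p_i. \tag{2.11}$$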
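A short Python sketch can make the boundary behavior of this class concrete. It assumes the transformed probability is $(1-\rho^{\alpha})/N + \rho^{\alpha} p_i$ with $\rho^{\alpha}$ read as $\rho$ raised to the power $\alpha$; the function name `class_probs` and the four-unit probability vector are hypothetical.

```python
import numpy as np

def class_probs(p, rho, alpha):
    """Transformed probabilities (1 - rho**alpha)/N + rho**alpha * p_i."""
    p = np.asarray(p, dtype=float)
    weight = rho ** alpha            # the role played by g(rho) in the text
    return (1.0 - weight) / p.size + weight * p

# Hypothetical selection probabilities for N = 4 units.
p = np.array([0.1, 0.2, 0.3, 0.4])

# The transformed probabilities sum to 1 for any rho and alpha tried here.
for rho in (0.1, 0.35, 0.9):
    for alpha in (0.0, 0.5, 1.0, 3.0):
        assert abs(class_probs(p, rho, alpha).sum() - 1.0) < 1e-12

print(class_probs(p, 0.35, 0.0))     # alpha = 0: the original p_i
print(class_probs(p, 0.35, 1.0))     # alpha = 1: (1 - rho)/N + rho * p_i
print(class_probs(p, 0.35, 50.0))    # large alpha: close to 1/N
```

Because the weight $\rho^{\alpha}$ stays in $[0, 1]$ for $0 \le \rho \le 1$ and $\alpha \ge 0$, every member of the class is a convex combination of $1/N$ and $p_i$.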

Bias and Variance of the Proposed Class of Estimators
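The bias of the proposed class of estimators in (2.10) is

$$B(\hat{Y}_{p^{\alpha}}) = \sum_{i=1}^{N}\frac{p_i y_i}{p_i^{\alpha}} - Y,$$

where $Y = \sum_{i=1}^{N} y_i$ is the population total and $p_i^{\alpha} = (1-\rho^{\alpha})\frac{1}{N} + \rho^{\alpha} p_i$, and by substitution we obtain

$$B(\hat{Y}_{p^{\alpha}}) = (1-\rho^{\alpha})\sum_{i=1}^{N}\frac{y_i\,(p_i - 1/N)}{(1-\rho^{\alpha})\frac{1}{N} + \rho^{\alpha} p_i}.$$

The bias of $\hat{Y}_{p^{\alpha}}$ reduces to 0 at $\rho = 1$, to $B(\hat{Y}_R)$ at $\rho = 0$ and to the bias of the Amahia et al. (1989) estimator at $\alpha = 1$. The variance of the proposed class of estimators is given as

$$V(\hat{Y}_{p^{\alpha}}) = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{p_i y_i^{2}}{(p_i^{\alpha})^{2}} - \left(\sum_{i=1}^{N}\frac{p_i y_i}{p_i^{\alpha}}\right)^{2}\right].$$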
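For a fully known finite population, the design bias and variance of an estimator of this type can be computed directly, because the estimator is the mean of n independent draws of $y_i / q_i$, where unit i is drawn with probability $p_i$ and $q_i$ is the transformed probability. The sketch below assumes exactly that setup; the population values and the function name `design_bias_and_variance` are hypothetical.

```python
import numpy as np

def design_bias_and_variance(y, p, q, n):
    """Exact bias and variance of (1/n) * sum(y_i / q_i) when the n draws
    are made with replacement and unit i is selected with probability p_i."""
    y, p, q = (np.asarray(v, dtype=float) for v in (y, p, q))
    z = y / q                          # value contributed by one draw
    mean_z = np.sum(p * z)             # expectation of a single draw
    bias = mean_z - y.sum()            # expectation minus population total
    variance = (np.sum(p * z ** 2) - mean_z ** 2) / n
    return float(bias), float(variance)

# Hypothetical population of N = 4 units.
p = np.array([0.1, 0.2, 0.3, 0.4])
y = np.array([2.0, 4.0, 5.0, 9.0])
n = 3

for label, weight in [("weight 1 (conventional)", 1.0),
                      ("weight 0.5", 0.5),
                      ("weight 0 (Rao)", 0.0)]:
    q = (1.0 - weight) / p.size + weight * p   # weight stands for rho**alpha
    print(label, design_bias_and_variance(y, p, q, n))
```

With weight 1 the bias vanishes, and with weight 0 it equals $N\sum p_i y_i - Y$, in line with the reductions stated in the text.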

Expected Biases and Mean Square Error of the Proposed Class of Estimators
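To select the best estimator from the proposed class, we find the expected values of the biases and MSEs of the proposed class $\hat{Y}_{p^{\alpha}}$ under the super-population model due to Cochran (1946).
The model assumes that

$$y_i = \beta p_i + e_i, \qquad i = 1, \dots, N, \tag{2.2.1}$$

where the $e_i$ are random variables satisfying

$$\varepsilon(e_i \mid p_i) = 0, \qquad \varepsilon(e_i^{2} \mid p_i) = a\,p_i^{g}, \qquad \varepsilon(e_i e_j \mid p_i, p_j) = 0 \quad (i \neq j),$$

and $\varepsilon$ denotes the expectation operator with respect to the super-population model. The parameters $\beta$, $a$, and $g$ are unknown positive constants. Under the model (2.2.1), $\varepsilon(e_i^{2} \mid p_i)$ is the residual variance of y for $p = p_i$. The expected value of the residual variance is given by

$$\frac{a}{N}\sum_{i=1}^{N} p_i^{g}$$

when the infinite super-population is simulated by the finite population of N units having the same characteristics as the super-population. This expected value of the residual variance is also known to be $\sigma_y^{2}(1-\rho^{2})$, where $\rho$ is the product moment correlation coefficient between y and p. Thus we have

$$\frac{a}{N}\sum_{i=1}^{N} p_i^{g} = \sigma_y^{2}(1-\rho^{2}), \tag{2.2.3}$$

and $\beta^{2} = \rho^{2}\sigma_y^{2}/\sigma_p^{2}$, which from (2.2.3) is equal to

$$\beta^{2} = \frac{\rho^{2}\,a\sum_{i=1}^{N} p_i^{g}}{N\sigma_p^{2}\,(1-\rho^{2})}. \tag{2.2.5}$$

The expected bias and the expected mean square error of the proposed class of estimators are then obtained by taking expectations under the model, with $\beta^{2}$ substituted from (2.2.5).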
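The super-population model can be explored numerically. The sketch below is an illustration under several assumptions that are not taken from the paper: the residuals are drawn from a normal distribution (the model only fixes their first two moments), $\sigma_p^{2}$ is computed with divisor N, $\beta^{2}$ is taken as $\rho^{2}\sigma_y^{2}/\sigma_p^{2}$, and the probability vector and function names are hypothetical. The values a = 3468.97 and 0.35186 are the ones quoted in the numerical illustration.

```python
import numpy as np

def expected_residual_variance(p, a, g):
    """(a / N) * sum(p_i ** g): residual variance averaged over the units."""
    p = np.asarray(p, dtype=float)
    return float(a * np.mean(p ** g))

def beta_squared(p, rho, a, g):
    """beta^2 = rho^2 * sigma_y^2 / sigma_p^2, with sigma_y^2 obtained from
    sigma_y^2 * (1 - rho^2) = expected residual variance."""
    p = np.asarray(p, dtype=float)
    var_p = np.var(p)                  # divisor N (assumption)
    sigma_y2 = expected_residual_variance(p, a, g) / (1.0 - rho ** 2)
    return float(rho ** 2 * sigma_y2 / var_p)

def simulate_y(p, beta, a, g, rng):
    """One realisation of y_i = beta * p_i + e_i with Var(e_i | p_i) = a * p_i**g.
    Normal residuals are an assumption made only for this sketch."""
    p = np.asarray(p, dtype=float)
    sd = np.sqrt(a * p ** g)
    return beta * p + rng.normal(0.0, sd)

# Hypothetical selection probabilities for N = 5 units.
p = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
a, rho = 3468.97, 0.35186
rng = np.random.default_rng(1)

for g in (0, 1, 2):
    b2 = beta_squared(p, rho, a, g)
    y = simulate_y(p, np.sqrt(b2), a, g, rng)
    print("g =", g, "beta^2 =", round(b2, 2), "simulated y =", np.round(y, 1))
```

For g = 0 the residual variance is the same for every unit, while for g = 1 and g = 2 it grows with $p_i$, which is why the three cases are treated separately in the numerical illustration.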

Numerical Illustration
In this section we consider the efficiency of the proposed class of estimators for the population described below, which we denote Population 1.
The population consists of two sets of 30 randomly generated numbers. The first set (designated y) was generated using a table of random numbers, after which the second set (designated x) was generated independently of the first. The correlation coefficient between y and x, $\rho$, was computed to be 0.3519. The data for Population 1 are in Table 1.
Table 2 shows the expected variances (as functions of $a$ and $\beta^{2}$) for the proposed class of estimators, together with the estimates of the expected variances and biases, for different $\alpha$ values and $\rho$ = 0.1, 0.2, 0.3, 0.35186, 0.5 when g = 0, while Table 3 gives the same information for $\rho$ = 0.1, 0.2, 0.3, 0.5, 0.707311 when g = 1 under the super-population model. The value of $a$ used in obtaining the estimate of the expected variance was the variance estimate of y obtained from a systematic sample of size 10 (a = 3468.97), while $\beta^{2}$ was calculated using (2.2.5).
Table 4 provides the expected variances (as functions of $a$ and $\beta^{2}$) and their estimates for the proposed class of estimators for $\rho$ = 0.1, 0.2, 0.3, 0.35186, 0.5 and g = 2 under the model. Figure 1 shows the scatter plots of the estimates of the expected variances against different $\alpha$ values for $\rho$ = 0.1, 0.2, 0.3, 0.35186, 0.5 and g = 0 under model (2.2.1), as given in Table 2. Figure 2 shows the plots of the estimates of the expected variances against $\alpha$ values for $\rho$ = 0.1, 0.2, 0.3, 0.5, 0.707311 and g = 1, as given in Table 3, while Figure 3 shows the plots of the expected variance estimates for $\rho$ = 0.1, 0.2, 0.3, 0.35186, 0.5 when g = 2, as given in Table 4.

Discussion of Results
The analysis in Table 2 for g = 0 shows that the optimal estimate of the expected variance for the developed class of estimators $\hat{Y}_{p^{\alpha}}$ is 287049.3 and occurs at the value of $\alpha$ for which $\rho^{\alpha}$ equals the population correlation coefficient, 0.35186. (These are respectively $\alpha$ = 0.4536, 0.6490, 0.8676, 1, 1.5069 for $\rho$ = 0.1, 0.2, 0.3, 0.35186, 0.5.) The choice $\alpha = 1$, which coincides with the Amahia et al. (1989) estimator, gives the best estimator if and only if the assumed $\rho$ equals the population correlation coefficient.
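The $\alpha$ values quoted for g = 0 can be checked with a few lines of Python. The sketch assumes that the optimum is the $\alpha$ at which $\rho$ raised to the power $\alpha$ equals 0.35186, the correlation computed for Population 1; the function name `alpha_for_target` is hypothetical.

```python
import math

def alpha_for_target(rho, target):
    """Solve rho ** alpha = target for alpha, with 0 < rho < 1 and 0 < target <= 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must lie in (0, 1]")
    return math.log(target) / math.log(rho)

POPULATION_RHO = 0.35186   # correlation reported for Population 1

for rho in (0.1, 0.2, 0.3, 0.35186, 0.5):
    alpha = alpha_for_target(rho, POPULATION_RHO)
    print(f"rho = {rho}: alpha = {alpha:.4f}")
```

Under this reading the sketch reproduces the quoted values 0.4536, 0.6490, 0.8676, 1 and 1.5069 to four decimal places.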
The limit of the estimate of the expected variance of our proposed estimator $\hat{Y}_{p^{\alpha}}$ as $\rho^{\alpha} \to 0$ is 300509, which is the variance of Rao's (1966) estimator $\hat{Y}_R$.
It can also be seen that for g = 0, $\hat{Y}_R$ is more efficient than $\hat{Y}_C$ irrespective of the value of $\rho$. This result agrees with that of Rao (1966).
When g = 1 (see Table 3), for the assumed estimates of the correlation coefficient, the optimal expected variance of 9575.86 is obtained at $\alpha$ = 0.1526, 0.2183, 0.2919, 0.3364, 0.5069 respectively for $\rho$ = 0.1, 0.2, 0.3, 0.5, 0.707311, where $\rho^{\alpha} = 0.707311$ (that is, $\rho^{\alpha} = 2\rho$). At $\alpha = 0$, the estimate of the expected variance of $\hat{Y}_{p^{\alpha}}$ coincides with that of the conventional estimator $\hat{Y}_C$, which is 10060.04. The limit of the estimate of the expected variance of $\hat{Y}_{p^{\alpha}}$ as $\rho^{\alpha} \to 0$ gives the estimated expected variance of Rao's estimator $\hat{Y}_R$, which in this case is the maximum value, 11227. Figure 2 shows the plots of the estimate of the expected variance of $\hat{Y}_{p^{\alpha}}$ against different $\alpha$ values for $\rho$ = 0.1, 0.2, 0.3, 0.5, 0.707311. In all cases the graph starts at 10060.04 on the y-axis, decreases to 9575.86 and then rises again before converging to 11227.03; the point of convergence, however, varies from one $\rho$ to another. Any $\alpha$ with $1 > \rho^{\alpha} > \rho$ will always produce a transformation of $\hat{Y}_{p^{\alpha}}$ that is better than both $\hat{Y}_C$ and $\hat{Y}_R$, while $\hat{Y}_C$ is more efficient than $\hat{Y}_R$ irrespective of the value of $\rho$.
Table 4 gives the result of the analysis for the population under consideration for g = 2. Here $\alpha = 0$, which coincides with $\hat{Y}_C$, gives the optimal estimate of the expected variance irrespective of the value of $\rho$. Also, as $\rho^{\alpha} \to 0$, the expected variance of $\hat{Y}_{p^{\alpha}}$ converges to that of $\hat{Y}_R$ irrespective of $\rho$. In the plots of the estimate of the expected variance shown in Figure 3, the graph starts from 346.41 at $\alpha = 0$ and increases at a steady rate up to a point before converging to 470.75.

Conclusions
From the above findings, we conclude that for our proposed class of estimators, of which the Amahia et al. (1989) and Grewal et al. (1999) estimators are specific transformations, no single transformation produces the optimal results for all $\rho$ and g. The optimality of a particular estimator depends on the values of $\rho$ and g. For the pps with replacement sampling scheme, the optimal transformation for g = 0 always occurs where $\rho^{\alpha}$ equals the population correlation coefficient, the optimum for g = 1 always occurs where $\rho^{\alpha}$ equals twice that coefficient, and the conventional estimator is the best for g = 2.

Figure 1. Expected variance of the estimator $\hat{Y}_{p^{\alpha}}$ for different values of $\alpha$ and $\rho$ at g = 0

Table 2. Expected variances, estimates of expected variance and bias of $\hat{Y}_{p^{\alpha}}$ for different values of $\alpha$ and $\rho$ (g = 0)

Table 3. Expected variances, estimates of expected variance and bias of $\hat{Y}_{p^{\alpha}}$ for different values of $\alpha$ and $\rho$ (g = 1)
