Decomposition of Parsimonious Independence Model Using Pearson, Kendall and Spearman's Correlations for Two-Way Contingency Tables

Kiyotaka Iki¹, Shun Sato² & Sadao Tomizawa²
¹ Faculty of Economics, Nihon University, Japan
² Department of Information Sciences, Faculty of Science and Technology, Tokyo University of Science, Japan
Correspondence: Kiyotaka Iki, Faculty of Economics, Nihon University, Chiyoda-ku, Tokyo, Japan.

When the scores $\{u_i\}$ and $\{v_j\}$ are equal-interval scores (or the integer scores $\{u_i = i\}$ and $\{v_j = j\}$), the LL association model is identical to the uniform association model (see Goodman, 1979, and Agresti, 1984, p. 78). The odds ratio for rows $i$ and $j$ ($> i$) and columns $s$ and $t$ ($> s$) is denoted by $\theta_{(i<j;s<t)}$; thus,

$$\theta_{(i<j;s<t)} = \frac{p_{is}\,p_{jt}}{p_{it}\,p_{js}}.$$
Using the log odds ratio, the LL association model can be expressed as

$$\log \theta_{(i<j;s<t)} = (u_j - u_i)(v_t - v_s)\log\theta \quad (i < j;\ s < t).$$

A special case of the LL association model obtained by letting $\theta = 1$ is the I model.
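When the I model holds, the Pearson correlation coefficient $\rho$ for $U$ and $V$ (denoted by $\rho(U, V)$) is equal to zero; however, the converse does not hold. We are interested in what structure between $X$ and $Y$ is necessary for obtaining independence, in addition to the correlation being equal to zero. Instead of the structure that $\rho(U, V)$ is equal to zero, we are also interested in the structure that Kendall's tau-b measure (Kendall, 1945) or Spearman's $\rho_s$ measure (Stuart, 1963) is equal to zero.

Let $P_C$ and $P_D$ denote the probability of concordance for a randomly selected pair of observations and the probability of discordance for the pair, respectively, i.e.,

$$P_C = 2\sum_{i<j}\sum_{s<t} p_{is}\,p_{jt}, \qquad P_D = 2\sum_{i<j}\sum_{s<t} p_{it}\,p_{js};$$

see Kendall and Gibbons (1990, p. 6). Kendall's $\tau_b$ is defined by

$$\tau_b = \frac{P_C - P_D}{\sqrt{\left(1 - \sum_{i=1}^{r} p_{i\cdot}^2\right)\left(1 - \sum_{j=1}^{c} p_{\cdot j}^2\right)}},$$

where $p_{i\cdot} = \sum_{t=1}^{c} p_{it}$ and $p_{\cdot j} = \sum_{s=1}^{r} p_{sj}$. Let

$$r_i^X = \sum_{k=1}^{i-1} p_{k\cdot} + \frac{p_{i\cdot}}{2}, \qquad r_j^Y = \sum_{k=1}^{j-1} p_{\cdot k} + \frac{p_{\cdot j}}{2},$$

where $\{r_i^X\}$ and $\{r_j^Y\}$ are the marginal ridits; see Bross (1958) and Fleiss et al. (2003, pp. 198-205). Let $h_1(i) = r_i^X$ ($i = 1, \ldots, r$) and $h_2(j) = r_j^Y$ ($j = 1, \ldots, c$). Define the variables $Z_1$ and $Z_2$ by $Z_1 = h_1(X)$ and $Z_2 = h_2(Y)$. Spearman's $\rho_s$ is the correlation coefficient of $Z_1$ and $Z_2$, defined by

$$\rho_s = \frac{\mathrm{Cov}(Z_1, Z_2)}{\sqrt{\mathrm{Var}(Z_1)\,\mathrm{Var}(Z_2)}} = \frac{\sum_{i=1}^{r}\sum_{j=1}^{c} p_{ij}\,(r_i^X - 0.5)(r_j^Y - 0.5)}{\sqrt{\sum_{i=1}^{r} p_{i\cdot}\,(r_i^X - 0.5)^2 \,\sum_{j=1}^{c} p_{\cdot j}\,(r_j^Y - 0.5)^2}}.$$

Note that $E(Z_1) = E(Z_2) = 0.5$, although the proof is omitted.

Tomizawa et al. (2008) showed the following theorems:

Theorem 1 The I model holds if and only if the Pearson correlation coefficient $\rho(U, V) = 0$ and the LL association model holds.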
Theorem 2 The I model holds if and only if Kendall's $\tau_b = 0$ and the LL association model holds.
Theorem 3 The I model holds if and only if Spearman's $\rho_s = 0$ and the LL association model holds.
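As a minimal sketch of the three measures appearing in Theorems 1 to 3, the following Python code computes $\rho(U, V)$, $\tau_b$ and $\rho_s$ from a table of cell probabilities. The function names `ridits` and `correlations` are illustrative and do not come from the paper; integer scores are assumed when none are supplied, and $\tau_b$ is computed in its usual form $(P_C - P_D)$ divided by $\sqrt{(1 - \sum_i p_{i\cdot}^2)(1 - \sum_j p_{\cdot j}^2)}$.

```python
import numpy as np

def ridits(marginal):
    """Marginal ridits: probability below a category plus half of its own."""
    marginal = np.asarray(marginal, dtype=float)
    return np.cumsum(marginal) - marginal / 2.0

def correlations(p, u=None, v=None):
    """Return (Pearson rho, Kendall tau_b, Spearman rho_s) for cell probabilities p.

    Illustrative helper (not from the paper); integer scores are assumed
    when u or v is not given.
    """
    p = np.asarray(p, dtype=float)
    r, c = p.shape
    u = np.arange(1, r + 1, dtype=float) if u is None else np.asarray(u, dtype=float)
    v = np.arange(1, c + 1, dtype=float) if v is None else np.asarray(v, dtype=float)
    row, col = p.sum(axis=1), p.sum(axis=0)

    def corr(a, b):
        # Correlation of a(X) and b(Y) under the joint distribution p.
        da, db = a - a @ row, b - b @ col
        return (da @ p @ db) / np.sqrt((da ** 2 @ row) * (db ** 2 @ col))

    # Concordance pairs cell (i, s) with cells below and to the right;
    # discordance pairs it with cells below and to the left.
    pc = pd = 0.0
    for i in range(r):
        for s in range(c):
            pc += p[i, s] * p[i + 1:, s + 1:].sum()
            pd += p[i, s] * p[i + 1:, :s].sum()
    pc, pd = 2.0 * pc, 2.0 * pd
    tau_b = (pc - pd) / np.sqrt((1.0 - (row ** 2).sum()) * (1.0 - (col ** 2).sum()))

    return corr(u, v), tau_b, corr(ridits(row), ridits(col))

# Under independence all three measures vanish (up to rounding error).
p_independent = np.outer([0.2, 0.3, 0.5], [0.1, 0.4, 0.5])
print(correlations(p_independent))
```

The converse direction is the point of the theorems: a zero value of any one measure implies independence only when the LL association model also holds.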
These theorems show that the structure of the LL association model is necessary for obtaining independence, in addition to the correlations being equal to zero.

Tomizawa (1992) considered the parsimonious Linear-by-Linear association (PLL) model, defined by

$$p_{ij} = \mu\,\alpha^{u_i}\beta^{v_j}\theta^{u_i v_j} \quad (i = 1, \ldots, r;\ j = 1, \ldots, c).$$

Let $\omega_{ij}^X$ denote the local odds of classification in column $j + 1$ instead of $j$ for a fixed row $i$, i.e., $\omega_{ij}^X = p_{i,j+1}/p_{ij}$ ($i = 1, \ldots, r$; $j = 1, \ldots, c - 1$), and let $\omega_{ij}^Y$ denote the local odds of classification in row $i + 1$ instead of $i$ for a fixed column $j$, i.e., $\omega_{ij}^Y = p_{i+1,j}/p_{ij}$ ($i = 1, \ldots, r - 1$; $j = 1, \ldots, c$). Then, using the log odds ratio and the log local odds, the PLL model can be expressed as

$$\log\theta_{(i<j;s<t)} = (u_j - u_i)(v_t - v_s)\log\theta, \qquad \log\omega_{ij}^X = (v_{j+1} - v_j)\,\xi_i^X, \qquad \log\omega_{ij}^Y = (u_{i+1} - u_i)\,\xi_j^Y,$$

where $\xi_i^X$ and $\xi_j^Y$ are unspecified. Namely, this model places restrictions on the local row odds and local column odds, in addition to the structure of the LL association model. We are interested in proposing the parsimonious independence model and considering decompositions of the proposed model using the PLL model and correlations.
In this paper, we (i) define the parsimonious independence model, (ii) show that the parsimonious independence model holds if and only if the PLL model holds and any one of $\rho(U, V)$, $\tau_b$ and $\rho_s$ equals zero, and (iii) show that the goodness-of-fit test statistic for the parsimonious independence model is asymptotically equivalent to the sum of the test statistics for the decomposed models. Examples are given.
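The restrictions that the PLL model places on the local odds can be checked numerically. The sketch below assumes the PLL model has the multiplicative form $p_{ij} \propto \alpha^{u_i}\beta^{v_j}\theta^{u_i v_j}$, which is an assumption made here for illustration; the helper names `pll_table` and `local_odds` are hypothetical. The parameter values are the estimates reported for Example 1, used only as convenient inputs.

```python
import numpy as np

def pll_table(alpha, beta, theta, u, v):
    """Cell probabilities under the ASSUMED PLL form
    p_ij proportional to alpha**u_i * beta**v_j * theta**(u_i * v_j)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    kernel = np.outer(alpha ** u, beta ** v) * theta ** np.outer(u, v)
    return kernel / kernel.sum()  # normalise so the cells sum to one

def local_odds(p):
    """Local odds along columns (omega_x) and along rows (omega_y)."""
    omega_x = p[:, 1:] / p[:, :-1]   # p_{i,j+1} / p_{ij}
    omega_y = p[1:, :] / p[:-1, :]   # p_{i+1,j} / p_{ij}
    return omega_x, omega_y

u = np.array([1.0, 2.0, 3.0, 4.0])   # integer row scores
v = np.array([1.0, 2.0, 3.0])        # integer column scores
p = pll_table(0.83, 0.32, 1.16, u, v)
omega_x, omega_y = local_odds(p)

# With unit-spaced scores, omega_x is constant within each row
# (beta * theta**u_i) and omega_y is constant within each column
# (alpha * theta**v_j).
print(np.round(omega_x, 4))
print(np.round(omega_y, 4))
```

Setting `theta=1.0` in `pll_table` yields a table that factorises into the product of its margins, which is the independence special case discussed in the text.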

Decompositions of the Model
We define the parsimonious independence (PI) model by

$$p_{ij} = \mu\,\alpha^{u_i}\beta^{v_j} \quad (i = 1, \ldots, r;\ j = 1, \ldots, c).$$

The PI model is a special case of the PLL model obtained by letting $\theta = 1$. This model states that the row and column variables are independent and that the local row odds and local column odds are restricted, namely,

$$\log\omega_{ij}^X = (v_{j+1} - v_j)\log\beta, \qquad \log\omega_{ij}^Y = (u_{i+1} - u_i)\log\alpha.$$

Tomizawa et al. (2008) showed the following lemma:

From Lemma 1, we obtain the following theorem:

Theorem 4 The PI model holds if and only if $\rho(U, V) = 0$ and the PLL model holds.
Proof. Under the PLL model, equation (1) is expressed as

Thus, $\rho(U, V) = 0$ holds if and only if $\theta = 1$ (i.e., the PI model holds). The proof is completed.

Tomizawa et al. (2008) gave the following lemma:

From Lemma 2, we obtain the following theorem:

Theorem 5 The PI model holds if and only if $\tau_b = 0$ and the PLL model holds.

Proof. Under the PLL model, equation (2) is expressed as

Thus, $\tau_b = 0$ holds if and only if $\theta = 1$ (i.e., the PI model holds). The proof is completed.

Tahata et al. (2008) gave the following lemma:

From Lemma 3, we obtain the following theorem:

Theorem 6 The PI model holds if and only if $\rho_s = 0$ and the PLL model holds.

Proof. Under the PLL model, equation (3) is expressed as

Thus, $\rho_s = 0$ holds if and only if $\theta = 1$ (i.e., the PI model holds). The proof is completed.
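For the Pearson case (Theorem 4), the step from a zero correlation to $\theta = 1$ can be sketched as follows. This is an outline built on the standard covariance identity for two-way tables, and it may differ in form from equation (1) of the cited lemma. It assumes that all cell probabilities are positive and that the scores within $\{u_i\}$ and within $\{v_j\}$ are distinct.

```latex
% Covariance written through pairs of rows (i < j) and columns (s < t)
\begin{align*}
\mathrm{Cov}(U, V)
  &= \sum_{i<j}\sum_{s<t} (u_j - u_i)(v_t - v_s)\,(p_{is}p_{jt} - p_{it}p_{js}) \\
  &= \sum_{i<j}\sum_{s<t} (u_j - u_i)(v_t - v_s)\,
     \bigl(\theta_{(i<j;s<t)} - 1\bigr)\,p_{it}p_{js}.
\end{align*}
% Under the LL structure (hence under the PLL model), with
% d = (u_j - u_i)(v_t - v_s):
\begin{equation*}
\theta_{(i<j;s<t)} = \theta^{d}
\quad\Longrightarrow\quad
\mathrm{Cov}(U, V) = \sum_{i<j}\sum_{s<t} d\,\bigl(\theta^{d} - 1\bigr)\,p_{it}p_{js}.
\end{equation*}
% For d \neq 0, the factor d(\theta^{d} - 1) is positive when \theta > 1,
% negative when \theta < 1, and zero when \theta = 1, so every term has the
% same sign and the sum vanishes if and only if \theta = 1.
```

Since $\rho(U, V)$ is the covariance divided by a positive constant, $\rho(U, V) = 0$ is then equivalent to $\theta = 1$ under the PLL model.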

Orthogonal Decomposition of the PI Model
Let $n_{ij}$ denote the observed frequency in the cell in the $i$th row and $j$th column of the table ($i = 1, \ldots, r$; $j = 1, \ldots, c$). Assume that a multinomial distribution applies to the $r \times c$ table. The maximum likelihood estimates of expected frequencies under the models can be obtained by using an iterative procedure, for example, the general iterative procedure for loglinear models of Darroch and Ratcliff (1972), or by applying the Newton-Raphson method to the log-likelihood equations.
Let $G^2(M)$ denote the likelihood ratio chi-squared statistic for testing goodness-of-fit of model $M$, namely,

$$G^2(M) = 2\sum_{i=1}^{r}\sum_{j=1}^{c} n_{ij}\log\left(\frac{n_{ij}}{\hat{m}_{ij}}\right),$$

where $\hat{m}_{ij}$ is the maximum likelihood estimate of the expected frequency $m_{ij}$ under the model $M$. Each model can be tested for goodness-of-fit by the likelihood ratio chi-squared statistic with the corresponding degrees of freedom (df). The numbers of df for the PI model, the PLL model, and $\rho(U, V) = 0$ are $rc - 3$, $rc - 4$, and 1, respectively. We obtain the following theorem:

Theorem 7 The test statistic $G^2(\mathrm{PI})$ is asymptotically equivalent to the sum of $G^2(\rho(U, V) = 0)$ and $G^2(\mathrm{PLL})$.
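The statistic and the df bookkeeping can be illustrated with a short Python sketch. It evaluates $G^2$ for given fitted frequencies and, because the PI and PLL fits require iteration, demonstrates it on the ordinary independence (I) model, whose fitted values have a closed form. The function names and the 2 × 2 counts are hypothetical and serve only as an illustration.

```python
import numpy as np

def g2(observed, fitted):
    """Likelihood ratio chi-squared statistic 2 * sum n_ij * log(n_ij / m_ij).

    Cells with n_ij = 0 contribute zero and are skipped.
    """
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    mask = observed > 0
    return 2.0 * np.sum(observed[mask] * np.log(observed[mask] / fitted[mask]))

def independence_fit(observed):
    """Closed-form fitted frequencies under the I model: n_i. * n_.j / n."""
    observed = np.asarray(observed, dtype=float)
    return np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

def degrees_of_freedom(r, c):
    """df stated in the text for an r x c table."""
    return {"PI": r * c - 3, "PLL": r * c - 4, "rho=0": 1}

counts = np.array([[10.0, 20.0], [30.0, 40.0]])  # hypothetical counts
print(g2(counts, independence_fit(counts)))
print(degrees_of_freedom(4, 3))  # a 4 x 3 table, as in Example 1
```

The df are additive in the same way as the statistics in Theorem 7: the df for the PI model equals the df for the PLL model plus the single df for $\rho(U, V) = 0$.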

Examples
In this section, we use the known integer scores $\{u_i = i\}$, $\{v_j = j\}$ for rows and columns to simplify the problems.

Example 1
We consider the data in Table 1, taken from Grizzle et al. (1969). The four operations for treating duodenal ulcer patients correspond to removal of various amounts of the stomach. Operation A1 is drainage and vagotomy, A2 is 25% resection (antrectomy) and vagotomy, A3 is 50% resection (hemigastrectomy) and vagotomy, and A4 is 75% resection. The categories of the operation variable have a natural ordering. The dumping severity variable describes the extent of an undesirable potential consequence of the operation (none, slight and moderate), and its categories are also ordered.
When we apply the PLL model to these data, the PLL model fits well with $G^2 = 7.87$ based on df = 8. The PI model also fits well with $G^2 = 13.61$ based on df = 9. For testing the hypothesis that the PI model holds under the assumption that the PLL model holds, the likelihood ratio statistic $G^2(\mathrm{PI} \mid \mathrm{PLL})$ is given as $G^2(\mathrm{PI}) - G^2(\mathrm{PLL}) = 5.74$ based on df = 9 − 8 = 1. Therefore, this hypothesis is rejected at the 0.05 significance level. Hence we prefer the PLL model to the PI model for the data in Table 1. Also, the likelihood ratio statistic $G^2(\mathrm{PLL} \mid \mathrm{LL})$ is given as $G^2(\mathrm{PLL}) - G^2(\mathrm{LL}) = 3.28$ based on df = 8 − 5 = 3. Therefore, this hypothesis is accepted at the 0.05 significance level. Hence we prefer the PLL model to the LL model for the data in Table 1. Under the PLL model, the maximum likelihood estimates of $\alpha$ and $\beta$ are 0.83 and 0.32, respectively, and the maximum likelihood estimate of $\theta$ is 1.16. From Table 3, for any fixed row $i$, all local odds $\omega_{ij}^X$ ($j = 1, 2$) are estimated to be smaller than 1. Also, the odds $\omega_{i1}^X$ (and $\omega_{i1}^X\omega_{i2}^X$) are estimated to increase as the row $i$ increases. Thus it is inferred that the dumping severity tends to worsen as the operation level increases.
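The two conditional tests above can be reproduced from the reported statistics alone. The sketch below uses closed-form chi-squared tail probabilities for df = 1 and df = 3 so that only the standard library is needed; the function names are illustrative, and the $G^2$ values are those quoted in the text.

```python
import math

def chi2_sf_df1(x):
    """Upper tail probability of a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_df3(x):
    """Upper tail probability of a chi-squared variable with 3 df."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

# Values reported in Example 1.
g2_pi, df_pi = 13.61, 9
g2_pll, df_pll = 7.87, 8

# PI against PLL: difference of statistics on df_pi - df_pll = 1 df.
g2_pi_given_pll = g2_pi - g2_pll
p_pi_given_pll = chi2_sf_df1(g2_pi_given_pll)

# PLL against LL: the reported difference is 3.28 on 8 - 5 = 3 df.
g2_pll_given_ll = 3.28
p_pll_given_ll = chi2_sf_df3(g2_pll_given_ll)

print(f"G2(PI|PLL) = {g2_pi_given_pll:.2f}, reject at 0.05: {p_pi_given_pll < 0.05}")
print(f"G2(PLL|LL) = {g2_pll_given_ll:.2f}, reject at 0.05: {p_pll_given_ll < 0.05}")
```

The first p-value falls below 0.05 and the second does not, in agreement with the conclusions drawn in the text.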

Example 2
The data in Table 4, taken from Fienberg (1980, p. 20), present the relationship between aptitude (as measured at an earlier date by a scholastic aptitude test) and occupation. Occupation level O1 is self-employed, business; O2 is self-employed, professional; O3 is teacher; and O4 is salaried, employed. From Table 5 we see that the PI and PLL models fit these data poorly; however, the hypotheses $\rho(U, V) = 0$, $\tau_b = 0$, and $\rho_s = 0$ are accepted. From Theorems 4, 5 and 6, we see that the poor fit of the PI model is caused by the lack of structure of the PLL model (not by the lack of $\rho(U, V) = 0$, $\tau_b = 0$, or $\rho_s = 0$).
From Table 5, we also see that the I model fits these data poorly. Thus, we can interpret that the row and column variables are not independent, although the correlations of the row and column variables are equal to zero. These data are an example of the fact that when the I model holds, $\rho(U, V) = 0$ is true; however, the converse does not always hold.
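The failure of the converse is easy to see on a small artificial table. The sketch below uses a hypothetical 3 × 3 probability table (not the data of Table 4) that is symmetric under reversal of the column order, so the Pearson covariance under integer scores vanishes while the cell probabilities clearly differ from the product of the margins.

```python
import numpy as np

# Hypothetical probabilities: mass on the four corners and the centre cell.
p = np.array([[0.2, 0.0, 0.2],
              [0.0, 0.2, 0.0],
              [0.2, 0.0, 0.2]])

u = np.array([1.0, 2.0, 3.0])  # integer row scores
v = np.array([1.0, 2.0, 3.0])  # integer column scores
row, col = p.sum(axis=1), p.sum(axis=0)

# Cov(U, V) = E(UV) - E(U) E(V); a zero covariance gives rho(U, V) = 0.
covariance = u @ p @ v - (u @ row) * (v @ col)

# Largest departure of a cell from the independence value p_i. * p_.j.
max_departure = np.abs(p - np.outer(row, col)).max()

print(f"Cov(U, V) = {covariance:.3f}")
print(f"max |p_ij - p_i. p_.j| = {max_departure:.3f}")
```

The zero cells mean this table cannot satisfy the LL or PLL model with a finite $\theta$, which is consistent with Theorems 1 and 4: a zero correlation implies independence only in combination with the association structure.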

Concluding Remarks
When the PI model fits the data poorly, Theorems 4, 5, and 6 may be useful for identifying the reason for the poor fit, namely, whether it is caused mainly by the lack of the structure $\rho(U, V) = 0$, $\tau_b = 0$ or $\rho_s = 0$, or by the lack of the structure of the PLL model. We point out from Theorem 7 that the statistic for testing the PI model under the assumption that the PLL model holds, i.e., $G^2(\mathrm{PI}) - G^2(\mathrm{PLL})$, is asymptotically equivalent to the statistic for testing $\rho(U, V) = 0$, i.e., $G^2(\rho(U, V) = 0)$. We emphasize that testing the PI model is not equivalent to testing $\rho(U, V) = 0$. We saw in Example 2 that the structure $\rho(U, V) = 0$ holds; however, the PI model does not hold.

Discussion
Tomizawa (1992) also described the parsimonious uniform (PU) association model. It is a special case of the PLL model obtained by using the integer scores $\{u_i = i\}$, $\{v_j = j\}$ or equal-interval scores for rows and columns. Theorems analogous to those in this paper may be obtained in a similar manner by replacing the PLL model with the PU model.

Table 1. Cross-classification of duodenal ulcer patients according to Operation and Dumping Severity; from Grizzle et al. (1969). (The parenthesized values are the maximum likelihood estimates of expected frequencies under the PLL model.)

Table 2. Likelihood ratio chi-squared values for testing the models and structures applied to Table 1.

Table 4. Cross-classification of subjects according to aptitude and occupation; from Fienberg (1980, p. 20). (The parenthesized values are the maximum likelihood estimates of expected frequencies under the structure of $\rho(U, V) = 0$.)

Table 5. Likelihood ratio chi-squared values for testing the models and structures applied to Table 4.