Parsimonious Bivariate T-distribution Type Symmetry Models for Square Contingency Tables

For square contingency tables with ordered categories, Iki, Ishihara and Tomizawa (2013) considered the t-distribution type symmetry model, and Iki, Okada and Tomizawa (2018) extended this model. These models are appropriate for a square contingency table if it is reasonable to assume an underlying bivariate t-distribution with arbitrary degrees of freedom. This study proposes three parsimonious versions of these models. Additionally, this paper provides decompositions of the parsimonious symmetry model using the proposed models. Simulation studies based on the bivariate t-distribution show the performance of the proposed models.


Introduction
In the analysis of contingency tables, we are interested in whether the two classificatory variables are independent of each other. When independence does not hold, we may use Pearson's correlation coefficient to estimate the correlation between the two variables. Additionally, it is important to interpret the data and to propose models that fit the data well. Goodman (1979) considered the uniform association model, and Agresti (1983a) considered the linear-by-linear association model.
In particular, we consider tables with the same row and column classifications, which are known as square contingency tables. For square contingency tables, independence between the row and column variables is unlikely to hold because many observations fall in the main diagonal cells, where the row category is the same as the column category. Therefore, for the analysis of square contingency tables, instead of independence, we are interested in whether the row variable is symmetric with the column variable. The symmetry (S) model (Bowker, 1948), the marginal homogeneity model (Stuart, 1955) and the quasi-symmetry model (Caussinus, 1965) have been proposed as models of symmetry. For further research on the symmetry model, see Yoshimoto et al. (2019), Ando et al. (2021) and Shinoda et al. (2021).
We consider an $r \times r$ square contingency table with the same row and column ordinal classifications. Let $p_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table ($i = 1, \ldots, r$; $j = 1, \ldots, r$). The S model is defined by
$$p_{ij} = p_{ji} \quad (i < j);$$
see Bishop et al. (1975, p. 282). This model indicates a structure of symmetry of the probabilities with respect to the main diagonal of the table.
Agresti (1983b) considered the linear diagonals-parameter symmetry (LDPS) model defined by
$$p_{ij} = \theta^{j-i} p_{ji} \quad (i < j).$$
This indicates that the probability of an observation falling in the $(i, j)$th cell, $i < j$, is $\theta^{j-i}$ times higher than the probability of it falling in the $(j, i)$th cell. The S model is the special case of the LDPS model obtained by putting $\theta = 1$. Tomizawa (1991) proposed an extended linear diagonals-parameter symmetry (ELDPS) model defined by
$$p_{ij} = \theta_1^{j-i} \theta_2^{j^2-i^2} p_{ji} \quad (i < j).$$
This indicates that the probability of an observation falling in the $(i, j)$th cell, $i < j$, is $\theta_1^{j-i} \theta_2^{j^2-i^2}$ times higher than the probability of it falling in the $(j, i)$th cell.
Agresti (1983; 1984, p. 216) described the relationship between the LDPS model and the joint bivariate normal distribution as follows: the LDPS model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution with equal marginal variances. Moreover, Tomizawa (1991) pointed out that the ELDPS model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution with different marginal variances.
For any fixed constant $m$ ($m > 2$), Iki et al. (2013) proposed the t-distribution type symmetry (TS($m$)) model defined by
$$p_{ij}^{-2/(m+2)} - p_{ji}^{-2/(m+2)} = (j - i)\,\eta_m \quad (i < j).$$
The S model is the special case of this model obtained by putting $\eta_m = 0$. The TS($m$) model indicates that the difference between the two symmetric probabilities, each raised to the power $-2/(m + 2)$, is proportional to the distance from the main diagonal of the $r \times r$ table.
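The ratio and power-difference structures described above can be checked numerically on a table of cell probabilities. The following is a minimal sketch, not code from the paper: the helper names (`ldps_table`, `implied_theta`, `ts_slopes`) and the symmetric base weights are hypothetical choices made only for illustration. It builds a table in which $p_{ij}$ is $\theta^{j-i}$ times $p_{ji}$ for $i < j$, recovers $\theta$ from every pair of symmetric cells, and computes the TS(m)-type quantity $(p_{ij}^{a} - p_{ji}^{a})/(j-i)$ with $a = -2/(m+2)$, which is constant over pairs when the TS(m) structure holds.

```python
def ldps_table(r, theta):
    """Build an r x r probability table with p_ij = theta**(j-i) * p_ji for i < j."""
    # Hypothetical symmetric base weights; any symmetric positive choice works.
    base = [[1.0 / (1 + abs(i - j)) for j in range(r)] for i in range(r)]
    weights = [[base[i][j] * (theta ** (j - i) if i < j else 1.0)
                for j in range(r)] for i in range(r)]
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]


def implied_theta(p):
    """Return (p_ij / p_ji)**(1/(j-i)) for every pair i < j."""
    r = len(p)
    return [(p[i][j] / p[j][i]) ** (1.0 / (j - i))
            for i in range(r) for j in range(i + 1, r)]


def ts_slopes(p, m):
    """Return (p_ij**a - p_ji**a) / (j - i), a = -2/(m+2), for every pair i < j."""
    a = -2.0 / (m + 2)
    r = len(p)
    return [(p[i][j] ** a - p[j][i] ** a) / (j - i)
            for i in range(r) for j in range(i + 1, r)]


p = ldps_table(4, 1.5)
thetas = implied_theta(p)                         # all equal to theta = 1.5
slopes_sym = ts_slopes(ldps_table(4, 1.0), m=5)   # all zero under symmetry
```

With $\theta = 1$ the table is symmetric, so every TS(m)-type slope is zero, which corresponds to the S model.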
The TS($m$) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with equal marginal variances having $m$ degrees of freedom (see Iki et al., 2013). For any fixed constant $m$ ($m > 2$), Iki et al. (2018) proposed the extended t-distribution type symmetry (ETS($m$)) model, which includes an additional parameter $\gamma_m$. The TS($m$) model is the special case of the ETS($m$) model obtained by putting $\gamma_m = 0$. The ETS($m$) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances having $m$ degrees of freedom (see Iki et al., 2018). Now, we are interested in considering more parsimonious t-distribution type symmetry models, which can be described in terms of fewer parameters than the TS($m$) and ETS($m$) models.
The purpose of this paper is to propose new models which may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate t-distribution. The new models are different from the S, TS($m$) and ETS($m$) models. Section 2 proposes the models and describes their properties. Section 3 gives decompositions using the proposed models. Section 4 describes the goodness-of-fit test and the maximum likelihood estimates of expected frequencies under the proposed models. Section 5 examines the relationships between the proposed models and the t-distribution through a simulation study. Section 6 provides some concluding remarks.

Models
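We consider random variables $U$ and $V$ having a joint bivariate t-distribution with $m$ ($m > 2$) degrees of freedom, with means $E(U) = \mu_1$, $E(V) = \mu_2$, variances $\mathrm{Var}(U) = m\sigma_1^2/(m-2)$, $\mathrm{Var}(V) = m\sigma_2^2/(m-2)$, and correlation coefficient $\mathrm{Corr}(U, V) = \rho$; see Muirhead (2005, p. 48). The probability density function is
$$f(u, v) = c\left[1 + \frac{1}{m(1-\rho^2)}\left\{\left(\frac{u-\mu_1}{\sigma_1}\right)^2 - 2\rho\,\frac{(u-\mu_1)(v-\mu_2)}{\sigma_1\sigma_2} + \left(\frac{v-\mu_2}{\sigma_2}\right)^2\right\}\right]^{-(m+2)/2}, \qquad (1)$$
where $c = 1/(2\pi\sigma_1\sigma_2\sqrt{1-\rho^2})$. Equation (2) denotes the form of (1) with $\sigma_1^2 = \sigma_2^2$, and equation (3) denotes the form of (1) with $\mu_1 = \mu_2$ and $\sigma_1^2 = \sigma_2^2$.
We consider the $r \times r$ square contingency table with ordered categories. For any fixed constant $m$ ($m > 2$), we propose a model that we shall refer to as the parsimonious symmetry (PaS($m$)) model. From the form of equation (3), the PaS($m$) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with the same marginal means and variances having $m$ degrees of freedom. Under the PaS($m$) model, we see that $p_{ij} = p_{ji}$ for all $i < j$. Namely, the PaS($m$) model implies the S model.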
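To make the role of the marginal means and variances concrete, the density of the bivariate t-distribution can be evaluated directly. The sketch below assumes the standard parameterization with location parameters `mu1`, `mu2`, scale parameters `s1`, `s2`, correlation `rho` and `m` degrees of freedom; the function name `bivariate_t_pdf` and its argument names are hypothetical, and the code is an illustration rather than the paper's own implementation. It shows that the density is symmetric in its two arguments when the means and scales are equal, which is the situation the PaS(m) model is meant for, and that symmetry is lost when the means differ.

```python
import math


def bivariate_t_pdf(u, v, m, mu1=0.0, mu2=0.0, s1=1.0, s2=1.0, rho=0.0):
    """Density of the bivariate t-distribution with m degrees of freedom.

    Assumes the standard form c * (1 + Q/m)**(-(m+2)/2), where Q is the
    quadratic form of the standardized variables.
    """
    c = 1.0 / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho ** 2))
    x = (u - mu1) / s1
    y = (v - mu2) / s2
    q = (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho ** 2)
    return c * (1.0 + q / m) ** (-(m + 2) / 2.0)


# Equal means and scales: f(u, v) equals f(v, u).
f_uv = bivariate_t_pdf(0.3, -1.2, m=5, rho=0.4)
f_vu = bivariate_t_pdf(-1.2, 0.3, m=5, rho=0.4)

# Different means: the symmetry no longer holds.
g_uv = bivariate_t_pdf(0.3, -1.2, m=5, mu2=0.5, rho=0.4)
g_vu = bivariate_t_pdf(-1.2, 0.3, m=5, mu2=0.5, rho=0.4)
```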
Next, for any fixed constant $m$ ($m > 2$), we propose a model that we shall refer to as the parsimonious t-distribution type symmetry (PaTS($m$)) model. From the form of equation (2), the PaTS($m$) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with the same marginal variances (and different marginal means) having $m$ degrees of freedom. The PaS($m$) model is the special case of the PaTS($m$) model obtained by putting $\alpha_1 = \beta_1$. Under the PaTS($m$) model, the difference $p_{ij}^{-2/(m+2)} - p_{ji}^{-2/(m+2)}$ is proportional to $j - i$ for $i < j$. Namely, the PaTS($m$) model implies the TS($m$) model. Additionally, under the PaTS($m$) model, setting $\omega_{ij} = \mu + \alpha_1 i + \beta_1 j + \tau(i^2 + j^2) + \varphi ij$, we see that the ratio $p_{ij}/p_{ji}$ ($i < j$) approaches the LDPS form $\theta^{j-i}$ as $m \to \infty$. Namely, the PaTS($m$) model approaches the LDPS model as $m$ becomes larger.
Moreover, for any fixed constant $m$ ($m > 2$), we propose a model that we shall refer to as the parsimonious extended t-distribution type symmetry (PaETS($m$)) model. From the form of equation (1), the PaETS($m$) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with different marginal means and variances having $m$ degrees of freedom. The PaTS($m$) model is the special case of the PaETS($m$) model obtained by putting $\alpha_2 = \beta_2$. Under the PaETS($m$) model, the defining relation of the ETS($m$) model is satisfied. Namely, the PaETS($m$) model implies the ETS($m$) model. Further, under the PaETS($m$) model, we see that the ratio $p_{ij}/p_{ji}$ ($i < j$) approaches the ELDPS form $\theta_1^{j-i}\theta_2^{j^2-i^2}$ as $m \to \infty$. Namely, the PaETS($m$) model approaches the ELDPS model as $m$ becomes larger.
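The statements that the t-distribution type models approach the LDPS and ELDPS models as $m$ grows rest on a standard limit that links the t-type kernel to the normal-type kernel. The following is a sketch of that limit only, not the paper's own derivation; $x$ and $y$ are generic nonnegative quantities introduced here for illustration.

```latex
\[
\lim_{m\to\infty}\left(1+\frac{x}{m}\right)^{-(m+2)/2} = e^{-x/2},
\qquad\text{so that}\qquad
\lim_{m\to\infty}
\frac{\left(1+x/m\right)^{-(m+2)/2}}{\left(1+y/m\right)^{-(m+2)/2}}
= e^{-(x-y)/2}.
\]
```

If $x - y$ is linear in $j - i$, the limiting ratio has the form $\theta^{j-i}$; if it also contains a term in $j^2 - i^2$, the limiting ratio has the form $\theta_1^{j-i}\theta_2^{j^2-i^2}$.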

Figure 1. Relationships among models
Figure 1 shows the relationships among the models. In the figure, A → B indicates that model A implies model B.

Decompositions of Models
Consider the $r \times r$ square contingency table. Let $X$ and $Y$ denote the row and column variables, respectively. We refer to the model of equality of marginal means, that is, $E(X) = E(Y)$, as the ME model. Additionally, we refer to the model of equality of marginal means and variances, that is, $E(X) = E(Y)$ and $\mathrm{Var}(X) = \mathrm{Var}(Y)$, as the MVE model. Then, we obtain the following theorems.
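Theorem 1 The PaS($m$) model holds, if and only if both the PaETS($m$) and MVE models hold.
Proof. If the PaS($m$) model holds, then the PaETS($m$) and MVE models hold, because the PaS($m$) model is a special case of the PaETS($m$) model and implies the S model. Conversely, assume that both the PaETS($m$) and MVE models hold. Because the MVE model is equivalent to $E(X) = E(Y)$ and $E(X^2) = E(Y^2)$, we have $\sum\sum_{i<j}(j - i)(p_{ij} - p_{ji}) = 0$ and $\sum\sum_{i<j}(j^2 - i^2)(p_{ij} - p_{ji}) = 0$. Thus, when we assume that the PaETS($m$) and MVE models hold, we can obtain $p_{ij} = p_{ji}$ for all $i < j$. Moreover, because $p_{ij} - p_{ji} = 0$ for all $i < j$, the asymmetric part of the PaETS($m$) model must vanish. Therefore we obtain $\alpha_1 = \beta_1$ and $\alpha_2 = \beta_2$. Namely, the PaS($m$) model holds. The proof is completed.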

Theorem 2 The PaS(m) model holds, if and only if both the PaTS(m) and ME models hold.
The proof of Theorem 2 is omitted because it can be obtained in a way similar to that of Theorem 1.
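The ME and MVE conditions used in the decompositions can be checked directly from a table of cell probabilities. The sketch below assumes integer category scores $1, \ldots, r$ for both variables; the function names (`marginal_moments`, `satisfies_me`, `satisfies_mve`) and the tolerance are hypothetical choices for illustration.

```python
def marginal_moments(p):
    """Return E(X), E(Y), Var(X), Var(Y) using category scores 1..r."""
    r = len(p)
    scores = range(1, r + 1)
    row = [sum(p[i]) for i in range(r)]                       # row marginals
    col = [sum(p[i][j] for i in range(r)) for j in range(r)]  # column marginals
    ex = sum(s * q for s, q in zip(scores, row))
    ey = sum(s * q for s, q in zip(scores, col))
    vx = sum(s * s * q for s, q in zip(scores, row)) - ex ** 2
    vy = sum(s * s * q for s, q in zip(scores, col)) - ey ** 2
    return ex, ey, vx, vy


def satisfies_me(p, tol=1e-12):
    """ME: equality of marginal means."""
    ex, ey, _, _ = marginal_moments(p)
    return abs(ex - ey) < tol


def satisfies_mve(p, tol=1e-12):
    """MVE: equality of marginal means and variances."""
    ex, ey, vx, vy = marginal_moments(p)
    return abs(ex - ey) < tol and abs(vx - vy) < tol


symmetric = [[0.2, 0.1, 0.05], [0.1, 0.2, 0.05], [0.05, 0.05, 0.2]]
asymmetric = [[0.2, 0.3], [0.1, 0.4]]
```

A symmetric table always satisfies both conditions, whereas the asymmetric 2 × 2 example has $E(X) = 1.5$ and $E(Y) = 1.7$, so it violates ME.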

Goodness-of-fit Test
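For an $r \times r$ contingency table, let $n_{ij}$ denote the observed frequency in the $i$th row and $j$th column of the table, where $n = \sum\sum n_{ij}$, and let $m_{ij}$ denote the corresponding expected frequency ($i = 1, \ldots, r$; $j = 1, \ldots, r$). Assume that the observed frequencies have a multinomial distribution. Let $G^2(M)$ denote the likelihood ratio chi-squared statistic for testing the goodness of fit of model M, defined by
$$G^2(M) = 2\sum_{i=1}^{r}\sum_{j=1}^{r} n_{ij}\log\left(\frac{n_{ij}}{\hat{m}_{ij}}\right),$$
where $\hat{m}_{ij}$ is the maximum likelihood estimate of $m_{ij}$ under model M. The numbers of degrees of freedom for the PaTS($m$) and PaETS($m$) models are $r^2 - 5$ and $r^2 - 6$, respectively; because the PaS($m$) model imposes one more restriction ($\alpha_1 = \beta_1$) than the PaTS($m$) model, its number of degrees of freedom is $r^2 - 4$. We consider the maximum likelihood estimates of the expected frequencies $\{m_{ij}\}$ under the PaS($m$), PaTS($m$) and PaETS($m$) models, obtained from the log-likelihood equation. For the PaS($m$) model, we must maximize the corresponding Lagrangian.
International Journal of Statistics and Probability, Vol. 11, No. 5; 2022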
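The likelihood ratio statistic compares observed counts with fitted expected counts, $G^2 = 2\sum\sum n_{ij}\log(n_{ij}/\hat{m}_{ij})$. The sketch below illustrates the computation using the S model, whose maximum likelihood estimates have the well-known closed form $\hat{m}_{ij} = (n_{ij} + n_{ji})/2$ with $r(r-1)/2$ degrees of freedom. The proposed PaS(m), PaTS(m) and PaETS(m) models require numerical maximization, which is not shown here. The function names and the count table are hypothetical and serve only as an illustration.

```python
import math


def g_squared(observed, expected):
    """Likelihood ratio chi-squared statistic; cells with n_ij = 0 contribute 0."""
    total = 0.0
    for n_row, m_row in zip(observed, expected):
        for n_ij, m_ij in zip(n_row, m_row):
            if n_ij > 0:
                total += n_ij * math.log(n_ij / m_ij)
    return 2.0 * total


def fit_symmetry(observed):
    """Closed-form MLEs of expected frequencies under the S model."""
    r = len(observed)
    return [[(observed[i][j] + observed[j][i]) / 2.0 for j in range(r)]
            for i in range(r)]


# Hypothetical 3 x 3 table of counts, for illustration only.
counts = [[20, 6, 2],
          [10, 25, 7],
          [4, 9, 30]]
g2_s = g_squared(counts, fit_symmetry(counts))
df_s = len(counts) * (len(counts) - 1) // 2   # r(r-1)/2 for the S model
```

The resulting statistic would be compared with the chi-squared distribution with `df_s` degrees of freedom; for the proposed models the same `g_squared` function applies once their fitted frequencies are available.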

Simulation Study
As described in Section 2, the PaS($m$), PaTS($m$) and PaETS($m$) models may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate t-distribution having $m$ degrees of freedom. Using simulation studies, we examine the relationships between the proposed models and the bivariate t-distribution, and compare the proposed models with the S, TS($m$) and ETS($m$) models.
For each condition, we consider 10,000 simulated 4 × 4 tables and count how often each model with the corresponding $m$ degrees of freedom is accepted (at the 0.05 significance level) by the test based on the likelihood ratio chi-squared statistic.
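One replication of such a simulation can be sketched as follows: draw pairs from a bivariate t-distribution and cross-classify them into a 4 × 4 table. The cut points, the sample size, the choice $\mu_1 = 0$ and $\sigma_1 = 1$, and the function name `simulate_table` are all assumptions made for illustration, since the settings of the study are not restated here. Sampling uses the standard representation of a t variate as a normal variate divided by the square root of an independent chi-squared variate over its degrees of freedom.

```python
import bisect
import math
import random


def simulate_table(n, m, mu2=0.0, sigma2=1.0, rho=0.0,
                   cuts=(-0.6, 0.0, 0.6), seed=0):
    """Cross-classify n draws from a bivariate t-distribution (m df).

    Assumes mu1 = 0 and sigma1 = 1; `cuts` are hypothetical category boundaries.
    """
    rng = random.Random(seed)
    r = len(cuts) + 1
    table = [[0] * r for _ in range(r)]
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        w = rng.gammavariate(m / 2.0, 2.0)   # chi-squared with m df
        scale = math.sqrt(m / w)
        u = z1 * scale
        v = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2) * scale
        table[bisect.bisect(cuts, u)][bisect.bisect(cuts, v)] += 1
    return table


low_corr = simulate_table(5000, m=5, rho=0.0)
high_corr = simulate_table(5000, m=5, rho=0.8)
diag_low = sum(low_corr[k][k] for k in range(4)) / 5000.0
diag_high = sum(high_corr[k][k] for k in range(4)) / 5000.0
```

Repeating this for many seeds and fitting each model to every table would give acceptance frequencies of the kind reported in the study. The comparison of `diag_low` and `diag_high` illustrates that a larger correlation concentrates observations on the main diagonal.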
From Tables 1 and 2, we see that the ETS($m$) model fits well under all conditions. Further, the TS($m$) model fits well when $\sigma_2^2 = 1$, and the S model fits well only when $\mu_2 = 0$ and $\sigma_2^2 = 1$. In contrast, the PaS($m$), PaTS($m$) and PaETS($m$) models show a similar trend when $\rho$ is close to 0. Thus, from the results of this simulation, we conclude that if it is reasonable to assume an underlying bivariate t-distribution with a low correlation coefficient, the parsimonious models would fit the data well.

Concluding Remarks
Each of the S, TS($m$) and ETS($m$) models is saturated on the main diagonal cells of the table, but the PaS($m$), PaTS($m$) and PaETS($m$) models are unsaturated on them. Thus, under the PaS($m$), PaTS($m$) and PaETS($m$) models, the estimated expected frequencies on the main diagonal are not necessarily equal to the observed frequencies on the main diagonal. The PaS($m$), PaTS($m$) and PaETS($m$) models may be useful when we want to utilize the information on the main diagonal.
From Section 5, when observations are not strongly concentrated in the main diagonal cells, that is, when the correlation coefficient between the row and column variables is close to 0, the proposed models (PaS($m$), PaTS($m$) and PaETS($m$)) may be better suited to a square table than the S, TS($m$) and ETS($m$) models.