A Multivariate Process Variability Monitoring Based on Individual Observations

To better understand whether an additional observation has changed the covariance structure, a new statistic is introduced. It is based on the difference between the scatter matrix issued from the augmented data set and that issued from the historical data set. Under the normality assumption, the distribution of the Frobenius norm of this difference is derived and, for practical purposes, a chi-square approximation is presented. This statistic and Wilks' statistic are used to construct a new procedure for monitoring process variability based on individual observations. The procedure performs promisingly in revealing the effect of an additional observation on the covariance structure. An industrial application is presented to illustrate its advantage.


Introduction
Over the last decades, the notion of manufacturing process quality has become more and more complex. This is the main reason why quality experts have been considering process quality in a multivariate setting. In this setting, one of the most important parameters is process variability. The stability of this shape parameter, which is numerically represented by the covariance matrix, must be monitored. In general, there are three scenarios in multivariate process variability monitoring. The first is based on sub-grouping where the sub-group size m is greater than the number of quality characteristics p. Articles in this scenario include Yeh, Huwang and Wu (2004), Djauhari (2005) and Yeh, Lin and McGrath (2006). The second is based on individual observations, i.e., m = 1, as presented, for example, in Tracy, Young and Mason (1992), Sullivan and Woodall (1996), Khoo and Quah (2003), Huwang, Yeh and Wu (2007) and, very recently, Mason, Chou and Young (2009, 2010). In this scenario, the main problem is to test the effect of an additional observation on the covariance structure. The third and most recent scenario, introduced in Mason, Chou and Young (2009), is based on sub-grouping where 1 < m < p.
The idea behind the present paper was inspired by the use of Wilks' statistic (1963) for the second scenario. This monitoring procedure was originally introduced by Mason, Chou and Young (2009) and developed further in Mason, Chou and Young (2010) in order to identify the quality characteristics that contribute to an out-of-control signal. What makes Wilks' statistic important in this area of industrial application is that it has a direct and simple geometrical interpretation and is easy to implement in practice, especially when p is not too large. Based on Wilks' statistic, the effect of an additional observation on the covariance structure is measured as the ratio of the determinant of the scatter matrix issued from a historical data set (HDS) to that issued from the augmented data set (ADS). The latter data set consists of HDS and an additional observation. The statistic is thus proportional to the ratio of the generalized variance (GV) of HDS to that of ADS. Geometrically, see Anderson (2003), it is the ratio of the volume of the p-dimensional parallelotope related to HDS to that related to ADS.
Since the covariance structure is completely determined by the eigenvalues and eigenvectors of the covariance matrix, the use of Wilks' statistic to detect the effect of an additional observation on the covariance structure might be misleading. This is because GV is only the product of all eigenvalues. Wilks' statistic might therefore fail to detect that effect even though the covariance structure has actually changed. To illustrate the situation, it is sufficient to consider two different covariance matrices having the same GV. Let us consider two hypothetical covariance matrices $\Sigma_1$ and $\Sigma_2$. These covariance matrices represent two different covariance structures. The variances of the first and the second variables, and also the correlation coefficient between them, represented by $\Sigma_1$ are totally different from those represented by $\Sigma_2$. The two matrices have different sets of eigenvalues. They have different Frobenius norms, i.e., 252 for $\Sigma_1$ and 553 for $\Sigma_2$. However, they have the same GV, which is equal to 36. Djauhari, Mashuri and Herwindiati (2008) use the Frobenius norm of the covariance matrix as another multivariate dispersion measure besides GV. In that paper this measure is used in process variability monitoring under the first scenario.
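As a concrete check of this argument, the following minimal sketch compares two 2 x 2 covariance matrices with equal GV but different Frobenius norms. The matrices `sigma_1` and `sigma_2` are hypothetical: they were chosen only because they reproduce the quoted figures (36, 252 and 553) and are not necessarily the matrices intended in the text. With this choice, 252 and 553 arise as squared Frobenius norms, i.e., sums of squared entries.

```python
# Hypothetical 2 x 2 covariance matrices (assumed for illustration only).
sigma_1 = [[15.0, 3.0], [3.0, 3.0]]
sigma_2 = [[20.0, 8.0], [8.0, 5.0]]


def generalized_variance(s):
    """Determinant of a 2 x 2 matrix, i.e., the product of its eigenvalues."""
    return s[0][0] * s[1][1] - s[0][1] * s[1][0]


def squared_frobenius_norm(s):
    """Sum of squared entries; for a symmetric matrix this equals Tr(S^2)."""
    return sum(x * x for row in s for x in row)


gv_1, gv_2 = generalized_variance(sigma_1), generalized_variance(sigma_2)
fro_1, fro_2 = squared_frobenius_norm(sigma_1), squared_frobenius_norm(sigma_2)

print(gv_1, gv_2)    # equal generalized variances
print(fro_1, fro_2)  # different squared Frobenius norms
```

A statistic that depends on the covariance matrix only through its determinant cannot tell these two structures apart, whereas one based on the Frobenius norm can.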
The above illustration indicates that the use of Wilks' statistic alone might not be sufficient to describe the effect of an additional observation on the covariance structure. This is a logical consequence of the use of GV as a multivariate dispersion measure. This measure has serious limitations, as mentioned in Montgomery (2005) and discussed in detail in Alt and Smith (1988). Therefore, another statistic is needed to better understand that effect. This is what we intend to discuss in this paper.
In what follows we introduce a new statistic that can be used, besides Wilks' statistic, for monitoring process variability based on individual observations. That statistic is constructed from the matrix D, defined as the scatter matrix issued from ADS minus that issued from HDS. The distribution of its Frobenius norm is derived and, for practical purposes, a chi-square approximation is presented. Based on these results we propose a new monitoring procedure which gives a better understanding of the effect of an additional observation on the covariance structure. These are the topics of the next section. In the third section, an industrial example is reported to illustrate the advantage of this procedure. In the last section, additional remarks close the presentation.

Proposed control charting procedure
Let $X_1, X_2, \ldots, X_n, X_{n+1}$ be a random sample drawn from a p-variate normal distribution with positive definite covariance matrix $\Sigma$. The realization of $X_1, X_2, \ldots, X_n$ will be used as HDS, and the union of a realization of $X_{n+1}$ and HDS is called ADS. See Mason, Chou and Young (2009) for further details. Let

$$SS_k = \sum_{i=1}^{k} (X_i - \bar{X}_k)(X_i - \bar{X}_k)^t, \qquad \bar{X}_k = \frac{1}{k}\sum_{i=1}^{k} X_i, \qquad (1)$$

where k = n, n+1. $SS_k$ is the scatter matrix issued from HDS if k = n and from ADS if k = n+1. Wilks (1963) proposes to use the following statistic to measure the effect of $X_{n+1}$ on the covariance structure,

$$W = \frac{|SS_n|}{|SS_{n+1}|} = \frac{(n-1)^p\,|S_n|}{n^p\,|S_{n+1}|}, \qquad (2)$$

where $|SS_k|$ is the determinant of $SS_k$ and $|S_k|$, with $S_k = SS_k/(k-1)$ the sample covariance matrix, is the GV issued from HDS if k = n and from ADS if k = n+1.
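The computation of W can be sketched as follows, assuming the usual definition of the scatter matrix as the sum of outer products of deviations from the mean vector. The function names and the data are made up for illustration, and the determinant helper is restricted to p = 2 to keep the example short.

```python
def mean_vector(data):
    """Component-wise mean of a list of equal-length observation vectors."""
    k = len(data)
    return [sum(col) / k for col in zip(*data)]


def scatter_matrix(data):
    """SS_k: sum of outer products of the deviations from the mean vector."""
    m = mean_vector(data)
    p = len(m)
    ss = [[0.0] * p for _ in range(p)]
    for x in data:
        d = [xi - mi for xi, mi in zip(x, m)]
        for i in range(p):
            for j in range(p):
                ss[i][j] += d[i] * d[j]
    return ss


def det2(a):
    """Determinant of a 2 x 2 matrix (this sketch is restricted to p = 2)."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def wilks_w(hds, x_new):
    """W = |SS_n| / |SS_{n+1}|, where ADS is HDS plus the new observation."""
    return det2(scatter_matrix(hds)) / det2(scatter_matrix(hds + [x_new]))


# Made-up historical data set (n = 6, p = 2), for illustration only.
hds = [[4.0, 2.0], [4.2, 2.1], [3.9, 2.3], [4.1, 1.8], [4.3, 2.2], [3.8, 1.9]]

w_typical = wilks_w(hds, [4.1, 2.0])  # new observation near the HDS mean
w_extreme = wilks_w(hds, [5.5, 0.5])  # new observation far from the HDS mean
print(w_typical, w_extreme)
```

W lies between 0 and 1 and decreases as the new observation moves away from the HDS, which is why only a lower control limit is needed for the W chart.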
Wilks also shows that W follows a Beta distribution with parameters (n - p)/2 and p/2. Due to the limitations of GV as a multivariate dispersion measure mentioned above, careful attention must be paid when using Wilks' statistic: two different scatter matrices might give the same value of W. To escape from this situation, we consider the matrix D defined as the scatter matrix issued from ADS minus that issued from HDS, i.e., $D = SS_{n+1} - SS_n$. We will see that the use of Wilks' statistic together with the Frobenius norm of D gives a better understanding of the effect of an additional observation on the covariance structure.
From (2) we know how to quantify the effect of an additional observation on the covariance structure using Wilks' statistic. In Proposition 1 we present another quantification method based on the Frobenius norm of D. Readers interested in the mathematical derivation are welcome to contact the author.

A new statistic
Proposition 1 shows that the statistic

$$F = \|D\|_F \qquad (3)$$

represents the effect of $X_{n+1}$ on the scatter matrix, measured using the Frobenius norm of D. Like Wilks' statistic, it can be used to test whether or not $X_{n+1}$ has significantly changed the covariance structure. However, W in (2) and F in (3) are two different statistics and might therefore lead to different statistical decisions. This indicates that the use of both statistics will provide a better understanding of the effect of $X_{n+1}$ on the covariance structure. The statistic F is still difficult to implement in practice because its distribution in Proposition 1 is impractical except when $\lambda_1, \lambda_2, \ldots, \lambda_p$ are all equal. In order to handle this problem, a chi-square approximation is discussed in the next sub-section.
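A minimal numerical sketch of this statistic is given below, assuming that F is the Frobenius norm of D, the ADS scatter matrix minus the HDS scatter matrix. It also checks F against a closed form that follows from the standard rank-one update of a scatter matrix, n/(n+1) times the squared Euclidean distance between the new observation and the HDS mean. The function names and the data are hypothetical.

```python
import math


def mean_vector(data):
    """Component-wise mean of a list of equal-length observation vectors."""
    k = len(data)
    return [sum(col) / k for col in zip(*data)]


def scatter_matrix(data):
    """Sum of outer products of the deviations from the mean vector."""
    m = mean_vector(data)
    p = len(m)
    ss = [[0.0] * p for _ in range(p)]
    for x in data:
        d = [xi - mi for xi, mi in zip(x, m)]
        for i in range(p):
            for j in range(p):
                ss[i][j] += d[i] * d[j]
    return ss


def f_statistic(hds, x_new):
    """Frobenius norm of D = SS_{n+1} - SS_n, computed directly."""
    ss_n = scatter_matrix(hds)
    ss_n1 = scatter_matrix(hds + [x_new])
    d = [[a - b for a, b in zip(r1, r0)] for r1, r0 in zip(ss_n1, ss_n)]
    return math.sqrt(sum(v * v for row in d for v in row))


def f_closed_form(hds, x_new):
    """n/(n+1) * ||x_new - mean(HDS)||^2, from the rank-one update of SS_n."""
    n = len(hds)
    m = mean_vector(hds)
    return n / (n + 1) * sum((xi - mi) ** 2 for xi, mi in zip(x_new, m))


# Made-up data (n = 6, p = 2), for illustration only.
hds = [[4.0, 2.0], [4.2, 2.1], [3.9, 2.3], [4.1, 1.8], [4.3, 2.2], [3.8, 1.9]]
x_new = [5.5, 0.5]

f_direct = f_statistic(hds, x_new)
f_short = f_closed_form(hds, x_new)
print(f_direct, f_short)
```

The closed form shows that F measures the distance of the new observation from the HDS mean in the original units, while W measures it relative to the HDS scatter. This is one way to see why the two statistics can disagree.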

A chi-square approximation
The result in Proposition 1 is still difficult to use in practice. To make it more practical, a chi-square approximation is presented in what follows. For many decades, see Solomon and Stephens (1977), it has been common practice to approximate the distribution of a linear combination of independent chi-square variables by that of $c\chi_r^2$, where c is a positive constant and r is the corresponding number of degrees of freedom, both determined by $\Sigma$. When $\Sigma$ is unknown, it is customary to replace $\Sigma$ by the sample covariance matrix issued from HDS, i.e., $S_n$; see, for example, Montgomery (2005). Therefore, the distribution of F can be further approximated by

$$c\chi_r^2, \qquad (4)$$

with c and r computed from $S_n$.
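One common way to choose c and r is to match the first two moments, and the sketch below assumes that choice; the exact expressions used in the paper are not recoverable from the text. A weighted sum of independent one-degree-of-freedom chi-square variables with weights equal to the eigenvalues of S has mean Tr(S) and variance 2 Tr(S^2), while c times a chi-square with r degrees of freedom has mean cr and variance 2c^2 r. Equating the two gives c = Tr(S^2)/Tr(S) and r = Tr(S)^2/Tr(S^2). The matrix `s_n` is hypothetical.

```python
def trace(s):
    """Sum of the diagonal entries."""
    return sum(s[i][i] for i in range(len(s)))


def trace_of_square(s):
    """Tr(S^2); for a symmetric S this is the sum of squared entries."""
    return sum(v * v for row in s for v in row)


def chi_square_constants(s):
    """Constants (c, r) of the two-moment approximation c * chi2_r (assumed)."""
    t1, t2 = trace(s), trace_of_square(s)
    c = t2 / t1        # scale constant
    r = t1 * t1 / t2   # degrees of freedom, generally not an integer
    return c, r


# Hypothetical sample covariance matrix from an HDS (illustration only).
s_n = [[15.0, 3.0], [3.0, 3.0]]
c, r = chi_square_constants(s_n)
print(c, r)

# Equal eigenvalues: the approximation reduces to sigma^2 * chi2_p.
c_eq, r_eq = chi_square_constants([[2.0, 0.0], [0.0, 2.0]])
print(c_eq, r_eq)
```

In the equal-eigenvalue case the constants reduce to the common eigenvalue and the dimension p, which agrees with the remark that the distribution of F is practical only in that case.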

Proposed procedure
The procedure to monitor multivariate process variability based on Wilks' statistic consists of plotting (i) the observed values of W and (ii) the lower control limit (LCL), which is equal to the $\alpha$-th quantile of the Beta distribution with parameters (n - p)/2 and p/2. An out-of-control signal occurs if the observed value of W is less than LCL. Here, $\alpha$ is the probability of false alarm. Instead of Wilks' statistic, we can also use the statistic F in Proposition 1, with its distribution approximated by (4). In this case, the control procedure consists of plotting (i) the observed values of F and (ii) the upper control limit (UCL), which is equal to $c\chi_{r;1-\alpha}^2$, i.e., c times the $(1-\alpha)$-th quantile of the chi-square distribution with r degrees of freedom. An out-of-control signal occurs if the observed value of F exceeds UCL. Since W and F are different statistics, and in order to handle the limitation of W, we propose to use both control charting procedures one after the other. In the next section, an industrial example illustrates the advantage of this procedure.
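The two control limits and the combined decision can be sketched as follows, under several stated assumptions. For p = 2 the Beta distribution with parameters (n - 2)/2 and 1 has distribution function x^((n-2)/2), so the LCL has a closed form. The chi-square quantile is computed with the Wilson-Hilferty approximation, swapped in here only to keep the example within the standard library; an exact quantile routine could be used instead. The values of `alpha`, `c`, `r`, `w` and `f` are hypothetical.

```python
from statistics import NormalDist


def beta_lcl_p2(n, alpha):
    """alpha-quantile of Beta((n - 2)/2, 1): valid for p = 2 only."""
    return alpha ** (2.0 / (n - 2))


def chi2_quantile_wh(q, r):
    """Wilson-Hilferty approximation to the q-quantile of chi-square(r)."""
    z = NormalDist().inv_cdf(q)
    h = 2.0 / (9.0 * r)
    return r * (1.0 - h + z * h ** 0.5) ** 3


def monitor(w, f, n, c, r, alpha):
    """Apply the W chart and the F chart to one new observation (p = 2)."""
    lcl = beta_lcl_p2(n, alpha)
    ucl = c * chi2_quantile_wh(1.0 - alpha, r)
    return {"LCL": lcl, "UCL": ucl, "W signals": w < lcl, "F signals": f > ucl}


# n = 40 and p = 2 as in the industrial example; all other inputs are made up.
in_control = monitor(w=0.95, f=0.2, n=40, c=0.05, r=1.6, alpha=0.0027)
out_of_control = monitor(w=0.50, f=2.0, n=40, c=0.05, r=1.6, alpha=0.0027)
print(in_control)
print(out_of_control)
```

Reporting the two signals separately, rather than merging them into one flag, keeps visible the cases in which only one of the statistics reacts to the new observation.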

Industrial example
We use Wilks' statistic (2) in monitoring the process variability of B-complex vitamin production at a pharmaceutical company, based on individual observations. There are two quality characteristics under consideration, i.e., $x_1$ (thickness of the tablet in mm) and $x_2$ (hardness of the tablet in kg/cm²). An HDS of size n = 40 gave [...] During process variability monitoring, 20 individual observations were observed. These data and the corresponding values of W and F are given in Table 1.

Proposition 1. Let $X_1, X_2, \ldots, X_n, X_{n+1}$ be a random sample of a p-variate normal distribution with positive definite covariance matrix $\Sigma$. If $D = SS_{n+1} - SS_n$, then $\|D\|_F$ has the same distribution as $\sum_{k=1}^{p} \lambda_k z_k^2$, where $z_1, z_2, \ldots, z_p$ are i.i.d. standard normal N(0,1) and $\lambda_k$ is the k-th eigenvalue of $\Sigma$.

Figure 1. W chart

Table 1. Individual observations and the values of the W and F statistics