Applying the Inverse Gaussian Distribution to the Assessment of Chemical Reactor Performance

Reliability (or survival) functions based on the inverse Gaussian distribution are used to predict the probability that repair times exceed a specified time period, and the probability that energy release exceeds a stipulated magnitude, in two (separate) experimental chemical reactors. Such predictions support decisions about future use of the reaction system on a pilot or a commercial scale. The approach illustrates a cross-fertilization of chemical reaction engineering/technology and applied probability theory, and aims to increase appreciation of the latter among chemical scientists, engineers and technologists.


Introduction
Since the description of the first-passage distribution of Brownian motion (Tweedie, M. C. K., 1945, 1956), the inverse Gaussian (also called inverse normal or Wald) distribution (IGD) has been applied to widely different domains, e.g., life testing and product/device reliability studies, airborne communication receiver performance, cardiology, hydrology, demography, linguistics, employment service, labour dispute resolution, and finance (Chhikara, R. S. & Folks, L. J., 1989a). Shown to serve as an approximate sample size distribution in sequential probability ratio tests (Wald, A., 1947), and with its statistical properties thoroughly investigated (Tweedie, M. C. K., 1957a, 1957b), the IGD has become a useful tool in describing distributions ranging from almost Gaussian to highly skewed. Its name has been stated (Wikipedia) to be misleading in the sense that, in contrast to the Gaussian distribution, which describes the level of a Brownian motion at a fixed time, the IGD describes the distribution of the time that a Brownian motion with positive drift takes to reach a fixed positive level.
To the author's knowledge, applications of the IGD have not hitherto been indicated in the chemical reactor engineering (research) literature. Motivation for the current paper stems partially from the reliability studies mentioned above, and partially from wind speed and energy studies (Chhikara, R. S. & Folks, L. J., 1989b; Bardsley, W. E., 1980), as a "philosophical" foundation for Illustration No. 1 (Section 3) and Illustration No. 2 (Section 4), respectively. Due to the stated lack of direct experimental information, the numerical data employed in the two illustrations are posited for the sole purpose of showing the potential scope of the IGD in the areas of interest, to raise the level of awareness of the subject matter, and to indicate the nature of experimental information required for the use of the described methods (the data were generated by careful consideration of quantitative examples in some of the cited references, e.g., Chhikara, R. S. & Folks, L. J., 1989).

Theoretical Foundations
In this section the mathematical apparatus for handling the two illustrations is briefly presented on the basis of pertinent literature on probability theory (e.g., Chhikara, R. S. & Folks, L. J., 1989c, 1989d; Weisstein, E. W., no date; Seshadri, V., 1993; Evans, M. et al., 2000). Detailed derivations and proofs of theorems, available in the cited sources, are omitted.

The Reliability Function of the IGD
In elementary terms, the reliability function R(x) = 1 - F(x) expresses the probability that a continuous time-elapsed-until-failure random variable X exceeds a specific value x: P[X ≥ x; x > 0]; F(x), the cumulative distribution function of X, is the probability that failure occurs by time X = x. In general, R(x) can be taken as a measure of survival of a random existence variable X, hence R(x) is also called the survival function in the risk theory literature. In the specific case of the IGD, it may be written as

$$R(x) = \Phi(z_1) - \exp\!\left(\frac{2\lambda}{\mu}\right)\Phi(z_2), \qquad x > 0 \tag{1}$$

where Φ is the standard Gaussian (normal) cumulative distribution function, which carries the location parameter (or mean) μ and the scale parameter λ through its standard normal variables

$$z_1 = \sqrt{\frac{\lambda}{x}}\left(1 - \frac{x}{\mu}\right), \qquad z_2 = -\sqrt{\frac{\lambda}{x}}\left(1 + \frac{x}{\mu}\right) \tag{2}$$
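The reliability function can be evaluated with a few lines of code. The following is a minimal sketch, assuming the standard closed form R(x) = Φ(√(λ/x)(1 - x/μ)) - exp(2λ/μ) Φ(-√(λ/x)(1 + x/μ)); the function names and the parameter values in the demonstration are hypothetical and are not taken from the illustrations. A crude numerical integration of the IGD density is included only as a cross-check of the closed form.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def ig_reliability(x: float, mu: float, lam: float) -> float:
    """R(x) = P[X >= x] for an inverse Gaussian variable (mean mu, scale lam)."""
    if x <= 0.0:
        return 1.0  # the variable is strictly positive
    root = math.sqrt(lam / x)
    z1 = root * (1.0 - x / mu)
    z2 = -root * (1.0 + x / mu)
    return std_normal_cdf(z1) - math.exp(2.0 * lam / mu) * std_normal_cdf(z2)


def ig_pdf(x: float, mu: float, lam: float) -> float:
    """Inverse Gaussian probability density at x > 0."""
    return math.sqrt(lam / (2.0 * math.pi * x ** 3)) * math.exp(
        -lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)
    )


def ig_reliability_numeric(x, mu, lam, upper=400.0, steps=40_000):
    """Cross-check: trapezoidal integral of the density from x to a large cut-off."""
    h = (upper - x) / steps
    total = 0.5 * (ig_pdf(x, mu, lam) + ig_pdf(upper, mu, lam))
    for i in range(1, steps):
        total += ig_pdf(x + i * h, mu, lam)
    return total * h


if __name__ == "__main__":
    mu, lam = 2.0, 3.0  # hypothetical parameter values
    for x in (0.5, 1.0, 2.0, 5.0):
        closed = ig_reliability(x, mu, lam)
        numeric = ig_reliability_numeric(x, mu, lam)
        print(f"x = {x:4.1f}  closed form = {closed:.5f}  numeric = {numeric:.5f}")
```

For large values of 2λ/μ the exponential factor can overflow in floating point; a production implementation would combine the two terms on a logarithmic scale, which this sketch does not attempt.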

Determination /Estimation of the Scale and Location Parameters
In practical situations, the true (population) parameters μ and λ can only be approximately determined as μ_s and λ_s from sets of experimental observations called samples. The approximations considered here are the maximum likelihood estimate (MLE) and the minimum variance unbiased estimate (MVUE).

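2.2.1 The ML Estimates μ_s, λ_s and R_s(x)
Given the experimental sample x_1, x_2, …, x_n, the parameter estimates are obtained as

$$\mu_s = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{3}$$

and

$$\lambda_s = \frac{n-1}{\sum_{i=1}^{n}\left(\dfrac{1}{x_i} - \dfrac{1}{\mu_s}\right)} \tag{4}$$

The reliability function estimate R_s(x) follows by combining these estimates with Eqs.(1) and (2).

For the MVU estimate of the reliability function, Eq.(5), ν is a fixed value of the random variable V ≡ (n-1)/λ_s, i.e., the sum in the denominator of Eq.(4), and G is the right-tail area under the curve of Student's t-distribution with (n-2) degrees of freedom. The x-dependent arguments in Eq.(5) are obtained from Eqs.(6) and (7), and the limits at which the estimate reaches 1 and 0 are given by Eq.(8).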
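The parameter estimates can be sketched in code as follows. This is an illustration under a stated assumption: μ_s is taken as the sample mean and λ_s as (n-1) divided by the sum of (1/x_i - 1/μ_s), the convention implied by the definition V ≡ (n-1)/λ_s and by the calculation λ_s = 19/12.8232 in Section 3. The textbook maximum likelihood estimate of λ divides n, rather than n-1, by the same sum. The function name and the sample values are hypothetical.

```python
def ig_estimates(sample):
    """Return (mu_s, lam_s, v) for a sample of strictly positive observations.

    v is the sum of (1/x_i - 1/mu_s); lam_s = (n - 1)/v follows the
    convention assumed in the lead-in (the strict MLE would be n/v).
    """
    n = len(sample)
    if n < 2 or any(x <= 0 for x in sample):
        raise ValueError("need at least two strictly positive observations")
    mu_s = sum(sample) / n
    v = sum(1.0 / x for x in sample) - n / mu_s
    if v <= 0.0:
        raise ValueError("all observations are equal; lambda is not estimable")
    lam_s = (n - 1) / v
    return mu_s, lam_s, v


if __name__ == "__main__":
    hypothetical_sample = [0.8, 1.5, 2.1, 3.4, 6.0]  # not the data of Table 1
    mu_s, lam_s, v = ig_estimates(hypothetical_sample)
    print(f"mu_s = {mu_s:.4f}, V = {v:.4f}, lambda_s = {lam_s:.4f}")
```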

Determining the Admissibility of the IGD for Fitting Experimental Data
In applying the widely known Kolmogorov-Smirnov (K-S) test (e.g., Manukian, E. B., 1986; Blank, L., 1980; Lapin, L. L., 1990; Porkess, R., 2005a; Miller, L. H., 1956), the goodness of fit is determined by comparing D_max, the largest magnitude of difference between the reliability function and the cumulative staircase function arising from experimental data sets, to critical D*(n; α) values of the K-S statistic at the α level of significance. If D_max > D*(n; α), the (null) hypothesis of good fit is rejected. Tables of the critical D-values are available in numerous literature sources, e.g., Porkess, R., 2005b; Miller, L. H., 1956; Lindley, D. V. & Scott, W. F., 1984; Powell, F. C., 1982; Beyer, W. H., 1966, 1968. The significant level α = 0.05 and the highly significant level α = 0.01, used routinely in conventional statistical testing of hypotheses, have been progressively complemented (if not yet replaced) by the so-called P-value, i.e., the particular value of α at which D_max computed in Eq.(9) becomes significant; rejection of the hypothesis of good fit thus requires essentially the (subjective) decision of the (presumably knowledgeable) experimenter.
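The comparison described above can be sketched as follows. The code assumes the usual K-S distance between a fitted cumulative distribution function F = 1 - R and the empirical staircase function; comparing R with the empirical survival staircase gives the same distance, because the two differences are equal in magnitude. The function name, the sample and the model CDF are hypothetical.

```python
import math


def ks_dmax(sample, cdf):
    """Largest absolute gap between a model CDF and the empirical staircase.

    At each ordered observation the model value is compared with the
    staircase just after the jump (i/n) and just before it ((i - 1)/n).
    """
    xs = sorted(sample)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d_max = max(d_max, abs(i / n - f), abs(f - (i - 1) / n))
    return d_max


if __name__ == "__main__":
    # Hypothetical data checked against a hypothetical exponential model.
    data = [0.2, 0.5, 0.9, 1.4, 2.3, 3.1]
    d = ks_dmax(data, lambda x: 1.0 - math.exp(-x / 1.5))
    print(f"D_max = {d:.4f}")
    # The fit is rejected at level alpha only if D_max exceeds the
    # tabulated critical value D*(n; alpha) for the sample size n.
```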

Illustration No.1: Repair Logistics for an Experimental Chemical Reactor
An experimental reactor for studying the viability of a novel (proprietary) process is envisaged to have undergone a series of repairs, due to numerous temporary breakdowns caused by improper operating conditions and various malfunctioning events. The condition for commissioning the reactor for further operation after the twenty breakdowns shown in Table 1, in terms of normalized repair time units (RTU) ranging from 0.38 to 18.79, is stated as follows: if the probability of an RTU exceeding 25.00 is only 0.5 % (or less), further experiments would be carried out in the same reactor; otherwise further experimental work would be suspended.
A preliminary (and somewhat superficial) evaluation of the observed RTU set, disregarding its skewed-to-the-right nature, posits a normal distribution-based reliability function with sample mean 3.4475 and sample variance 17.3955 as population parameter estimates. The resulting probability of repair time units larger than 25 being about two in ten million (R_s = 2.3701 × 10^-7), continued reactor operation was recommended. The experimental team questions the wisdom of ignoring skew, hence the validity of the decision, and, upon an in-depth literature search, proposes the inverse Gaussian distribution to deal with the available experimental data, and proceeds with the required calculations as shown below.
Since the true parameters are not known, they are replaced by estimates based on two powerful estimation methods: maximum likelihood and minimum variance unbiased estimation (Section 2.2).

Maximum Likelihood-based Analysis
The MLE are obtained via Eq.(3) and Eq.(4), respectively, and from Table 1, as μ_s = (60.17 + 7.68 + 1.14)/20 = 3.4495; since the summation in Eq.(4) yields 18.62112 - 20/3.4495 = 12.8232, it follows that λ_s = 19/12.8232 = 1.4817. Combining Eqs.(1) and (2) with these estimates gives the reliability function estimate, Eq.(10), whose values at the experimentally observed RTU are shown in the third column of Table 2. The fourth column, carrying the corresponding values of the K-S statistic, yields D_max = 0.1430, which is considerably less than even D*(20, 0.20) = 0.2310. Hence, the hypothesis of the data coming from an inverse Gaussian population could be rejected only at an error larger than 20 %, and Eq.(10) can be admitted as a reliability function for reactor repair times. Its numerical value, R_s(25) ≈ 0.019, almost four times the stipulated 0.5 % probability limit, argues against further use of the reactor.
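The arithmetic leading to the two estimates can be retraced directly from the sums quoted in the text. The short sketch below assumes that 60.17 + 7.68 + 1.14 is the sum of the twenty observations and that 18.62112 is the sum of their reciprocals; the variable names are hypothetical.

```python
n = 20                           # number of breakdowns (Table 1)
sum_x = 60.17 + 7.68 + 1.14      # sum of observed RTU, as quoted in the text
sum_reciprocals = 18.62112       # assumed to be the sum of 1/x_i, as quoted

mu_s = sum_x / n                 # location estimate
v = sum_reciprocals - n / mu_s   # sum of (1/x_i - 1/mu_s)
lam_s = (n - 1) / v              # scale estimate with the (n - 1) numerator used in the text

print(f"mu_s = {mu_s:.4f}, V = {v:.4f}, lambda_s = {lam_s:.4f}")
```

With these inputs the script reproduces the quoted μ_s, V and λ_s to the four decimal places given in the text.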

MVUE-based Analysis
In accordance with Section 2.2.2, and since V = 12.8232, the R_s = 1 and R_s = 0 thresholds are computed as L = 0.075 and U = 51.11, indicating "ab ovo" that the reliability function values related to the observed data fall fully between the thresholds. Eqs.(6) and (7) consequently yield the MVU estimate of the reliability function, whose variation with the RTU is presented in Table 3. The K-S test similarly fails to reject the hypothesis of its admissibility, with D_max = 0.1482. The 0.6 % probability yielded by R_s(25) = 0.0060 indicates borderline lack of compliance with the 0.5 % criterion; if the criterion is to be strictly obeyed, the decision is the same as in the MLE-based analysis.

Illustration No.2: Energy Distribution in an Experimental Chemical Reactor
In a preliminary study of a chemical reaction system, where multiple processes occur simultaneously in the experimental reactor, the distribution pattern of energy released is believed by the experimenters to approximate an IGD closely. Prior theoretical considerations seem to point to location parameter (distribution mean) 3.5 and scale parameter 1.3, normalized with respect to a certain reference state. The (stipulated) reliability function in terms of the normalized (random) energy variable X = x, Eq.(14), satisfies a prior requirement that the probability of X ≥ 3.0 be at least 25 %, inasmuch as R(3) ≈ 0.302.
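The prior requirement can be checked numerically. The sketch below assumes the standard closed form of the IGD reliability function, R(x) = Φ(√(λ/x)(1 - x/μ)) - exp(2λ/μ) Φ(-√(λ/x)(1 + x/μ)), with the posited μ = 3.5 and λ = 1.3. It also locates, by bisection, the energy level at which R(x) drops to 0.25, which is the third quartile of the distribution. Function names are hypothetical.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def ig_reliability(x: float, mu: float, lam: float) -> float:
    """P[X >= x] for an inverse Gaussian variable with mean mu and scale lam."""
    root = math.sqrt(lam / x)
    z1 = root * (1.0 - x / mu)
    z2 = -root * (1.0 + x / mu)
    return std_normal_cdf(z1) - math.exp(2.0 * lam / mu) * std_normal_cdf(z2)


def level_for_reliability(p, mu, lam, lo=1e-9, hi=1e3, iterations=200):
    """Bisection for the x at which R(x) = p; R decreases monotonically in x."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ig_reliability(mid, mu, lam) > p:
            lo = mid   # still above the target probability: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


MU, LAM = 3.5, 1.3                    # posited parameters from the text
r_at_3 = ig_reliability(3.0, MU, LAM)
requirement_met = r_at_3 >= 0.25      # prior requirement P[X >= 3.0] >= 25 %
x_at_25_percent = level_for_reliability(0.25, MU, LAM)

print(f"R(3.0) = {r_at_3:.3f}, requirement met: {requirement_met}")
print(f"R(x) = 0.25 at x = {x_at_25_percent:.2f}")
```

The computed R(3.0) is about 0.30, so the requirement is met with some margin; small differences from the value quoted in the text may arise from rounding.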
Experimental verification is sought via four independent observation sets of energy releases measured in a group of ten identical reactors run under identical experimental conditions (e.g., initial temperature, run time, initial reactant composition and pressure); the ten reactors were operated during four equal but separate, randomly chosen time periods. The observations, assembled in Table 4 in increasing order of the energy levels, are subjected to a modified conventional analysis of variance (ANOVA), called "analysis of reciprocals" (ANORE; Chhikara, R. S., & Folks, L. J., 1989f; Tweedie, M. C. K., 1957), with the (null) hypothesis of the four populations having the same mean, given that they all have the same scale parameter. The test statistic W, Eq.(15), is distributed approximately as the conventional Fisher-Snedecor F-statistic with degrees of freedom (k-1) and (n-k), respectively. The sample parameters in Eq.(15) are computed via Eqs.(16) and (17), where n = n_1 + n_2 + … + n_k is the total number of observations, and x_ij are the individual observations, i = 1,…,k; j = 1,…,n_i. The scale parameter estimate is obtained via Eq.(18). With reference to Table 4, n_i = 10; k = 4; n = 40; μ_s = 3.589, W = (4.3801/3)/(47.5549/36) = 1.1054, λ_s = 1.0341. Since W is smaller than the critical F-statistic with degrees of freedom k - 1 = 3 and n - k = 36, F(3; 36) ≈ 1.432 at a 0.25 level of significance, rejecting the equality of the four means would carry at least a 25 % error. Divergence of the resulting reliability function estimate, Eq.(19), from the a priori stipulated R(x) is minimal. The theoretically posited μ = 3.5 and λ = 1.3 appear to be on solid ground, at least in accordance with the available experimental observations.
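The structure of the analysis of reciprocals can be sketched as follows. The code assumes the standard decomposition of the total reciprocal sum into a between-group part, the sum over groups of n_i/x̄_i minus n/x̄, and a within-group part, the sum over groups and observations of (1/x_ij - 1/x̄_i), with the test statistic formed as the ratio of the two parts after division by their degrees of freedom. Whether this matches Eq.(15) term by term is an assumption. The function name and the data are hypothetical and are not those of Table 4.

```python
def anore_statistic(groups):
    """Return (w, between, within) for k groups of positive observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group reciprocal sum: sum_i n_i / mean_i  -  n / grand mean
    between = sum(len(g) / m for g, m in zip(groups, group_means)) - n / grand_mean
    # Within-group reciprocal sum: sum_i sum_j (1/x_ij - 1/mean_i)
    within = sum(sum(1.0 / x for x in g) - len(g) / m
                 for g, m in zip(groups, group_means))

    w = (between / (k - 1)) / (within / (n - k))
    return w, between, within


if __name__ == "__main__":
    hypothetical_groups = [
        [2.1, 3.4, 1.2, 5.6],
        [1.8, 2.9, 4.1, 3.3],
        [2.5, 6.2, 1.1, 2.0],
    ]
    w, between, within = anore_statistic(hypothetical_groups)
    print(f"between = {between:.4f}, within = {within:.4f}, W = {w:.4f}")
    # W is compared with the critical F value for (k - 1, n - k) degrees
    # of freedom taken from standard tables.
```

As a consistency check on the text, the quoted sums 4.3801 and 47.5549, divided by 3 and 36 degrees of freedom respectively, give a ratio of about 1.105.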

The Experimental Reactor in Section 3
Although the MLE and MVUE approaches produce, in general, very similar reliability functions with small divergence in predicting probabilities, weak inferences on either side of the criterion can ensue. Under less stringent performance criteria, however, identical conclusions would be reached. If, for instance, the probability limit for exceeding 25 RTU were set to 2 % or higher, both the MLE and the MVUE method would favour continuation with the reactor of interest. Such calculations are necessary for setting thresholds by engineers, scientists and managers, who also draw on their personal experience and professional judgment.
Admissibility of the inverse Gaussian distribution does not, in principle, exclude other distributions known to the literature on reliability analysis (e.g., Weibull, lognormal, exponential). Related to the more general field of model discrimination, this topic is beyond the scope of the current paper.

The Experimental Reactor in Section 4
In addition to the satisfied prior requirement P[X ≥ 3.0] ≥ 1/4, the third quartile Q_3 estimates (defined by P[X ≤ Q_3] = 3/4), 3.74 via Eq.(14) and 3.55 via Eq.(19), are found to be slightly different. In a more advanced application of hypothesis testing, ANORE can also be applied to test the (null) hypothesis that the four population scale parameters are equal. The test (Chhikara, R. S. & Folks, J. L., 1989g) is reminiscent of the conventional Bartlett test of homogeneity associated with ANOVA for the equality of population variances. Recalling Eq.(4), and defining the corresponding sums v_i for the individual groups, i = 1,…,k, with V = v_1 + v_2 + … + v_k, the modified Bartlett parameters M and C are computed via Eqs.(21) and (22) in the equal-size case n_1 = n_2 = … = n_k = n, thus N = kn. Since n = 10, k = 4, v_1 = 7.2612, v_2 = 15.4001, v_3 = 6.8053, v_4 = 4.1144 and V = 33.5810, Eqs.(21) and (22) yield M = 4.1537 and C = 1.0463, respectively. The M/C ratio is distributed approximately as a conventional chi-square variable with (k-1) degrees of freedom. M/C = 3.97 being somewhat lower than the critical value χ²(3) = 4.11 at a 0.25 level of significance, rejecting the hypothesis of the commonality of scale parameters, of which λ_s = 1.0341 is a valid estimate, would carry an error of about 25 %.
It is instructive to consider the results of a conventional ANOVA and a conventional Bartlett's test on the data in Table 4, i.e., if they were assumed to come from Gaussian populations. Comparison of the computed F = 1.874 to the critical values F(3; 36) = 2.243 at α = 0.1 and F(3; 36) = 1.432 at α = 0.25 indicates that the (null) hypothesis of equal population means can be rejected at an error of about 17 %, but Bartlett's test with M/C = 16.39 evokes a sound rejection of homogeneity (i.e., the equality of the four population variances), on account of its negligible error of about 0.1 %. These tests are standard topics in statistics textbooks, hence omitted here.
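The modified Bartlett parameters can be retraced from the quoted group sums. The sketch below assumes that M and C take the form of the conventional Bartlett expressions with v_i/(n-1) in the role of the group variances and V/(N-k) in the role of the pooled variance; this is an assumption about Eqs.(21) and (22), supported by the fact that it reproduces the quoted M, C and M/C to rounding. The function name is hypothetical.

```python
import math


def modified_bartlett(v, n):
    """Return (M, C) for k equal-size groups of n observations each.

    v holds the within-group reciprocal sums v_1, ..., v_k.
    """
    k = len(v)
    big_n = k * n
    total = sum(v)
    m = (big_n - k) * math.log(total / (big_n - k)) - (n - 1) * sum(
        math.log(v_i / (n - 1)) for v_i in v
    )
    # Equal-size form of 1 + [sum_i 1/(n_i - 1) - 1/(N - k)] / [3 (k - 1)]
    c = 1.0 + (k + 1) / (3.0 * k * (n - 1))
    return m, c


v_sums = [7.2612, 15.4001, 6.8053, 4.1144]   # values quoted in the text
M, C = modified_bartlett(v_sums, n=10)
ratio = M / C

print(f"M = {M:.4f}, C = {C:.4f}, M/C = {ratio:.2f}")
# The ratio is compared with the chi-square critical value for k - 1 = 3
# degrees of freedom, quoted in the text as 4.11 at the 0.25 level.
```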

Concluding Remarks
The material presented above, applied to two representative cases, has a much wider potential scope for chemical process scenarios in general. It also represents a pattern for the cross-fertilization of two major disciplines, and indicates how basic methods of applied probability theory can be useful in technologically important areas. In this respect, the paper is intended to whet the appetite of motivated chemical scientists and engineers.

Table 1. The number of normalized repair-time units (RTU) observed while operating the experimental reactor in Illustration No. 1, arranged in increasing order and frequency

Table 2. MLE-based values of the reliability function and the K-S statistic at observed RTU values in Illustration No. 1

Table 3. MVUE-based values of the reliability function and the K-S statistic at observed values of RTU in Illustration No. 1

Table 4. ANOVA array for the reactor in Illustration No. 2