Estimating a Nonlinear Mixed Volume-age Model with and without Taking into Account Serially-correlated Errors: Differences and Implications

In this study we estimated a lodgepole pine (Pinus contorta var. latifolia Engelm.) volume-age model with and without taking into account serially-correlated errors arisen from permanent sample plots. The estimations were based on the first-order (FO) and first-order conditional expectation (FOCE) methods of the nonlinear mixed model technique. Among the correlated error structures considered, the spatial power structure was found to be the most appropriate. Model predictions were obtained and evaluated on model fitting data, as well as on independent validation data collected from a different ecoregion. Results showed that the model estimated with the independent and identically distributed (iid) error structure performed much better than the model estimated with the correlated error structure. This is true on both model fitting and validation data sets, and for both FO and FOCE methods. It implies that, if the main purpose of a study is to develop models for predictions, there is no real benefit to consider more elaborate and complex error structures to account for the correlated errors. The iid error structure is a sound choice for dealing with correlated errors under the nonlinear mixed model framework.


Introduction
For repeatedly measured, unequally spaced and unbalanced longitudinal data collected from permanent sample plots (or stem analysis trees), various within-and between-plot correlations and heterogeneous variances may occur.In addition, the correlations can be both temporal and spatial.
Violation of the independent and identically distributed (iid) error assumption could have important statistical consequences for analyses based on the least squares principle.It may invalidate hypothesis testing and interval estimation.The problem may be solved using the generalized least squares (GLS) method (Judge et al. 1985) or mixed model technique (Davidian and Giltinan 1995).Examples of GLS in forestry have been shown by many (e.g., Monserud 1984, Gregoire 1987).One issue related to GLS is that it generally applies to equally spaced and balanced data.It may create other problems when used for unequally spaced and unbalanced data.Lappi and Bailey (1988) stated that ordinary least squares (OLS) are "deemed to be appropriate" for large samples, even though GLS could be used.Borders et al. (1988), West (1995), and Huang (1997) cited other studies which concluded that OLS is adequate, and it is unnecessary to apply more elaborate procedures when faced with correlated errors.Knowledge of the correlation parameter in GLS is typically ignored in predictions (Monserud 1984).
In more recent years, more researchers have chosen the mixed model approach to deal with correlated errors and heterogeneous variances (e.g., Lappi 1986, Ojansuu 1987, Gregoire et al. 1995, Huang 1997, Fang and Bailey 2001, Schabenberger and Pierce 2002, Wang et al. 2007).This approach allows for a more flexible error covariance structure.More importantly, knowledge of the correlation parameter and the entire error covariance structure from the mixed model estimation can be used in predictions.
In spite of the large amount of effort invested in addressing correlated errors, many questions remain.The primary objectives of this study are to: (1) compare a volume-age model for lodgepole pine (Pinus contorta var.latifolia Engelm.)estimated using nonlinear mixed model (NLMM) technique with and without taking into account correlated errors; (2) evaluate the impacts of different error structures on model predictions based on model fitting data as well as on independent model validation data; (3) clarify some methodological confusions and miscues about the NLMM technique in forest modeling; and (4) recommend a sound and practical approach to use when faced with correlated errors.

Data and base model
The lodgepole pine data used in this study were collected by the Alberta Forest Service over the last 45 years as a part of the provincial Permanent Sample Plots (PSP) system.They were obtained from pine dominated forests in the upper foothills and lower foothills ecoregions of Alberta.The most common PSP size is 1000 m 2 , but it can range from 200 to 2000 m 2 depending on the density of a stand.Within each PSP, diameters (always at 1.3 m above ground) were measured for all trees taller than 1.3 m, and heights were measured for all trees with diameters or portions of representative trees (Alberta Forest Service 2000).The missing tree heights were predicted using the "bias-free" height-diameter models (Huang et al. 2008a).Individual tree volumes were calculated following the standard taper functions (Huang 1994), and were summarized to give plot volume.Plot age (at breast height) was obtained from cored trees within the plot or sectioned trees in the buffer areas around the plot.
The volume-age data from the upper foothills ecoregion were used as model fitting data, and the data from the lower foothills ecoregion were used as model validation data.The two data sets (Figure 1) can be considered independent of each other because of the geographic separation.The model fitting data consist of 1,249 measurements from 288 plots (subjects), with a mean (minimum-maximum) volume (m 3 /ha) of 302.694 (13.909-642.471)and a standard deviation (SD) of 108.902 (m 3 /ha).The validation data consist of 1,011 measurements from 265 plots, with a mean volume of 287.611 (1.748-564.329)and an SD of 108.581.Both data sets are unequally spaced and unbalanced longitudinal data.
Based on a detailed examination of the overall and subject-specific (SS) volume-age trends exhibited by the data in Figure 1, two NLMMs were selected for evaluation: (1) where Vol i and Age i are observed volumes and ages for the ith plot, b 1 , b 2 and b 3 are fixed parameters common to all plots, u 1i , u 2i and u 3i are random parameters unique to the ith plot, and i ε is a normally distributed within-plot error term.Both (1) and (2) can be sigmoidal.They can also describe declining volumes at old ages, which were observed in some plots likely due to mortality exceeding growth at old ages (Figure 1).Preliminary analyses indicated that (2) failed to achieve convergence in some cases because of over-parameterization.Therefore, (1) was chosen in this study for further analyses.

Error structure and parameter estimation
Model (1) can be written as a generalized NLMM of the form: (3) ′ is a vector of plot volumes for plot i, x i is a design matrix of the covariate age, b is a ′ is a vector of subject-specific random parameters, and i ε is a vector of within-plot errors.The u i and i ε are assumed to be uncorrelated and normally distributed with mean zero and variance-covariance matrices D and R i , respectively.The D is typically assumed to be unstructured and identical for all subjects within the population: are variances and covariance for random parameters u 1 and u 2 .
As noted earlier, for the unequally spaced and unbalanced longitudinal data collected from the PSPs, the errors may not be iid.Instead, they could be correlated and unequally varied.To account for these, a total of more than 20 error covariance structures described in SAS Institute Inc. ( 2004) and Littell et al. (2006) were examined.The spatial power error structure produced the most reasonable fit to the data.It was chosen for R i .As an example, for a plot repeatedly measured five times, the R i is defined as: (5) where σ 2 is the overall error variance, Ψ is a generalized error covariance structure, d ιη is the distance between two measurements at times ι and η (d ιη = | t ι -t η |), and ρ is the correlation parameter.The spatial power error structure captured the unequally spaced and unbalanced nature of the volume-age data.Other error structures examined but not presented here, e.g., compound symmetry, heterogeneous Toeplitz, Toeplitz with different bands, unstructured correlation, first-order ante-dependent, and several spatial correlation structures evaluated in Yang and Huang (2008), were not converged or performed poorer than the SP(POW).The time-series structures such as AR(1), MA(2) and ARMA(1,1), which assume that the measurements are taken at equally spaced time intervals, may not be adequate without further adjustments, even though some of them produced seemingly reasonable results.
For models assumed to follow an iid error structure, the R i is simplified to: R i Ψ , where i n I is a n i × n i identity matrix of 1s.
Different methods have been developed to estimate the generalized NLMM (3).They are described in detail in Davidian and Giltinan (1995), and Vonesh and Chinchilli (1997).Since the estimation of the fixed parameters is readily available in most statistical software packages, we focused on the estimation of subject-specific random parameters based on the first-order (FO) method of Beal and Sheiner (1982) and the first-order conditional expectation (FOCE) method of Lindstrom and Bates (1990).
For the FO method, a first-order Taylor series expansion of (3) around zero, the expected value of the random parameters, is used to approximate (3).The random parameters are predicted by: where D ˆ and i R ˆ are estimates of D and R i , respectively, b ˆ is an estimate of the fixed parameters b, and Z i is the derivative matrix with respect to the random parameters, defined by ( 7) For the FOCE method, the first-order Taylor series expansion of model ( 3) is iterated around a current predictor of the random parameters (and a current Z i ) until the convergence criterion is met.The final random parameters are obtained numerically through the following equation: where D ˆ, i R ˆ and b ˆ are as defined above, but Z i is given by (9) Since i u ˆ appears on both sides of (8), for the FOCE method, i u ˆ must be solved iteratively.This is the fundamental concept innate to the FOCE method (Schabenberger 1994, Vonesh and Chinchilli 1997, Huang 2008, Temesgen et al. 2008, Meng et al. 2008, Meng and Huang 2009).Unfortunately, many forest modelers used (6) from the FO method to predict i u ˆ while implementing the FOCE method, without realizing that this could seriously bias the outcome.The methodological mixing should be avoided in future studies as it may lead to erroneous conclusions.
All model fittings were carried out using the SAS macro %NLINMIX, with expand = ZERO for the FO method and expand = EBLUP for the FOCE method (Littell et al. 2006).The default fittings assume iid error structure (i.e., R i = i n 2 I σ ).For the spatial power error structure illustrated in (5), the fittings must be implemented using the keyword TYPE = SP(POW)(age) within the REPEATED statement.

Predictions
Once the b ˆ and i u ˆ are estimated, subject-specific volume predictions for plot i can be made: (10) where Z i for FO is defined in (7), and x i is the x matrix for new or old (modeling) observations.Note the s ˆi y for FO and FOCE are different (Vonesh and Chinchilli 1997).Mixing them will likely result in biased predictions (Huang 2008, Meng andHuang 2009).It is possible that, following Judge et al. (1985)'s descriptions on prediction, we can also derive the following alternatives to obtain the adjusted predictions from the use of SP(POW) or any other generalized error covariance structure: where x 0i is a known x matrix of new observations for plot i, i 0 y ˆ corresponds to x 0i , x i is an x matrix from the old (existing) data, V ˆis the estimated covariances between elements of old and new errors, and Ψ ˆ is the estimated generalized within-plot error covariance structure.For the SP(POW) structure, for instance, more explicit prediction equations can be derived following Judge et al. (1985) (detailed derivations are omitted here.Interested readers may contact the senior author to obtain a copy): where κ is the number of projection intervals (or periods), ( κ+1) is the number of measurements, x i,κ and Z i,κ are the x i and Z i values κ intervals ahead, and d 1(κ+1) is the distance (or time) between the first and (κ+1)th measurements (e.g., d 1(κ+1) = d 15 for the 4 th projection period of the SP(POW) structure given in ( 5)).For a simpler structure such as AR(1), the above equations reduce to: For the reasons discussed at the end of Section 5, however, we opted not to implement the alternative equations in the present study before further assessment and simulations are conducted.
There are different ways to compare the differences between the observed and predicted volumes, and various statistics exist for reporting the goodness-of-fit of NLMMs (Vonesh and Chinchilli 1997).Here, we are mostly interested in the criteria to evaluate the prediction errors (residuals), their average and variations.Thus, the following statistics were computed: where e is the overall mean bias, ij y and ij y ˆ are the jth observed and predicted values for subject i, m is the number of subjects, N is the total number of observations from m subjects (N = ∑ = m i i 1 n ), SD is standard deviation, δ is an overall accuracy measure combining the bias ( e ) and precision (SD) of the errors (Cochran 1977), % e is percent bias, and y is the arithmetic mean of the observed values.
Because the predominant interest of a NLMM is in subject-specific predictions, it is more important that we also compute the above statistics by subject: (16) where all variables are as defined before, except that they apply to subject i, not the entire population.
For a population of m subjects, the frequency distribution for any one of the SS measures expressed in ( 16)-( 19) can be constructed, and the percentage of subjects that fall below or above certain thresholds such as the 5 th or the 95 th percentile, can be determined.As will be shown later, examination of such frequency distributions is exceedingly useful in helping modelers obtaining a realistic understanding of the goodness-of-fit a NLMM when it is used for subject-specific predictions.More importantly, it also helps revealing the differences between the models estimated with different error structures.
It is worthwhile to note here that many forest modelers have used the fixed parameters estimated as a part of a NLMM to predict the population-averaged (PA) responses, i.e., by setting i u ˆ= 0 and using ) , , ( ˆ0 b x y . This is likely a result of some misunderstanding.The fixed parameters estimated as a part of a NLMM do not fully characterize the mean responses of the population, especially when the inter-subject variability is large (Davidian andGiltinan 2003, Fitzmaurice et al. 2004).Consequently, they generally provide a biased partial representation of the true population mean responses.It is very important to recognize that the predictions obtained from the fixed parameters estimated as a part of a NLMM generally do not represent the "unbiased" PA responses (Huang 2008).

Results and discussion
Results of the estimated parameters for model (1) with iid and spatial power error structures are shown in Table 1.The Akaike's information criterion (AIC) is computed by (Littell et al. 2006): where -2LL is -2 times the log-likelihood function, and p is the total number of estimated parameters.
Studentized residual plots were obtained from the FO and FOCE methods using the iid and correlated (spatial power) error structures.Only those from the FOCE method are shown here (Figure 2), as those from the FO method display similar patterns.The differences in the studentized residual plots between the iid errors and the correlated errors are easy to detect.The studentized residuals from the iid errors are scattered evenly around the zero line, whereas the studentized residuals from the correlated errors are not.This suggests that judged by visual means, the iid error structure is more satisfactory.
To facilitate the discussion of the upcoming results, and to help readers grasp the essences of the issues, we first demonstrate the computations involved in predicting the random parameters and in deriving the predicted volumes for a "new" plot measured five times (j = 5) from the validation data (Table 2a), based on the FOCE method with the SP(POW) error structure.Actual computation algorithms for this demonstration are provided in Appendix 1, and were explained in detail in Huang (2008), Meng et al. (2008), and Meng and Huang (2009).The computations for the iid errors and for the FO method are relatively simpler and can be derived directly from the algorithms given in Appendix 1 (Huang 2008).
For the FOCE method, the derivatives of model ( 1) with respect to the two random parameters are: Therefore, for the ith subject with j measurements, the estimated Z i matrix is constructed as follows: where der_u 11 , …, der_u 1j are given by ( 21) and der_u 21 , …, der_u 2j are given by ( 22), and j = 5.The estimated D matrix for any subject i is (from eq. ( 4) and Table 1):

D
The estimated R i matrix for the example subject is (from eq. ( 5) and Tables 1 and 2a):

R
For the FOCE method, there is no simple algebraic solution for i u ˆ from ( 8).An iterative procedure involving the following computation steps is used to numerically solve for i u ˆ: Step 1: Obtain an initial estimate, termed 0 , ˆi u , of the random parameters.A reasonable first guess of this estimate is 0 , ˆi u = E( i u ) = 0, the expected value of the random parameters for the FO method.Thus, 0 , ˆi u can be computed directly using eq.( 6) for the FO method, as follows:

Z
is a first estimate of the Z i matrix (equivalent to the Z i in (7) for the FO method): Z is the Z i expressed in (23) evaluated at 0 , ˆi u , with its elements given by: Step 3: Having the calculated 1 , ˆi u from ( 27), the new estimation for i u ˆ, termed 2 , ˆi u , is computed using (8) again, with the i u ˆ on the right-hand side replaced by the updated 1 , ˆi u from Step 2: ˆi u , with its elements defined in eqs.( 28)-( 29) updated using 1 , ˆi u .
This process is iterated k times until a desired precision, e.g., 0000001 .0 ˆ) 1 ,( , , is achieved.The predicted final random parameters for subject i is: For the example plot shown in Table 2a, the predicted final random parameters for the FOCE-pow are: u 1 = -0.00338906and u 2 = -0.059555.The final elements (der_u 1 and der_u 2 ) of the Z i matrix are listed in Table 2a, along with the Z i matrices for the other error structure and the FO method.
Once the i u ˆ are known, the predicted volumes for the FOCE method are obtained directly using (11).Results are shown in Table 2a.For the FO method, the predicted volumes are obtained differently using (10).Results are also shown in Table 2a.Once the prediction errors and their SD are known, various intervals can be constructed (Hahn and Meeker 1991).Some common ones are: (tolerance interval to contain at least proportion γ of the subject) (confidence interval for the subject standard deviation) where z is the standard normal distribution (and α = 0.05 throughout this study).
Table 2b lists the normality test results and the calculated intervals for the example plot.All four tests conducted following Yang et al. (2004) showed that normality was met.However, it must be noted that since the sample size for this example plot is small, the interval calculations serve only as illustrations.The "preferred" calculations in fact require that the sample size should not be smaller than 8 (Hahn and Meeker 1991).In the cases where the normality assumption is not met, distribution-free intervals could be constructed (Hahn and Meeker 1991).One could also use a 10% trimmed mean as an estimate of i e and a jackknifed standard deviation expressed in (35) as an estimate of SD to construct approximated intervals (Efron andTibshirani 1986, Hahn andMeeker 1991) is the sample average of the data set without the jth point.It is straightforward to infer that (31)-( 35) can also be applied to the entire population of N observations (N = ∑ = m i i 1 n ).A fundamental difference between the FOCE and FO methods is that, the random parameters for the FOCE method must be solved numerically, whereas they can be obtained directly from (6) for the FO method.This fundamental difference appeared to have largely been neglected in previous applications of NLMMs in forestry.It is essential to recognize that when implementing the FOCE method, the i u ˆ calculated by ( 6) is just a first initial "guesstimate" (and sometimes an inappropriate guesstimate) in the search of a true empirical best linear unbiased predictor of i u .
Table 3 lists the overall prediction errors (e ij ) on both model fitting and validation data sets.For both FO and FOCE methods, the prediction errors (absolute values) and their variations from the correlated error structure are much larger than their counterparts from the iid error structure.For example, for the FO method on the model fitting data, the δ and % e values for the iid error structure are 107.965and 0.226%, respectively, whereas for the spatial power error structure, they are 4696.876and 10.906%, respectively.The differences between the values from the iid and correlated error structures are striking.They are also consistent across the methods and data sets.All these suggest that the iid error structure produced much more accurate and precise predictions than the correlated error structure.
For the more important SS predictions, the statistics calculated according to ( 16)-( 19) by different error structures and data sets are summarized in Table 4.The number of plots where the percent bias ( % e i ) exceeded ±2.5% of the observed mean were identified and listed separately in Table 4.The frequency distributions of the mean biases, percent biases and standard deviations from all 288 plots of the model fitting data and 265 plots of the model validation data were obtained.For the sake of brevity, only the frequency distributions for % e i from different error structures and data sets are shown (Figure 3).
Judging from the results shown in Table 4, the contrasts between the iid and power error structures are remarkable.The power error structure produced much larger biases, SDs and δs.More specifically, for instance, for the FO method, among the 288 plots of the modeling data, only two plots produced biases exceeding ±2.5% of the observed means when the iid error structure was used.But when the power error structure was used, 272 plots produced biases exceeding ±2.5% of the observed means.Similar results are also apparent on the validation data, and for the FOCE method.The frequencies of poorer predictions from the power error structure are much greater than those from the iid error structure.
The poorer performance of the power error structure is more readily seen in Figure 3, where the percent biases from different plots are centered around zero but those from the power error structure are spread out much more than those from the iid error structure, indicating lower accuracies and larger variations from the power error structure.This is true on both model fitting and validation data sets, and for both FO and FOCE methods.
Figure 4 shows the observed volumes against the volumes predicted from different methods with iid and power error structures.We only show the plots from the validation data because the plots from the modeling data are similar.More importantly, it is more telling on validation data than on modeling data because the validation data are "new" independent data that a fitted model is most likely to be applied to make predictions in real-world applications.The poorer performance of the power error structure is rather apparent in Figure 4.It is consistent for both FO and FOCE methods.
An examination of the "spaghetti" or "chow-mein" plots (Figure 5) of the prediction trajectories of the validation data indicates that the prediction trajectories from the iid error structure closely mimic the observed trajectories shown in Figure 1(b), whereas the prediction trajectories from the power error structure "compress" the observed trajectories.This is again consistent for both FO and FOCE, even though the degree of the compression varies slightly.
Since much of the data collected for forest modeling is unequally spaced and unbalanced longitudinal data, where repeated measurements are taken on the same experimental units, the lack of independence of such data is inborn.The correlated nature of the data could violate an important assumption required by certain statistical techniques, mostly notably the least squares principle, for "optimal" performance and inference.Forest modelers have studied for many years, and have used the old and new techniques brought to light by statisticians, to address the dependence problem, believing in many cases, that the successful removal of the correlation would result in a "better" model.However, contrary to the common belief, the results obtained in this study from the NLMM methods showed that although accounting for the serial correlation appeared to have produced better fits when judged by the AIC values in Table 1 (a smaller AIC value is said to mean a better fit), it produced much worse predictions on both the model fitting and validation data sets.The predictions obtained with the iid error structure had consistently much smaller and better distributed biases than those obtained with the correlated (spatial power) error structure.This created a dilemma, as the successful accounting of the correlated errors produced much poorer predictions, while ignoring the correlated errors produced much better predictions but could potentially invalid hypothesis testing and interval estimation.It also asked for the question of whether the common practice of using AIC (and its modified forms) to select a preferred model and/or a preferred error covariance structure is correct.
While there is no "best" answer to the apparent contradiction, and opinions may understandably vary depending on the focus of a study or a particular researcher, we believe in general that, because the vast majority of forestry models are developed to be used as predictive tools once the parameters have been estimated, and because much of the practical, real-world emphasis of forest modeling has been on the predictive capabilities and biases from the applications of the models, the prediction results, preferably obtained on independent validation data, should be given the predominant consideration.They should be used as the deciding factor in determining the appropriateness of a fitted model, and in comparing alternative models and/or error structures.
We also believe that, there may be some debatable issues and misconceptions about hypothesis testing and interval estimation in forest modeling.We understand that, in the presence of correlated (and/or heteroscedastic) errors, the ordinary estimators of coefficients ignoring the correlation are still unbiased.However, the standard errors of the coefficients are biased and inconsistent, and the estimators are not efficient (Judge et al. 1985).When a correct error covariance structure has been identified and a consistent estimator of this error covariance structure is obtained, it can be used to obtain a consistent estimator of the standard errors of the estimated coefficients that are also efficient.The question though is that, for a forest model developed as predictive tools, model users are (arguably) not really concerned much about the hypothesis testing and interval estimation on the coefficients that have been estimated in an unbiased manner.Instead, they are generally more concerned about the hypothesis testing and interval estimation on the predictions illustrated in eqs.( 31)-( 35) and Table 2b, which have little to do with the standard errors of the coefficients obtained on the modeling data.To ward off some potential statistical traps, forest modelers should be aware of and be mindful of some common miscues when conducting hypothesis testing and interval estimation under normal or non-normal error assumption, on different data sets (i.e., modeling vs validation/application), at different levels (i.e., population average vs SS), and for different purposes (e.g., model coefficients vs model predictions).
For the vast majority of forest models built on repeated measurement data and to be used as predictive tools on new data or data ranges beyond those used in modeling, one needs to understand the pertinent prediction scenario first before making any prediction and judging the goodness of the prediction.As illustrated in Figure 6, various prediction scenarios exist, and different evaluation measures can be used to judge the goodness of predictions.We examined the scenarios and conducted simulations based on the fitted volume-age model, and found that accurate local predictions could be achieved from 2, 3 or more prior observations.We also found that, under the NLMM framework, with or without adjusting the predictions from the use of SP(POW) produced varied outcome, dependent on the 'best' covariance structure chosen, the direction of the predictions, the length and number of the prediction intervals, the level on which the predictions are made, the estimation technique (FO vs FOCE) used, the evaluation measures selected, and more importantly, the number of prior observations available for predicting the SS random parameters (we used all available prior observations in our analyses).In general, however, when SS random parameters are obtained from 2 or more prior observations, adjusting the predictions degenerated the predictions for our data.While the results from our study are obvious, we recognize that this is just one such study to demonstrate that the iid error structure is a sound choice for dealing with correlated data under the NLMM framework, if prediction is the primary focus of the study.We realize the potential options and variations in making predictions (as illustrated in Figure 6), and plan to conduct additional studies to further test this phenomenon on larger data sets, and for other tree species and other types of models.

Conclusions
Based on the FO and FOCE methods, we evaluated the volume-age model for lodgepole pine estimated with and without taking into account the serially-correlated errors.We found that the model estimated with the iid error structure outperformed the model estimated with the correlated (spatial power) error structure by a large margin.This observation was consistent for both FO and FOCE methods, and on both model fitting and validation data sets.It means that a NLMM estimated with the iid error structure is better in predictions than the model estimated with the correlated error structure.The result of this study can have some important practical implications, as a better model can be estimated using a simpler analysis without the need to account for the correlated error structure.In fact, accounting for the correlated error structure within the NLMM context not only did not improve model predictions, it actually degenerated the predictions substantially.Unless the main objective of modeling is not to develop models for prediction purposes, we believe that there is little or no concrete benefit to consider more elaborate error structures that account for the correlated errors.The iid error structure is simply a better choice for the lodgepole pine model, and for a similar model for black spruce (Picea mariana) (Huang et al. 2008b).Future studies should be conducted to see if the conclusions reached in this study still hold for other types of models and species, particularly on independent validation data sets.(tn,bx,q,u,z,b,s,d,age,vol,nn,res,uv,cov); z=j(nn,q,0); res=j(nn,1,0); uv=j(tn,q,0); xx=1; run sm (tn,bx,q,u,z,b,s,d,age,vol,nn,res,uv,cov); σ are variances for random parameters u 1 and u 2 , respectively, and 2 1 u u σ is the covariance between u 1 and u 2 , ρ is correlation parameter, σ 2 is residual variance, and AIC is Akaike's information criterion (defined in eq. ( 20)).-2275.16 802.30 195.767 -32.424 74.234 226.229 -6391.37 1074.44 249.451 -23.222 84.234 268.278 -9968.73 1240.74 279.850 -11.572 94.234 290.151 -13816.33 1376.67 302.845 -12.694Note: the methods are defined in Table 1, Vol_o and Vol_p are observed and predicted volumes (m 3 /ha), der_u 1 and der_u 2 are derivatives with respect to random parameters u 1 and u 2 , respectively, Res is residual, i e is the arithmetic mean of the residuals, and SD is the standard deviation of the residuals.The algorithms given in Appendix 1 show the computations for the FOCE-pow method.1, N is the total number of observations, min., max.and SD are minimum, maximum and standard deviation, δ is an overall accuracy measure calculated by eq. ( 14), and e % is the percent bias calculated by eq. ( 15).  1, m is the number of subjects (plots), min., max.and SD are minimum, maximum and standard deviation, δ i is the subject-specific accuracy measure calculated by eq. ( 18), and i e % is the subject-specific percent bias calculated by eq. ( 19).The frequency (freq.) of | % e i |>2.5 refers to the number of plots whose % e i exceeded ±2.5 (the 2.5 threshold was chosen by dividing the one-sided 5% significance level commonly used in statistical inference into two-sides).d) and (e) correspond to SS predictions based on one prior observation, and (f) corresponds to the predictions based on two prior observations.Depending on the number of observations available from an observed trajectory, forward and backward predictions can also be made using 3, 4,…, k observations randomly selected from the trajectory.In our analyses, using (c) to illustrate, we used all 3 observations to get the random parameters.We then made predictions using (10) or (11) (we also examined the adjusted predictions, starting with the 1 st observation and taking into account the spatial power structure when predicting the 2 nd and 3 rd observations through (10b) or (11b)).

Figure 1 .
Figure 1.Model fitting (a) and validation (b) data sets.The model fitting data are from the upper foothills ecoregion.The model validation data are from the lower foothills ecoregion.

Figure 2 .
Figure 2. Studentized residual plots from the first-order conditional expectation (FOCE) method with the iid error structure (a, b), or correlated (spatial power) error structure (c, d).The right-hand side graphs connect the scatter points of the left-hand side graphs by subject.

Figure 3 .
Figure 3. Frequency distributions of subject-specific percent biases ( % e i s) for the model fitting data (a, b, c, d) and model validation data (e, f, g, h), based on the first-order (FO) and first-order conditional expectation (FOCE) methods: (a) and (e) -FO with iid errors; (b) and (f) -FO with correlated (spatial power) errors; (c) and (g) -FOCE with iid errors; and (d) and (h) -FOCE with spatial power errors.

Figure 4 .
Figure 4. Observed versus predicted volumes for the validation data: (a) -first-order (FO) method with iid errors; (b) -FO method with spatial power errors; (c) -first-order conditional expectation (FOCE) method with iid errors; and (d) -FOCE method with spatial power errors.The line represents equality.

Figure 5 .
Figure 5. Predicted volume-age trajectories for the validation data: (a) -FO with iid errors; (b) -FO with spatial power errors; (c) -FOCE with iid errors; and (d) -FOCE with spatial power errors.

Figure 6 .
Figure 6.An illustration of selected prediction scenarios, where '•' represents an observed value and '○' represents a predicted value: (a) -predicting a y based on an observed x (equivalent to an overall population mean prediction); (b)predicting a new (future) y based on the last y-x pair of an observed trajectory; (c) -predicting ys based on the first observed y-x pair; (d) -predicting previous ys based on the last observed y-x pair; (e) -predicting ys based on an observed y-x pair in the middle of a trajectory; and (f) -predicting a y or ys based on two observed y-x pairs.Note that (b), (c), (d) and (e) correspond to SS predictions based on one prior observation, and (f) corresponds to the predictions based on two prior observations.Depending on the number of observations available from an observed trajectory, forward and backward predictions can also be made using 3, 4,…, k observations randomly selected from the trajectory.In our analyses, using (c) to illustrate, we used all 3 observations to get the random parameters.We then made predictions using (10) or (11) (we also examined the adjusted predictions, starting with the 1 st observation and taking into account the spatial power structure when predicting the 2 nd and 3 rd observations through (10b) or (11b)).

Table 1 .
Parameter estimates and fit statistics for model (1) based on the first-order (FO) and first-order conditional expectation (FOCE) methods assuming iid or correlated (spatial power) error structures.

Table 2a .
Example computations of prediction errors for a plot (subject) measured five times.

Table 2b .
Lower and upper limits of confidence intervals for the example plot shown in Table2a.Md.(method) is defined in Table1, W, KS, W 2 and A 2 are Shapiro-Wilk, Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling statistics, respectively, for testing normality.Prediction intervals, tolerance intervals, and confidence limits (CL) containing the mean ( i e ) and the standard deviation (SD) are computed according to eqs.(31)-(34), respectively.Table3.A summary of overall prediction errors from different methods and error structures.
Note: the methods are defined in Table

Table 4 .
A summary of subject-specific prediction errors from different methods and error structures.