Using Nonlinear Mixed Model Technique to Determine the Optimal Tree Height Prediction Model for Black Spruce

Based on the nonlinear mixed model technique, four base height-diameter models were evaluated for black spruce. The Chapman-Richard model was chosen. Top height and basal area were incorporated into the base model. Comparison of the base and expanded models showed that, although the goodness-of-fit measures on the modelling data were improved with the inclusion of top height and basal area, the predictive accuracy of the expanded models at the subject-specific level where the predominant interest of nonlinear mixed models lies, was reduced when tested on the model validation data. This has important practical implications because more accurate individual tree height predictions can be better achieved using the base height-diameter model without requiring the addition of other variables. It also reaffirms that determining the adequacy of a model on model fitting statistics alone can be misleading. A fitted model is best judged on separate validation data.


Introduction
Total tree height (H) and tree diameter at breast height (DBH) are two of the most fundamental variables in forestry.The estimation of tree volume and biomass, the description of stand conditions and their changes over time, as well as the development of growth and yield projection systems all rely heavily on the availability of a complete set of tree heights and diameters.
However, since measuring tree height is time consuming and costly, typically in various data collection programs in forest management, all trees are measured for diameter but only a portion of the trees are measured for height.Predicting the "missing" tree heights is therefore routinely required in forest research and operations (Curtis 1967, Wykoff et al. 1982, Arabatzis and Burkhart 1992, Huang et al. 1992).
Although different options exist, the most commonly used method of predicting tree heights is to develop height-diameter models.From the trees that have both height and diameter measured, a height-diameter model can be developed to express tree height as a function of tree diameter.This model can then be used to predict the "missing" heights from measured DBHs.
In numerous studies related to the development of height-diameter models, it has been almost unanimously shown that the predictive ability and accuracy of the base height-diameter models can be improved if additional (statistically significant) tree and stand level variables are incorporated into the models.These variables may include tree or stand age, basal area per hectare (ha), stems per ha, crown ratio, crown competition factor, site index, dominant/co-dominant height, top height, crown class, wind speed, and different competition indices and species composition measures (e.g., Curtis 1967, Huang and Titus 1994, Huang 1999, Eerikäinen 2003, Temesgen et al. 2008, Meng et al. 2008).
The conclusions reached in previous studies were mostly derived from the goodness-of-fit measures on the modelling data, which, as will be shown later, could be unreliable and even misleading sometimes if one is not careful.Furthermore, in the cases where the nonlinear mixed model (NLMM) technique was applied (Castedo Dorado et al. 2006, Sharma and Parton 2007, Temesgen et al. 2008, Meng et al. 2008), no frequency-based comparison of the predictive performance between the base and expanded models was made on the plot-specific level where the main interest of NLMMs lie.
The prime objectives of this study were to: (1) select the optimal base height-diameter model for black spruce (Picea mariana (Mill.)B.S.P.); (2) compare the predictive accuracy of the base model with that of the expanded models on both the model fitting and validation data sets; and (3) evaluate the impacts of including or not including statistically significant stand level variables into the base model.The selection, comparison and evaluation were all conducted within the NLMM framework.

Data
The black spruce data used in this study were collected by the Alberta Forest Service as a part of the provincial Permanent Sample Plots (PSP) database.The data encompass three key forest ecoregions (also called subregions) in Alberta: the lower foothills, the upper foothills, and the boreal mixedwood.The PSPs range in sizes from 200 to 2000 m 2 , with the most common one being 1000 m 2 .Within each plot or subplot (usually a quarter of the full PSP), diameters (always at a breast height of 1.3 m above ground) were measured for all trees taller than 1.3 m, and heights were measured for approximately 10 to 20% of the representative trees in a random or systematic fashion.A detailed description of the data collection procedures is provided in Alberta Forest Service (2000).
To avoid the potential serial correlation from repeated measurements of the same PSPs, only the initial measurements were retained in this study, which resulted in a total of 164 plots (1703 trees).The data were divided into two independent data sets by ecoregion: 93 plots (917 trees) from the lower foothills ecoregion were used as modeling data, and the other 71 plots (786 trees) from the boreal mixedwood and upper foothills ecoregions were used as model validation data.For the reasons discussed in Huang et al. (2002), we purposely avoided "random splitting" of the data.Summary statistics for the tree level variables H and DBH, and the plot level variables top height (i.e., mean height of the 100 largest DBH trees per ha) and stand basal area, are listed in Table 1.The data are also shown in Figure 1.

Specification of base height-diameter models
Based on a detailed examination of the height-diameter data shown in Figure 1, a large number of base height-diameter models were examined in preliminary analyses.Four promising candidate models were selected for further evaluation: (1) ( ) where H is total tree height (m), DBH is tree diameter (cm) at breast height, and β 1 , β 2 and β 3 are model parameters.These models and their slight modifications have been ranked favorably among alternative models in other relevant studies, particularly the Chapman-Richards model (4) (e.g., Wykoff et al. 1982, Huang et al. 1992, Sharma and Parton 2007, Temesgen et al. 2008).
Following the NLMM technique (Davidian andGiltinan 1995, Vonesh andChinchilli 1997), one or more parameters in the above models were assumed to be mixed (i.e., they consist of a fixed component and a random component).Likelihood ratio tests were conducted to determine if this assumption was true (at α = 0.05 throughout this study).It was found that the two parameters in (1) and (2) were both mixed, but parameter β 2 in (3) and (4) was fixed.Hence, the NLMM formulations of the above base models are: (5)

(
) where H ij and DBH ij are observed height and DBH for the jth tree in the ith plot (subject), i = 1, 2, …m, j = 1, 2, …n i , m is the total number of plots in the population, n i is the number of trees in plot i, β 1 , β 2 and β 3 are fixed parameters common to all plots, b 1i and b 2i are random parameters specific to plot i, and ij ε is a normally distributed within-plot error term.Models ( 5)-( 8) can be expressed in matrix forms.For example, the Chapman-Richards model [8] can be written for plot i as (9) where H i and DBH i are vectors of observed heights and DBHs, respectively, for the ith plot with n i observations, and ′ is a vector of the within-plot errors.

Expanded models
Individual tree height-diameter relationship could be affected by many factors.Of which, site quality and stand density were found to be most common (Huang and Titus 1994, Sharma and Parton 2007, Meng et al. 2008).Typically, site quality is represented by site index or dominant/co-dominant height (H dc ), with the latter appeared more common, likely due to the fact that site index is often unobservable and must be predicted from site index models, which could bring in additional prediction errors and artificially reduce the variation in site quality.The use of H dc , however, is not without its problems.The definition of "dominant"/"co-dominant" is not always clear-cut in practice, and how many "dominants" and/or "co-dominants" are needed to get a stable H dc for a stand is unclear.Nicholas at al. (1991) found that the reliability of tree crown position classification was very low, and "only 38% of dominant trees and 81% of co-dominant trees were similarly re-classified" during visits in the same growing season.
In this study top height was used as an indicator of site quality.Top height is defined as the average height of the 100 largest DBH trees per ha.Given a plot size, a fixed number of the largest DBH trees corresponding to the plot size were selected and the average height of these trees was obtained to represent the site quality of the plot.The use of top height removed the inconsistencies associated with the use of H dc .
For stand density, basal area per ha (BA, m 2 /ha) was used.Basal area per ha is a simple and objective measure of the degree of crowding within a stand.It is generally better than the stand density measure based on trees per ha because basal area combines tree number with tree size.
Top height and basal area per ha were incorporated into the base H-DBH models.Different functional forms and various combinations of the variables and their transformations were tried and evaluated based on the standard NLMM goodness-of-fit measures such as -2 times log-likelihood (-2LL), Akaike Information Criterion, Schwarz's Bayesian Information Criterion, significance of the parameters, and studentized residual plots (Littell et al. 2006).It was found that simple linear combinations were adequate.As examples, two expanded models, one with top height incorporated and the other with both top height and BA incorporated, were formed based on the Chapman-Richards base model ( 9): (10) where TopH i is top height (m), BA i is basal area per ha (m 2 /ha), and β 1 -β 5 are fixed parameters common to all plots, and b 1i and b 2i are random parameters specific to the ith plot.Attempts to relate β 4 and β 5 to additional random parameters in linear or nonlinear fashions failed to achieve convergence.

Parameter estimation
All base and expanded models can be written as a generalized NLMM of the form: ′ is the error vector.The i b and i ε are assumed to be uncorrelated and normally distributed with mean zero and variance-covariance matrices D and R i , respectively, that is: Furthermore, since no obvious pattern of unequal error variance was detected from studentized residual plots, which is typical of height-diameter models after the inclusion of random parameters (e.g., Castedo Dorado et al. 2006, Meng et al. 2008), R i is assumed to be i n 2 I σ , where 2 σ is the error variance and i n I is a n i × n i identity matrix, and D is assumed to be an unstructured covariance matrix that is the same for all i: is the covariance between b 1 and b 2 .
The parameters of the generalized NLMM (12) can be estimated using different methods (Davidian andGiltinan 1995, Vonesh andChinchilli 1997).In this study we used the first-order (FO) method of Beal and Sheiner (1982) and the first-order conditional expectation (FOCE) method of Lindstrom and Bates (1990).We found that the FOCE method often failed to achieve convergence, and was less stable and more prone to floating errors than the FO method.In the cases where both FO and FOCE achieved convergence, the FO method was found to give more accurate predictions with smaller mean error and error variance.Therefore, we limited ourselves in this analysis to the FO method.
The FO method employs a first-order Taylor series expansion of (12) around a * β close to β and a * i b set to zero, the expected value of the random-parameters (i.e., ), to approximate the nonlinear function ( 12), with the quadratics and cross-products dropped: where the derivative matrices i X and i Z are defined by ( 15) , if we define a pseudo-response function * i y as: , eq. ( 14) can be written as a standard linear mixed model: Following the linear mixed model theory, the generalized least squares estimator β ˆ of the fixed parameters β in ( 16) can be obtained as described in Fitzmaurice et al. (2004).The random parameters predictor i b ˆ of the i b is (Davidian andGiltinan 1995, Vonesh andChinchilli 1997): where D ˆ and i R ˆ are estimates of D and R i , respectively, and Z i is defined in (15).
It is worthwhile to point out that eq. ( 17) is adequate only for the FO method.Had the FOCE method been used, we must use the following equation to predict the i b ˆ and solve it iteratively (Lindstrom and Bates 1990, Schabenberger 1994, Huang 2008, Temesgen et al. 2008, Meng et al. 2008 Unfortunately, many forest modellers used eq.( 17) to predict the i b ˆ while implementing the FOCE method, without realizing that this could seriously bias the results.
All base and expanded NLMMs were estimated using the SAS macro %NLINMIX, with the keyword EXPAND = ZERO for the FO method (Littell et al. 2006).

Model prediction
Since model fitting statistics alone can sometimes be unreliable, a fitted model is more appropriately judged through cross-validation or using separate model validation data (Picard and Cook 1984, Huang et al. 2002, Yang et al. 2004).To evaluate the predictive accuracy of the base and expanded H-DBH models fitted in this study on the validation data at plot-or subject-specific level where the main interest of NLMMs lies, the random parameters i b ˆ must be predicted first based on eq. ( 17).Once the i b ˆare available, predictions for i y are obtained by eq. ( 18), which is derived by equating ( 16) to the pseudo function * i y (Davidian andGiltinan 1995, Vonesh andChinchilli 1997) The errors (residuals) associated with the predictions are where ij y and ij y ˆ are the jth observed and predicted values for subject i with n i observations.For a population of N observations from m subjects, where N = ∑ = m i i 1 n , the overall mean bias ( e ) is ( 19) Since in practice, many researchers (e.g., Fang andBailey 2001, Calama andMontero 2004) have used the fixed parameters estimated as a part of a NLMM to represent the "typical" population-averaged responses, we also calculated such "typical" responses: are the "typical" population-averaged responses based on the fixed parameters β ˆ only (with the i b ˆ set to zero).The associated "typical" population-averaged residuals are: In addition to the overall mean bias averaged over all observations from all subjects (as in eq. ( 19)), which could be misleading sometimes because the positive and negative errors from individual subjects could cancel one another out, the mean bias (or simply bias) for each individual subject was also calculated one plot at a time, together with the standard deviation (SD): where i e is the bias for subject i with n i (or n ij ) observations, and all other variables are as defined before.Two other conventional goodness-of-fit statistics were also calculated by plot (subject).One is the percent bias (Bias% or % e i ) and the other is the root mean square error (RMSE): where i y is the average of the observed values for subject i, and all others are as defined before.
Once the values of i e , Bias% i and RMSE i were calculated for each of the 71 plots of the validation data, their frequency distributions were examined and compared to assess the goodness-of-prediction of the base and expanded height-diameter models on the validation data.
For a comparison, the biases for each of the 93 plots of the modelling data were also calculated following the same procedures outlined above and their frequency distributions were also examined.

Maximum height-diameter relationship
Subject-specific local predictions obtained from the NLMM technique attempt to "best" mimic the local data.Sometimes, due to the scarcity and the limited range of the data, as well as the flexibility of the specified model, local predictions may be unrealistic when extrapolated beyond the original data range (especially when more covariates are present).This is illustrated in Figure 2. To prevent the potential extrapolation errors due to data limitation, a maximum height-diameter curve was developed for black spruce based on all PSP data, plus the stem analysis data from sectioned trees (Huang et al. 1992).The combined data of 14,345 trees (Figure 2) were grouped by a 2-cm DBH class and the tallest tree in each DBH class was used to fit the Chapman-Richards model ( 4) by the ordinary nonlinear least squares method.This produced the "mean" maximum height-diameter curve for black spruce: which, if necessary, would be used to constrain the height predictions beyond the observed data range.In the cases where negative predictions are obtained, which is possible for the FO method since eq.( 18) is used to predict the response variable, negative values are set to zero to be biologically meaningful.

Choice of the base and expanded models
Table 2 lists the parameter estimates and relevant fit statistics for the base and expanded models, where all fixed parameters are highly significant at α = 0.05.The Akaike Information Criterion (AIC) and the Schwarz's Bayesian Information Criterion (BIC) are computed by where -2LL is -2 times the log-likelihood function of the model, P is the total number of effective parameters (includes fixed parameters, variance-covariance components of the random parameters, plus the residual variance component), and m is the number of effective subjects.
Among the four base models, the Chapman-Richards model (9) has the smallest AIC and BIC values (Table 2), suggesting (9) is the most reasonable base model for black spruce.Therefore, only ( 9) is expanded to include top height and/or basal area.The two expanded models (10) and ( 11) have smaller AIC and BIC values than their counterparts from (9), suggesting (10) and ( 11) are better than the Chapman-Richards base model (9).
Between the two expanded models, model (11), which included both top height and basal area, has smaller AIC and BIC values (Table 2).Hence, model ( 11) is considered better than model (10).This is expected because (10) included top height only.The estimated fixed parameters related to top height (β 4 ) and basal area (β 5 ) are both positive, indicating that in general, site quality and stand density have positive impacts on black spruce height growth.
Following the standard model ranking and selection practice in regression analysis, models (9), ( 10) and ( 11) would be ranked 3, 2 and 1 (best), respectively.Model (11) would be selected as the best model among the three models.The base model ( 9) would be considered as the worst model.

Predictive performance of the models
To assess the predictive performance of the models in real-world applications, height predictions from the base and expanded models were made based on the validation data (as well as on the modeling data for a comparison).To illustrate the computations involved, an example plot was used.This plot has three height-diameter observations.They are listed in Table 3 and  The height predictions based on the fixed parameters only are obtained directly from the models, with the random parameters set to zero.For instance, for the base model ( 9): where the estimated fixed parameters 3 1 ββ ˆ are given in Table 2.For the FO method, the derivatives of ( 9) with respect to the two random parameters are calculated by The calculated results at the given DBHs are listed in Table 3.They constitute the i Z matrix defined in (15).In addition, from Table 2, we also know the D ˆ and i R ˆ matrices.Therefore:

Ĥ
, i Z , D ˆ and i R ˆ values, the random parameters can be predicted from (17): Once the i b ˆ are known, the predicted heights are obtained by (18): . Results are listed in Table 3, along with the associated errors.The standard deviation of the errors and an overall accuracy measure (δ) combining bias and precision were also calculated (Cochran 1977): (28) δ = (mean bias) 2 + (standard deviation of the errors) 2 Height predictions from the other models were obtained in a similar manner.Results are also listed in Table 3. Height prediction trajectories for all 71 plots of the validation data across the possible predictive range where future predictions are likely to be made, are shown in Figures 3(c)-3(d).These trajectories, sometimes referred to as "spaghetti plots" or "chow-mein plots" (Huang 2008), can be particularly useful in assessing the predictive behaviors of NLMMs within and beyond the observed data range.All trajectories for black spruce (Figure 3) were found to be within the bounds defined by the maximum H-DBH curve shown in Figure 2, rendering the constraint unnecessary for this data set.
The results in Table 3 suggest that all three models over-predicted the height for the example plot (i.e., the mean biases are all negative).Among the three models, the expanded model ( 11) produced the largest bias (-0.633) and δ (1.156), whereas the base model ( 9) produced the smallest bias (-0.347) and δ (0.748).Note that this is completely opposite to the ranking of the models based on the goodness-of-fit measures on the modelling data (Table 2), where the expanded model ( 11) was the best and the base model ( 9) was the worst.
Following the same procedure illustrated for the example plot, random parameters were calculated and height predictions were made for each of the 71 plots of the validation data.In addition, "typical" population-averaged (PA) height predictions were also made using the estimated fixed parameters only given in Table 2.For the purpose of comparison, similar predictions were made for each of the 93 plots of the modelling data.Summary statistics of the prediction biases from different models, by different methods, and on different data sets are provided in Table 4.
For the PA predictions based on the fixed parameters only, the expanded models are more accurate with smaller values of δ and Bias% than those from the base model (Table 4).This implies that if we use the fixed parameters only to make the PA predictions, the inclusion of additional variables will improve the accuracy of the predictions.This is consistent with the traditional least squares regression, in which only fixed parameters are present, and additional variables may need to be incorporated into the base model to explain additional variations that may arise from the between-subject differences in, for instance, site quality and stand density.
However, under the NLMM framework, or whenever the NLMM technique is applied, subject-specific local predictions become the predominant focus.The fixed parameters estimated as a part of a NLMM do not fully characterize the mean responses of the population, especially when the inter-subject variability is large (Hu et al. 1998, Davidian and Giltinan 2003, Fitzmaurice et al. 2004, Young et al. 2007).As a consequence, they generally provide a biased partial representation of the true population mean responses, which, unfortunately, is hardly ever recognized in forest modelling, where many forest modellers used the predictions obtained from the fixed parameters only to represent the "unbiased" PA responses.Corrections (or separate fits by nonlinear least squares) are necessary to obtain the unbiased PA predictions unless it can be shown that the biases are of little or no practical consequence (Chiswell and Monahan 2004, Kjellsson et al. 2004, Funatogawa and Funatogawa 2007, Huang 2008).The simplest and the most effective correction procedure for obtaining the unbiased PA predictions under the NLMM framework can be derived based on the ratio estimate method of Hartley and Ross (1954), which is frequently employed in sample surveys to obtain unbiased estimates of the population and subpopulation means.Since a correctly specified and fitted NLMM possesses the ability to closely follow the population-averaged and subject-specific trends exhibited by the data, the proportionality between the observed and predicted values can be assumed and the ratio estimate method can be used to adjust the predictions.Various ratio estimators can be formed of which the most common one, when calculated for the entire population based on the fixed parameters only, is (29) where y and y ˆ are the grand mean of the observed ( ij y ) and predicted ( fix ij y _ ˆ) values, respectively.The r defined in eq. ( 29) follows that: 1) if y ˆ = y , r = 1, indicating unbiased prediction; 2) if y ˆ > y , r < 1, indicating over-prediction; and 3) if y ˆ < y , r > 1, indicating under-prediction.Cochran (1977) showed that the expectation: Therefore, in the case of over-prediction (r < 1), the ratio estimate will adjust the predicted values downward by the ratio y / y ˆ to remove the bias.When under-prediction occurs (r > 1), the ratio estimate will adjust the predicted values upward by the same ratio y / y ˆ.In both cases, the adjusted (unbiased) predictor adj ij y _ ˆ, can be written as Once the adjusted predictor is obtained, it can be used to generate an unbiased PA curve for the population.For the base model ( 9) fitted in this study, this unbiased PA curve is given by are the adjusted unbiased PA predictions, and r is ratio estimate defined in eq. ( 29).
For subject-specific predictions, the results in Table 4 suggest that the overall mean biases are much smaller than those for the PA predictions.This is expected because the PA predictions were based on the fixed parameters only, whereas each plot in the subject-specific predictions has its own localized curve obtained from both fixed and random parameters.
The results in Table 4 also suggest that for subject-specific predictions, the mean bias and the standard deviation of the errors from the base model ( 9) are smaller than those from the expanded models.The bias (-0.023) and the SD (0.870) from the base model ( 9) are the smallest, whereas the bias (-0.068) and the SD (0.903) from the expanded model ( 11) are the largest.The values of the overall accuracy (δ) are 0.757, 0.798 and 0.820 for models ( 9), ( 10) and ( 11), respectively, indicating that as the number of variables incorporated into the base model increases, the accuracy of the predictions decreases.A similar trend is also observed on the modeling data (Table 4).Clearly, the height predictions from the base model are the most accurate.This is again entirely opposite to the ranking of the models based on the model fitting statistics.It is unexpected because the expanded models, which included one additional variable (top height) in ( 10), and two additional variables (top height and basal area) in ( 11), were supposed to provide incrementally more accurate predictions with smaller biases and SDs (to offset the incremental time and costs associated with measuring the additional variables).
For subject-specific local predictions, the unbiasedness of the predictions is not guaranteed because, depending on many factors (e.g., number of subjects, inter-subject variability, observations per subject, approximation process, model specification, errors-in-variables, assumptions, etc.), a NLMM provides varying degrees of goodness-of-fits to individual subjects within the population (Huang 2008).In fact, the biases for many subjects of an appropriately fitted NLMM could still be substantial (e.g., |Bias%| exceeding ±2.5 or ±5.0% of the observed mean, see Tables 5-6).To ensure the unbiasedness of subject-specific local predictions, a ratio estimate can also be calculated for each subject within the population: (30) ˆ where i y is the mean of ij y for subject i and i y ˆ is the mean of ij y ˆ predicted from eq. ( 18).The calculated i r can be used to adjust the ij y ˆ to obtain unbiased predictions for subject i: Essentially, both the population-based and subject-specific correction procedures adjust the predicted values from a NLMM downward or upward by the calculated ratio estimates, such that the averages of the residuals (i.e., mean biases) are zero on both the population and subject-specific levels.To avoid a digression from the main objectives of this study, we did not demonstrate the correction procedures in the present study.Interested readers can find more details elsewhere (e.g., Huang 2008).

Distributions of biases from the base and expanded models
To determine the possible reasons for the lower accuracy from the expanded models, the frequency distributions of subject-specific biases from the base and expanded models were examined.Table 5 lists the summary statistics for the bias ( i e ), Bias% i and RMSE i calculated plot-by-plot.The calculations were done for 71 plots of the validation data, and as a comparison, for 93 plots of the modelling data as well.The plots where the Bias% i exceeded ±2.5% of the observed mean were identified and listed separately, where the ±2.5% threshold was chosen by dividing the one-sided 5% significance level commonly used in statistical inferences into two-sides.
The frequency distributions of the biases, Bias%s and RMSEs were examined.Here we only show the distributions of the Bias%s from the base model ( 9) and the expanded model (11) (Figure 4), as the distribution from the expanded model ( 10) was similar to that of (11), and the distributions of the other statistics showed similar patterns to those of Figure 4. Actual values and additional information related to the distributions are available, but here we only list those (Table 6) relevant to the validation data.
The contrasts between the base and expanded models are obvious.For instance, when (9) was used, 18 out of 71 plots of the validation data produced biases exceeded ±2.5% of the observed mean (Tables 5-6).When ( 10) and ( 11) were used, 23 and 27 plots produced biases exceeded ±2.5% of the observed mean.The results on the modelling data were similar: 15 out of 93 plots had biases exceeded ±2.5% when (9) was used, and 30 and 37 plots had biases exceeded ±2.5% when ( 10) and (11) were used.The frequencies of poor predictions from the expanded models are greater than those from the base model.
The poorer performance of the expanded models can also be seen in Figure 4, where the biases are centered around zero but those from the expanded models are spread out more than those from the base model, suggesting larger prediction variations and lower accuracies from the expanded models.On the validation data, the standard deviations of the biases for ( 10) and ( 11) are 0.612 and 0.664, respectively (Table 5).Both are larger than that (0.399) for the base model ( 9).The δ value (0.159) of the base model is less than half of the δ values (0.386 and 0.447) of the expanded models, suggesting that the base model is more than twice as accurate as the expanded models.The same conclusion can also be reached on the modelling data (Table 5).
It was generally taken for granted that the predictive accuracy of a base model would be improved if additional tree and stand level variables were incorporated into the model.This is true in the traditional least squares regression, and it may also be true in the biased PA predictions using only the fixed parameters under the NLMM framework.However, in terms of subject-specific predictions that the NLMM technique was developed for, we found that the addition of top height in (10), and top height and basal area in (11) both produced lower accuracy than the base model ( 9).In fact, judging from the results in Tables 4-5, the overall accuracy (δ) of the predictions is reduced as additional variables are added into the base model.This is true on the validation data, as well as on the modelling data.
The possible reason that caused the base model to perform better in local predictions than the expanded models likely originated from the inclusion of subject-specific random parameters in the base model.Under the NLMM framework, the random parameters in the base model already allowed the plot level variations associated with different site-specific factors to be accounted for without actually requiring that they be identified or measured.This renders in a statistical sense (as oppose to biological) that the addition of other plot level variables such as top height and stand density, which also represent the same differences in site factors that the random parameters already represent, unnecessary.
For the purpose of evaluating the biological implications of top height and basal area on black spruce height growth under the NLMM framework, it is still useful to look at the significance of these variables when entered into the base model.However, for subject-specific local predictions, the results of this study show that the addition of other variables beyond the base height-diameter model is unwarranted.Apparently, the random parameters in an aptly specified base model can take into account all or most of the other known and unknown factors on a given site, thereby eliminating the need to measure or to include other variables when subject-specific local predictions are made.

Conclusions
We examined four base height-diameter models for black spruce under the NLMM framework.We found that the Chapman-Richards base model was the best.We compared this base model with the expanded models that included top height and/or basal area as the additional predictors through the evaluation of the overall biases and the frequency distributions of the biases from individual plots.We found that the base model produced the most accurate subject-specific height predictions.For black spruce, optimal, subject-specific tree height predictions can be achieved using a simple height-diameter model fitted by the NLMM technique without including any other additional variables.We also found that the frequency distributions of the biases from individual plots where the main interest of a NLMM lies are more powerful in revealing the adequacy of the fitted NLMM than many other statistics.Such frequency distributions should be routinely examined in any future study involving NLMMs.Picard, R.R., & Cook, R.D. (1984).Cross-validation of regression models.J. Am. Stat. Assoc. 79: 575-583. Schabenberger, O. (1994).Nonlinear mixed effects growth models for repeated measures in ecology.In Proc. of ASA, Section on Stats.& Environment, Toronto, Canada, pp. 156-161.Sharma, M., & Parton, J. (2007).Height-diameter equations for boreal tree species in Ontario using a mixed-effects modeling approach.For.Ecol. Manage. 249: 187-198. Temesgen, H., Monleon, V.J., & Hann, D.W. (2008).Analysis and comparison of nonlinear tree height prediction strategies for Douglas-fir forests.Can.J. For.Res. 38: 553-565. Vonesh, E.F., & Chinchilli, V.M. (1997).Linear and nonlinear models for the analysis of repeated measurements.Marcel Dekker, New York, 560 p. Wykoff, W.R., Crookston, N.L., & Stage, A.R. (1982).User's guide to the stand prognosis model.USDA For.Serv.Gen. Tech.Rep. INT-133. 122 p. Yang, Y., Monserud, R.A., & Huang, S. (2004).An evaluation of diagnostic tests and their roles in validating forest biometric models.Can.J. For.Res.  is the covariance between b 1 and b 2 , σ 2 is the residual (error) variance, P is the total number of parameters (includes fixed parameters, variance-covariance components of the random parameters, plus the residual variance component), m is the number of subjects (plots), -2LL is -2 times log-likelihood of the model, AIC is Akaike information criterion, and BIC is the Schwarz's Bayesian information criterion (AIC and BIC are defined in eq. ( 27)), and ranking refers to the rank of the model by the AIC and BIC values (smaller is better).Note: H_ fix is height prediction from fixed parameters, H_ ss is height prediction from fixed and random parameters, e_ fix and e_ ss are associated residuals, der_b 1 and der_b 2 are derivatives with respect to b 1 and b 2 , SD is the standard deviation, δ is defined in eq. ( 28), and Bias% = 100 × mean bias / mean H. Note: N is sample size, SD is standard deviation, δ is defined in eq. ( 28), and Bias% = 100 × mean bias / mean H (equals to 10.673 for the validation data, and 13.092 for the modelling data, see Table 1).and RMSE i are plot-specific bias, Bias% and root mean square error defined in eqs.( 22), ( 24) and (25), respectively, freq refers to the number of plots, SD is standard deviation, and δ is an overall measure of accuracy defined in eq. ( 28).       6.

Figure 1 .
Figure 1.Model fitting (a, b, c) and validation data (d, e, f).Summary statistics are listed in Table1.

Figure 2 .
Figure 2. Maximum height-diameter relationship (solid line in a and b) for black spruce.The relationship is defined by eq.(26).The localized curve (dashed line) from the two data points in (b) gives unrealistic predictions when extrapolated beyond a DBH of 26.5 (cm).Therefore, for DBH > 26.5, height predictions are constrained by the maximum relationship.

Figure 3 .
Figure 3. Height prediction curves from the base model (9) (a, c) and expanded model (11) (b, d), for an example plot (a, b) and for all 71 plots (c, d).Actual values for the example plot are listed in Table3.

Figure 4 .
Figure 4. Frequency distributions of Bias% from models (9) (a, c) and (11) (b, d) on the validation data (a, b) and modelling data (c, d).Actual values related to (a) and (b) are given in Table6.

Table 1 .
34: 619-629.Young, M.L., Preisser, J.S.,Qaqish, B.F., & Wolfson, M. (2007).Comparison of subject-specific and population averaged models for count data from cluster-unit intervention trials.Statistical Methods in Medical Research 16: 167-184.Summary statistics for black spruce model fitting and validation data TopH is top height (i.e., average height of the 100 largest DBH trees per ha), N is sample size (i.e., number of trees at tree level, or number of plots at plot level).Min, max, and SD are minimum, maximum, and standard deviation, respectively.

Table 2 .
Parameter estimates and fit statistics for the base and expanded height-diameter models

Table 3 .
Height predictions for an example plot with three height-diameter (H-DBH) observations

Table 4 .
Summary statistics for the overall height prediction biases

Table 5 .
A summary of plot-based height prediction biases from the base and expanded models.

Table 6 .
Frequency distributions of Bias% from the base and expanded models on the validation data Note: mean and freq refer to the mean bias and frequency (number of plots) associated with the defined class.The frequencies are displayed in Figures4(a