A Note on Bivariate Smoothing for Two-Dimensional Functional Data

In this paper we study a bivariate smoothing approach for estimating multiple functional parameters for functional data with a two-dimensional domain. We present a penalized regression framework for smoothing with the purpose to: (a) facilitate the estimation of the smooth overall bivariate mean function of two-dimensional functional data, (b) enable the estimation of the functional e ﬀ ect of a scalar covariate, (c) accommodate completely or incompletely sampled data, (d) implement the ﬁtting approach using available statistical software, and (e) construct pointwise approximate conﬁdence intervals for multiple bivariate functional parameters. Implementation results from simulation studies show that these methods perform very well in practice. We illustrate the usefulness of the bivariate smoothing approach to several real datasets, including applications to electricity demand.


Introduction
We study a statistical framework for smooth function estimation for functional data observed on a two-dimensional domain.Two-dimensional functional data are recorded in many disciplines and examples include human mortality rates (Gervini, 2010) and gel proteomic samples (Morris, Baladandayuthapani, Herrick, Sanna, & Gutstein, 2011).Analyzing two-dimensional functional data is a research area of dynamic interest.In this paper we examine a fitting approach for multiple smooth bivariate functional parameters.
We observe for the i th sample the data values [{Y i (t i j , s iκ )} jκ , W i ], where Y i (t i j , s iκ ) is the observed outcome at points t i j ∈ T and s iκ ∈ S, W i is a scalar covariate, and (t, s) ∈ T × S, where T and S are intervals on the real line.We consider the following model for the i th observed outcome: for 1 ≤ i ≤ n, 1 ≤ j ≤ m i and 1 ≤ κ ≤ q i .The bivariate mean function is modeled semi-parametrically (Ruppert, Wand, & Caroll, 2003;Wood, 2006) and comprises of two components: a linear function W i γ(•, •) to account for the scalar covariate W i and an overall nonparametric function f (•, •).The bivariate functional parameters f and γ are assumed smooth, and the errors i are mean zero random processes.The model parameter γ(t, s) is the bivariate functional coefficient of the scalar covariate and captures the effect of W i varying smoothly on the two-dimensional domain (t, s).
We wish to estimate f and γ in the setting of model (1).While the estimation of the overall mean function f has received some attention in the context of two-dimensional functional data, see Gervini (2010), the simultaneous estimation of both f and γ in the setting of model (1) has not been discussed.Important aspects that are still open to investigation in the setting of two-dimensional functional data include: estimating the bivariate smooth overall mean function for completely or incompletely sampled outcomes; incorporating the functional effect of a scalar covariate; constructing confidence intervals for multiple smooth bivariate functional parameters.Most smoothing methods for functional data are focused on functional data with a univariate domain (Ramsay & Silverman, 2005;Ramsay, Hooker, & Graves, 2009), and the use of smoothing methods in the setting of two-dimensional functional data is currently in the incipient stage.In this paper we undertake a mixed model framework with a fast-fitting approach to examine these aspects.This provides a statistically coherent analysis for important problems in the two-dimensional functional data literature.
The organization of the paper is as follows.Section 2 describes the methods for the bivariate smoothing approach.Section 3 outlines extensions to incompletely observed data scenarios.Section 4 presents numerical results for the empirical performance of the smoothing approach.Applications to several real datasets are illustrated in Section 5, and Section 6 is our conclusion.

Methods
In this section we describe the statistical modeling and fitting procedures.We begin with expressing model ( 1) as an additive model, followed by employing penalized regression and associated mixed model fitting framework.In this section it is considered that the i th sampled outcome is observed at (t i j , s iκ ) = (t j , s κ ) for all i, where 1 ≤ j ≤ m i and 1 ≤ κ ≤ q i with m i = m and q i = q for all i.In section 3 we discuss the extension to other designs.

Statistical Modeling of Bivariate Functional Parameters Using Spline Bases
We express model parameters γ and f using a linear combination of tensor product spline basis functions.In particular, the parameter γ is represented as γ(t, s) . The bases a l 1 (•) and φ k 1 (•) are univariate bases in argument t, a l 2 (•) and φ k 2 (•) are univariate bases in argument s, and β l 1 ,l 2 and μ k 1 ,k 2 are unknown coefficients corresponding to the tensor products Model (1) can thus be represented as an additive model: where and φ k 2 (•) are chosen a priori.Functions f and γ are conveyed using rich tensor product bases to plausibly represent their complexity.The dimension of the bases is defined by choosing a large enough number of basis functions L 1 , L 2 , K 1 and K 2 .

Penalized Criteria for Fitting Smooth Bivariate Functional Parameters
In order to obtain smooth estimates for f and γ we impose quadratic penalties on their corresponding bases coefficients.Let m i ((t, s); β, μ) represent the mean of Y i (t, s), where β is the vector of all parameters β l 1 ,l 2 and μ is the vector of all parameters μ k 1 ,k 2 .Then the criterion to be penalized is: where Pen(β) and Pen(μ) take the form of quadratic penalties for the bivariate smooths (Wood, 2006).For each parameter function we use smoothing parameters for both t and s dimensions.The penalty term for γ is Pen(β) = β T λ β,t P β,t + λ β,s P β,s β.The penalty term for f is Pen(μ) = μ T λ μ,t P μ,t + λ μ,s P μ,s μ.Matrices P β,t , P β,s , P μ,t and P μ,s have known functional form, are fixed, and depend on the choice of spline bases.
The scalars λ β,t , λ β,s , λ μ,t and λ μ,s govern the smoothness of the fit and are unknown parameters.Fitting model (3) calls for estimation of these smoothing parameters, for which we opt for restricted maximum likelihood (REML) estimation (Reiss & Ogden, 2009) in a chosen mixed model and outline the approach next.

Mixed Model Representation for Bivariate Smoothing
The penalized criterion with the quadratic penalty choices discussed above is where σ 2 is the variance of the errors in model (1).By letting σ 2 β,• = σ 2 /λ β,• and σ 2 μ,• = σ 2 /λ μ,• and employing the considerations presented in, for example, Ruppert et al. (2003, sec 4.9), one can establish that the solution to the penalized criterion (3) is the best linear unbiased predictor in the mixed model where P β,• and P μ,• are known, and σ 2 β,• and σ 2 μ,• control the smoothness of the functional parameters.In many instances the inverse is substituted by a generalized inverse or the coefficients are partitioned into penalized and unpenalized.This mixed model representation is a useful statistical framework that has several advantages.First, it provides a fast approach for bivariate smoothing that is carried out using available software.Second, elements of inference corresponding to mixed model analysis, such as pointwise confidence intervals (Nychka, 1998;Ruppert et al., 2003), can be obtained for the model parameters.We study pointwise confidence intervals for our bivariate functional parameters when we assume independent and identically distributed errors in model (1).Third, other design settings can be accommodated using this framework, such as incompletely observed two-dimensional functional data.
We display how to fit model (4) using statistical software based on REML estimation of the variance components.We employ the gam() function from the mgcv package (Wood, 2012) in R (R Development Team, 2012).

Fitting Approach for Bivariate Smoothing via Mixed Models Software
In this section we provide implementation steps for the approach using the mgcv package (Wood, 2012) in R. As illustrated, model (3) can be viewed as a penalized additive model where the smooth bivariate model parameters are represented using the tensor product bases a l 1 (•)a l 2 (•) and φ k 1 (•)φ k 2 (•).These are spline bases and the model fit can be expressed via the following R code formula syntax: fit <-gam(Y ˜te(t,s,by = W) + te(t,s), method = ''REML'').
We now outline the components involved in the fitting procedure above while explaining the connection to our model and its notation.For two-dimensional smoothing, the data is organized into column vectors in order to implement the fitting approach.The observed two-dimensional functional data outcome samples {Y i (t j , s κ )} i jκ are arranged as a column vector of size nmq, so that the i th sample matrix with observed outcomes containing mq observations is . The next step is to specify the grid of points in dimensions t and s that correspond precisely to the locations of the observed outcomes.The vector The scalar covariate W i is repeated mq times for each i to form the nmq dimensional column vector labeled W. The expression te(t,s,by = W) thus fits a tensor product of spline basis functions at the grid of points specified in the vectors t and s while being weighted by the values of the scalar covariates W i .Without the option by = W the expression te(t,s) constructs a tensor product of spline basis functions a l 1 (t)a l 2 (s), whereas including by = W is taking into account the weights W i , thus accounting for the effect of the scalar covariate via its functional coefficient varying over the t and s dimensions.
Fitting the part of the model that yields a smooth estimate for f (t, s) involves the matrices t and s already constructed.Thus, the expression te(t,s) fits a bivariate tensor product spline φ k 1 (t)φ k 2 (s) at the grid of points in the vectors t and s.The details of the gam fit can be adjusted as needed.For example, requiring the use of 10 basis functions for each dimension can be requested by including the term k = 10 so that the expression now is te(t,s,k = 10).
Using the approach described, the underlying gam fitting procedure equips the user to perform rapid bivariate smoothing and provides a useful toolbox for obtaining graphical and inferential methods readily accessible.

Extension to Incompletely Observed Two-Dimensional Functional Data
Incompletely observed two-dimensional functional data can be accommodated by the procedure described in section 2, provided some adjustments are made in the fitting step.In such settings the outcomes {Y i (t i j , s iκ )} i jκ , 1 ≤ i ≤ n, are sparsely observed at points {(t i j , s iκ )} i jκ over the domain, for 1 ≤ j ≤ m i and 1 ≤ κ ≤ q i .Vectors Y, t, s and W all have dimension nm i q i and are constructed as follows.The vector of outcomes for the i th sample has dimension m i q i and is constituted as Next, the vector labeled t has the form t = (t (1) , t (2) , . . ., t (n) ) T and consists of row-stacking the m i q i -dimensional vectors t (i) = (t i1 , t i2 , . . ., t im i , t i1 , . . ., t im i ) T for all i.Further, s = (s (1) , s (2) , . . ., s (n) ) T is built by row-stacking all the sample-specific vectors of dimension m i q i defined as s (i) = (s i1 , s i1 , . . ., s i1 , s i2 , . . ., s iq i ) T .The vector of covariates W is obtained by taking m i q i copies of W i and row-stacking these vectors across all n samples.After these matrices are set up, the fitting approach proceeds as described in section 2. Implementation results obtained for this setting are provided in sections 4 and 5.

Simulation Study
We present numerical results from implementations to simulated data.Empirical performance for the fit of the estimators was evaluated for completely or incompletely observed two-dimensional data.

Bivariate Smoothing for One Bivariate Functional Parameter
For each 1 ≤ i ≤ n, we generated densely sampled two-dimensional outcome data according to the model which were observed on an equally spaced grid with t j = ( j − 1)/m and s κ = (κ − 1)/q, 1 ≤ j ≤ m, 1 ≤ κ ≤ q, where m = 48, q = 50, and n = 60.The values of m, q and n are chosen similar to the values in real data applications in sections 5.1 and 5.2.We considered three cases for the error terms: Case R1 where i (t j , s κ ) are independent and identically distributed N(0, 2); and Cases R2-R3 where i (t j , s κ ) = Z i (t j , s κ ) + e i jκ , and where Z i is a bivariate mean zero stochastic process with variance 1, and e i jκ are i.i.d.N(0, 1).Independent samples Z i (•, •), 1 ≤ i ≤ n, were simulated, using GaussRF() function in the RandomFields (Schlather, 2012) package, from a mean zero and variance one Gaussian Random Field with Matérn correlation function with scale parameter ρ and shape parameter ν = 0.5.We considered two cases: Case R2 ρ = 0.05, and Case R3 ρ = 0.15.
The bivariate mean functional parameter was f (t, s) = 1.6 sin(πs) cos(2πt), which has a similar shape to the periodic mean function in the application considered in section 5.2.Signal-to-noise ratio defined as SNR 1 = range( f )/[var{ (t, s)}] 1/2 was set to 2.2, where range( f 5) does not incorporate the effect of a scalar covariate and thus is a simpler model compared to model (1).Model ( 5) permits a comparison of several estimators for f (•, •), which we discuss next.The first estimator, labeled E1, denotes the penalized splines estimator obtained by applying methods described in section 2. We facilitate accuracy of the fit comparisons with other estimators that can be used for estimation of f in the setting of model ( 5).Tensor products of cubic B-splines with 100 basis functions (for E1) and 25 basis functions (for E2 and E3) and second-order difference penalties in both directions were used to fit f (•, •).The following estimators were computed: E1. Pen Spl -Penalized splines estimator obtained by applying the methods presented in section 2.
E2. Smooth ensemble -Ensemble average was obtained 1/n n i=1 Y i (t, s) followed by bivariate smoothing.E3.Average of smooths -A bivariate smooth estimate for Y i (t, s) was fit for each i.Then, an ensemble average was computed.
E4. Assemble f t -For each fixed t, model (5) was fitted and a univariate penalized spline estimator was obtained.Then, these estimators were joined along t to form a bivariate estimator f .E5. Assemble f s -For each fixed s, model ( 5) was fitted and a univariate penalized spline estimator was obtained.Then, these estimators were joined along s to form a bivariate estimator f .We provide numerical results for the fit performance, as well as computational time aspects of the fitting procedures used.We computed the integrated mean square error (IMSE), integrated square bias (IBIAS 2 ), and integrated variance (IVAR), where IMSE{ Here E{ f (t, s)} and Var{ f (t, s)} were estimated by the empirical mean and variance of f (t, s) in S = 250 simulations.
Results for the fit of the estimation for each method are presented in Table 1 and can be summarized as follows.
The IMSE for the penalized splines estimator E1 was smaller compared to the estimators E2-E5 for all R1-R3 choices.Figure 1 shows that the variability of the ISE of all estimators was comparable, and that the median ISE of E1 was smaller than for the other estimators.Compared to E4 and E5 estimators, which do not borrow strength in both t and s domains simultaneously, E1 performed better, as illustrated by smaller ISE; see Figure 1.IMSE was smaller for the case R1 of the errors.For the case of non-i.i.d.errors (cases R2 and R3) IMSE increased for all estimators, but E1 still outperformed E2-E5 estimators.Comparable IVAR among E1, E2 and E3 may be achieved for case R1 provided a similar number of basis functions is used for model fitting.For cases R2 and R3, the methods E1 and E3 display undersmoothing (of similar degree) compared to the uncorrelated errors case, and thus have larger IVAR compared to case R1.Bivariate smoothing is typically slow, but our approach was fast: about 11 seconds for the analysis of 60 two-dimensional data objects.

Bivariate Smoothing for Two Bivariate Functional Parameters
We generated two-dimensional functional outcomes according to model (1), where W i was a binary variable such that W i = 1{U i ≥ 0.75}, with U i ∼ Uniform(0, 1), where n = {30, 60}.We considered the equally spaced grid design used in section 4.1.We generated i (t, s) for cases R1 -R3 as in section 4.1.
For the bivariate functional parameters we set γ(t, s) = sin(πs) cos(2πt) and f (t, s) = 1.5 exp −8s 2 +0.6 exp −8t 2 , so that γ and f have comparable range.Our choice for f (•, •) was motivated by the shape of the mean function for the application in section 5.3.Signal-to-noise ratio defined as SNR 2 = range(.75γ+ f )/[var{ (t, s)}] 1/2 was set to 2.2.Because of the presence of the scalar covariate, only the estimators E1, E4 and E5 are employed for fitting model ( 1).Empirical results corresponding to the accuracy of the fit are presented in Table 2. Results show that the penalized splines estimator E1 outperformed the estimators E4 and E5 for both f and γ, for R1-R3 choices and for both n choices.The advantage of E1 was especially evident for case R1 of the errors.As n increased, the IMSE decreased, which indicates that the penalized splines estimator is consistent.E1 provides a smooth fit compared to the E4 and E5 estimators, which do not borrow strength in both t and s domains; see Figure 2 for the visual representation of the estimators E1, E4 and E5 for f (•, •).Methods E4 and E5 comport like unpenalized (least squares) estimators in one of the two dimensions and thus sometimes yield slightly smaller IBIAS than E1, which is a penalized estimator in both t and s dimensions.

Incompletely Observed Two-Dimensional Data
The bivariate smoothing method studied in this paper can be applied to incompletely two-dimensional sampled data.In this section we present a brief evaluation of the accuracy of the fit in such scenarios.Data was generated as described in section 4.2 for Case R1.We randomly selected and kept sixty percent of the generated outcome data and this set of data was used as input values for model fitting.The approach described in section 3 was applied.
Results for the fit of the estimators show that accuracy in estimation is affected by the sparsity in the data, but not by much; compare IMSE, IVAR and IBIAS columns in Table 3 to results in Table 2 for Case R1.As the number of samples n increased, the estimation accuracy improved, as expected; see IMSE columns in Table 3.

Confidence Intervals for Smooth Bivariate Functional Parameters
One advantage of the fitting approach discussed in section 2 is obtaining pointwise confidence intervals for the model parameters as a consequence of the mixed model fitting procedure.We considered the same data generating process as in section 4.2 and Case R1.To characterize the properties of the pointwise confidence intervals we computed the integrated actual pointwise coverage (IAC) and integrated actual width (IAW).For the case of An approximate 95% pointwise confidence interval CI p (t 0 , s 0 ) for f (t 0 , s 0 ) can be constructed as where Σ is the estimated covariance matrix of μ.We used the Bayesian posterior covariance matrix (Ruppert et al. 2003).The length of this confidence interval was 3.92 sd{ f (t 0 , s 0 )} and IAW = E[3.92T S sd{ f (t, s)}dtds].Empirical coverage results are presented in Table 4.The nominal level was set to 0.95.Table 4. Results for pointwise approximate confidence intervals.Displayed are IAC(IAW) results for 95% nominal confidence level.Scenario: m = 48, q = 50, S NR 2 = 2.2, Case R1, section 4.4 simulation setting.Results are based on 250 simulations γ(t, s) f (t, s) n = 30 0.97 (0.47) 0.97 (0.41) n = 60 0.97 (0.44) 0.97 (0.38) Table 4 results can be summarized as follows.The pointwise confidence intervals had a coverage probability exceeding the nominal level.As the number of samples n increased, the width of the intervals decreased for both parameter functions, while maintaining IAC above the nominal level.In conclusion, for Case R1, the pointwise approximate confidence intervals for the bivariate functional parameters had narrow width and achieved or exceeded the nominal coverage level.

Applications
We applied the bivariate smoothing methods discussed in section 2 to several real datasets.Implementations to datasets on electric energy demand, temperature and human mortality are presented.

Application to Electric Energy Demand in Adelaide, Australia
Electricity is an energy support for human progress.Electricity demand fluctuates as it is influenced by consumption patterns according to business hours, outside temperature and other weather conditions.Understanding electricity demand is useful for planning optimal electric energy supply depending on various factors, such as, time of day and week of year.Our goal is to estimate the electric energy demand patterns in Adelaide, Australia, as function of time of day t and week s.We use the Electricitydemand data, which is part of the functional datasets fds package (Shang & Hyndman, 2012) in R, and comprises of electricity demand recordings, in megawatts, that span the calendar years 1998 -2006.Following the notations in section 2, Y i (t, s) corresponds to the electricity demand for i th day at half-hourly time of day t and week of year s.We estimate the mean function f (•, •) and the functional coefficient γ(•, •) of weekend, where W i = 1{weekend day}, for 1 ≤ i ≤ n, from data consisting of n = 63 (corresponding to data for the seven days of the week, for nine years) two-dimensional data samples.The two-dimensional domain is T × S, where t ∈ T = {1, 2, . . ., 48} corresponds to 48 half-hourly time points during the course of a day, starting with the first half-hour after midnight, and where s ∈ S = {1, 2, . . ., 52} corresponds to the week of year, starting with the first week in January.We applied the methods presented in section 2 to obtain smooth f and γ.Tensor products of cubic B-splines with 64 basis functions and second-order difference penalties in both directions were used to fit f (•, •) and γ(•, •).Results show that over the course of the year, the mean electricity demand was high for the beginning and end-year weeks (Australian summer) and in the mid-year weeks (Australian winter); see Figure 3.Over the course of the day, the electricity demand was higher during mid-day, at around half-hour 30.During the early morning and late evening hours of the day, more electricity was demanded during mid-year (Australian winter); this may be due to the use of heating for homes during the night hours.During mid-day more electricity was demanded during first and last weeks (Australian summer).The effect of weekend on the electricity demand resulted in a decrease by as much as 300 megawatts; see Figure 3.This decrease is more pronounced for the mid-day hours of the weekend, around half-hours 20 -30, throughout the year.The mean width of the 95% pointwise confidence intervals was 17.7 for f and 32.9 for γ. Figure 3 displays information about the pointwise confidence intervals for γ: bright green corresponds to effects that were approximately 2 standard errors above -100, and bright blue corresponds to effects that were approximately 2 standard errors below -100.

Application to Temperature Patterns in Adelaide, Australia
Adelaide, Australia has an overall mild climate year-round.In this region, winter corresponds to the mid-year months, while the summer months occur during the weeks at the beginning and end of the year.The goal was to estimate the mean temperature pattern as a function of half-hourly time of day t and week s.Data tempkent, which is available in the functional datasets fds package (Shang & Hyndman, 2012) in R, was used for this purpose.The two-dimensional domain T × S in this application was the same as in the application to electricity demand.Here Y i (t, s) denotes the temperature in degrees Celsius recorded for i th day at time of day t and week s.We applied the methodology in section 2 to estimate f (•, •) using n = 63 two-dimensional data samples (corresponding to 7 days of week for years 1998 -2006).Scalar covariates were not considered in this application.Tensor products of cubic B-splines with 100 basis functions and second-order difference penalties in both directions were used to fit f (•, •).The estimator is illustrated in Figure 4.The features can be summarized as follows.Warm average temperature of about 27 degrees Celsius was observed for the beginning and end weeks of the year (Australian summer), especially during mid-day at half-hours 20 -30.The mid-year weeks registered a decreased average temperature with a minimum average temperature of about 10 degrees Celsius during the beginning and evening hours of the day.The mean width of the 95% pointwise confidence intervals for f was 0.21.The estimated overall mean function displayed a decrease in infant mortality in recent years; see Figure 5.This observed effect was probably due to recent improvements in medical care for newborns and infants.After age 10, as age increased mortality also increased for most years.The estimated functional coefficient of gender shows that, overall, female mortality was lower than male mortality; see Figure 5. Female mortality for early youth had lower magnitude compared to male mortality and this decrease was larger in recent years.This effect was due, possibly, to the recent medical advancements that benefited the female population.The mean width of the 95% pointwise confidence intervals was 0.04 for f and 0.06 for γ. Figure 5 displays information about the pointwise confidence intervals for γ: bright green corresponds to effects that were approximately 2 standard errors above -0.5 and bright blue corresponds to effects that were approximately 2 standard errors below -0.5.

Discussion
The approach we study in this paper is a useful statistical framework for fitting smooth bivariate functional parameters in the setting of two-dimensional functional data.Our bivariate smoothing research is a useful contribution

Figure 3 .
Figure 3. Electric energy demand data.Estimated overall mean electricity demand (left panel) and estimated functional coefficient corresponding to weekend days (right panel)

Figure 4 .
Figure 4. Temperature data.Smooth estimate for the overall bivariate functional mean temperature

Figure 5 .
Figure 5. Human mortality data.Smooth overall mean log mortality estimate (left panel) and smooth functional coefficient estimate corresponding to gender (right panel)