Electricity Consumption Analysis Using Spline Regression Models: The Case of a Turkish Province

Energy is one of the indispensable elements of human life, and electrical energy is the most frequently used type of energy. Because electrical energy cannot currently be stored, it has to be consumed instantly; in other words, consumer demand has to be met immediately. This paper models the electricity consumption of Erzurum province in 2011 by spline regression and examines whether this consumption shows statistically significant seasonal variation. The one-year data set was obtained from the Turkish Electricity Transmission Company Provincial Directorate of Erzurum and was analyzed by means of continuous piecewise polynomial spline regressions. The analysis uses three knots and fits linear, quadratic and cubic spline regression models.


Introduction
The necessity of energy for human life is beyond dispute, and energy is now acknowledged as a fundamental input to the production process and thus to economic and social development (Mucuk & Uysal, 2009: 106). Because energy plays a direct role in the stages of production, it is also an indispensable input to economic development (İnkaya & Demirhan, 2009). Electrical energy is without doubt the most commonly used type of energy, and it remains essential to modern life and to the welfare of developed and developing countries (Yiğit, 2011). Electrical energy is a clean energy source that can easily be transmitted from the point of production to very remote places through distribution networks and can be kept under control (Hamzaçebi & Kutay, 2004). Because electrical energy cannot currently be stored, it has to be consumed instantly; in other words, consumer demand has to be met immediately (Terzi & Sargın, 2006). Electricity consumption is directly affected by temperature and varies with the air temperature. It is therefore inevitable that the demand for electricity changes as the temperature rises or falls (İlaslaner, 2009). This paper investigates the one-year electricity demand of the residents of Erzurum province by linear, quadratic and cubic spline regression models. The second section of the study introduces the theoretical framework of linear, quadratic and cubic spline regression models, the third section presents the implementation of the models, and the fourth section presents and evaluates the findings.

Spline Regression Models
Approximating many data points with a single function can cause serious errors, even though this technique is occasionally very convenient. In such circumstances the spline interpolation method is recommended, in which the approximation between neighbouring data points is carried out by first-, second- or third-degree functions. Spline interpolation is based on finding lower-degree polynomials on the sub-intervals defined by a finite set of points in the interval of interest, where the sub-intervals do not overlap (Dalman, 2009).
The word spline originally designates a thin wooden strip used in construction, which engineers and architects employed to draw curves through given points (Wegman & Wright, 1983). Splines were first introduced into mathematics by Schoenberg (Schoenberg, 1946). Ryan (1997) describes splines in general as piecewise polynomials, in which curve and line segments are constructed separately and then joined. The method of splines thus consists of dividing the range into sub-segments at joint points called knots (Seber & Lee, 2003). A spline function consists of piecewise polynomials of degree q whose (q-1)th derivative is continuous at the change points, which are called knots in the spline literature (Seber & Wild, 2003). A knot can simply be described as a point at which the relationship between the dependent and independent variables changes and which divides the regression line into segments (Sung, 1985).
In this sense, as Figure 1 depicts, a spline function $S$ of degree $q$ with the knots $t_0, t_1, \dots, t_n$ has the following properties (Oturanç, et al., 2008):
1) On each interval $[t_i, t_{i+1}]$, $S$ is a polynomial of degree at most $q$, that is, $\deg(S) \le q$.
2) The $(q-1)$th derivative of $S$ is continuous on the interval $[t_0, t_n]$.
Although spline regression models might sound like something complicated and formidable, they are really just dummy variable models with a few simple restrictions placed on them (Marsh & Cormier, 2002).
Spline regression models are used when the regression line is divided into several segments by knots. When the spline regression is linear, the functions on adjacent segments are constrained to be equal at the knots, but their slopes may differ. When a quadratic spline regression is selected, both the functions and their slopes are equal at the knots, although the rates of change of the slopes may differ. Finally, when a cubic spline regression is preferred, the slopes of the functions and their rates of change are restricted as well. Cubic splines are the most common choice in such circumstances because they give the smoothest and most flexible curve (Marsh & Cormier, 2002). Spline functions consist of separate components, each of which is a continuous function. A spline does not have to be piecewise linear: it can be specified as a third-degree polynomial on each segment, in which case the first and second derivatives of the function are required to be continuous (Pindyck & Rubinfeld, 1991).
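The continuity behaviour described in this paragraph can be checked numerically. The following is a minimal sketch, assuming the spline is written in the truncated power form; the function names and the toy coefficients (one knot at x = 5) are hypothetical and are not taken from the study.

```python
def truncated_power(x, knot, degree):
    """Return (x - knot)**degree to the right of the knot and 0 elsewhere."""
    return (x - knot) ** degree if x > knot else 0.0


def spline_value(x, poly_coefs, knot_coefs, knots):
    """Evaluate a spline of degree len(poly_coefs) - 1 in truncated power form."""
    degree = len(poly_coefs) - 1
    value = sum(c * x ** j for j, c in enumerate(poly_coefs))
    value += sum(g * truncated_power(x, t, degree)
                 for g, t in zip(knot_coefs, knots))
    return value


def slope(f, x, h=1e-6):
    """One-sided numerical slope of f on the interval [x, x + h]."""
    return (f(x + h) - f(x)) / h


# Hypothetical toy splines with a single knot at x = 5 (not the study's data).
def linear(x):
    return spline_value(x, [1.0, 2.0], [3.0], [5.0])


def quadratic(x):
    return spline_value(x, [1.0, 2.0, 0.5], [3.0], [5.0])


eps = 1e-6
# Linear spline: the value is continuous at the knot, but the slope jumps.
linear_gap = abs(linear(5.0 + eps) - linear(5.0 - eps))
linear_slopes = (slope(linear, 5.0 - 2 * eps), slope(linear, 5.0))
# Quadratic spline: both the value and the slope are continuous at the knot.
quadratic_slopes = (slope(quadratic, 5.0 - 2 * eps), slope(quadratic, 5.0))
print(linear_gap, linear_slopes, quadratic_slopes)
```

In this toy case the linear spline keeps its value across the knot while its slope changes, whereas the quadratic spline keeps both, which is the pattern of restrictions the text attributes to the two model types.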
The spline parameters can be estimated directly as follows (Gnad, 1977). The general spline formula is

$$Y = \sum_{j=0}^{q} \beta_j X^{j} + \sum_{i=1}^{k} \gamma_i (X - t_i)_{+}^{q} + \varepsilon \qquad (1)$$

where $\beta_j$ is the coefficient of the independent variable, $X$ is the independent variable, $t_i$ is the $i$-th knot, $\gamma_i$ is the coefficient of the $i$-th knot, and $\varepsilon$ is the error term, for which all the assumptions of regression are satisfied. The model is subject to the restrictions of Equation (2). Here $(X - t_i)_{+}$ equals $X - t_i$ for $X > t_i$ and zero otherwise. The knot points $t_i$, and so their number $k$, are known in advance.
The $\beta_j$'s and the $\gamma_i$'s are the parameters of the spline. As Equation (1) shows, the spline regression is linear in these parameters. Next, the vectors and matrices are defined: $\mathbf{y}$ is the $m$-dimensional vector of the dependent variable, $\mathbf{X}$ is the matrix of the powers of the independent variable and $\mathbf{T}$ is the matrix of the knot terms, where the matrices $\mathbf{X}$ and $\mathbf{T}$ and the vector $\mathbf{y}$ are constructed from $m$ observations. The regression model is then given by Equation (3), which is written in matrix notation as follows:

$$\mathbf{y} = \mathbf{X}\beta + \mathbf{T}\gamma + \varepsilon \qquad (4)$$

With $\mathbf{Z} = [\mathbf{X}, \mathbf{T}]$ and $\theta = (\beta', \gamma')'$, the regression model in Equation (4) is transformed into Equation (5):

$$\mathbf{y} = \mathbf{Z}\theta + \varepsilon \qquad (5)$$

Equation (5) is simply a linear regression model. The least squares estimator is given by

$$\hat{\theta} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{y} \qquad (6)$$

Under the usual assumptions of the classical normal linear regression model, all the results of that theory can be applied: $\hat{\theta}$ is an unbiased linear estimator and is normally distributed. Further equations can be derived by using this estimator.
The study determines three knots for the electricity consumption data. Given these knots, the distribution in Figure 2 involves four segments. Dummy variables are then assigned so that a continuous line is estimated across each knot at which the slope changes; three dummy variables are therefore introduced for the three knots. As shown in Figure 2, $D_1$ denotes the first dummy variable, for the 91st observation, $D_2$ denotes the second dummy variable, for the 182nd observation, and $D_3$ denotes the third dummy variable, for the 274th observation. The dummy variables satisfy the following restrictions:

$$D_1 = \begin{cases} 1, & X > 91 \\ 0, & X \le 91 \end{cases} \qquad D_2 = \begin{cases} 1, & X > 182 \\ 0, & X \le 182 \end{cases} \qquad D_3 = \begin{cases} 1, & X > 274 \\ 0, & X \le 274 \end{cases} \qquad (11)$$
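The estimation step above can be sketched in a few lines. This is a minimal illustration, assuming a linear spline whose knot terms are built from dummy variables at observations 91, 182 and 274 and whose coefficients are obtained from the normal equations. The data are synthetic, the "true" coefficients are invented for the demonstration, and the helper names are hypothetical; none of them come from the study.

```python
import random

KNOTS = (91, 182, 274)  # knot positions (day of year) named in the text


def design_row(x, knots=KNOTS):
    """Row [1, X, D1*(X-91), D2*(X-182), D3*(X-274)] of the design matrix Z."""
    return [1.0, float(x)] + [float(x - t) if x > t else 0.0 for t in knots]


def least_squares(rows, y):
    """Solve the normal equations (Z'Z) theta = Z'y by Gauss-Jordan elimination."""
    p = len(rows[0])
    # Augmented matrix [Z'Z | Z'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


# Synthetic data only: these "true" coefficients are invented for the demo.
true_theta = [1700.0, -2.0, -1.0, 3.0, 4.0]
rng = random.Random(0)
days = list(range(1, 366))
Z = [design_row(x) for x in days]
y_exact = [sum(t * z for t, z in zip(true_theta, row)) for row in Z]
y_noisy = [v + rng.gauss(0.0, 20.0) for v in y_exact]

theta_exact = least_squares(Z, y_exact)  # recovers true_theta up to rounding
theta_noisy = least_squares(Z, y_noisy)  # estimates scatter around true_theta
print([round(t, 3) for t in theta_noisy])
```

Solving the normal equations directly keeps the sketch close to the estimator formula; for real data a QR-based or library least squares routine would usually be the more numerically stable choice.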

Linear Spline Regression Model
Based on the knots and dummy variables, the study specifies the linear spline regression model as follows:

$$Y = a_0 + b_0 X + b_1 D_1 (X - 91) + b_2 D_2 (X - 182) + b_3 D_3 (X - 274) + e \qquad (12)$$

In this regression model, a0 denotes the constant of the model; b0, b1, b2 and b3 denote the slope coefficients of the independent variables; and e denotes the error term, for which all the assumptions of regression are satisfied. Under these conditions, the regression model in Equation (12) can be estimated by the least squares method. The coefficients and the form of Equation (13) were obtained from Table 2. Table 2 indicates that the model has a single constant, a0, which reflects the continuity of the regression line at the knots of the distribution. Additionally, the coefficients a0, b0, b2 and b3 are statistically significant at the 0.05 significance level, that is, all of them differ from zero. The table also suggests that the second and third knots are statistically significant for the restricted linear spline regression, and that the values of b0 and b1 are negative.
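To make the role of the single constant explicit, the model can be expanded segment by segment. The following worked form is a sketch, assuming the model is $Y = a_0 + b_0 X + b_1 D_1 (X-91) + b_2 D_2 (X-182) + b_3 D_3 (X-274)$ with each dummy variable equal to one beyond its knot and zero otherwise:

```latex
\hat{Y} =
\begin{cases}
a_0 + b_0 X, & X \le 91 \\
(a_0 - 91 b_1) + (b_0 + b_1) X, & 91 < X \le 182 \\
(a_0 - 91 b_1 - 182 b_2) + (b_0 + b_1 + b_2) X, & 182 < X \le 274 \\
(a_0 - 91 b_1 - 182 b_2 - 274 b_3) + (b_0 + b_1 + b_2 + b_3) X, & X > 274
\end{cases}
```

At $X = 91$ the first two lines both give $a_0 + 91 b_0$, and the same cancellation occurs at 182 and 274, so adjacent segments meet at every knot. Under this form each knot coefficient is the change in slope at that knot, so a knot coefficient that differs significantly from zero indicates a significant change in the trend at that point.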

Quadratic Spline Regression Model
Similarly, based on the knots and dummy variables, the study specifies the quadratic spline regression model as follows:

$$Y = a_0 + b_0 X + b_1 X^{2} + b_2 D_1 (X - 91)^{2} + b_3 D_2 (X - 182)^{2} + b_4 D_3 (X - 274)^{2} + e \qquad (14)$$

In this regression model, a0 denotes the constant of the model; b0, b1, b2, b3 and b4 denote the coefficients of the independent variables; and e denotes the error term, for which all the assumptions of regression are satisfied. Under these conditions, the regression model in Equation (14) can be estimated by the least squares method. Equation (15) gives the estimated regression of daily electricity consumption of Erzurum province in 2011 by the quadratic spline regression method, where each knot term applies only to observations beyond its knot:

Estimated Daily Electricity Consumption (Y) = 1708.55 + 0.5293X - 0.028825X^2 + 0.049519(X-91)^2 + 0.011494(X-182)^2 - 0.033493(X-274)^2 (15)

Table 3 confirms that the fitted model as a whole is statistically significant, with F = 458.02, p < 0.05 and a mean square error of 5474. The coefficients and the form of Equation (15) were obtained from Table 4. Table 4 indicates that the model has a single constant, a0, which reflects the continuity of the regression line at the knots of the distribution. Additionally, the coefficients a0, b1, b2, b3 and b4 are statistically significant at the 0.05 significance level, that is, all of them differ from zero. The table also suggests that the first, second and third knots are statistically significant for the restricted quadratic spline regression, and that the values of b1 and b4 are negative.
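Equation (15) can be evaluated directly to see the seasonal shape it implies. The sketch below uses the published coefficients and assumes that each knot term is switched on only for days after its knot, as the dummy variables require; the function name is hypothetical.

```python
def quadratic_spline_estimate(day):
    """Evaluate Equation (15) for a day of the year (1 = 1 January 2011)."""
    def knot_term(coef, knot):
        # Each knot term is active only after its knot (dummy variable = 1).
        return coef * (day - knot) ** 2 if day > knot else 0.0

    return (1708.55 + 0.5293 * day - 0.028825 * day ** 2
            + knot_term(0.049519, 91)
            + knot_term(0.011494, 182)
            + knot_term(-0.033493, 274))


estimates = {day: quadratic_spline_estimate(day) for day in range(1, 366)}
lowest_day = min(estimates, key=estimates.get)
for day in (1, 91, 182, 274, 365):
    print(day, round(estimates[day], 2))
print("lowest fitted value on day", lowest_day)
```

The fitted values fall from the start of the year to a minimum between the second and third knots and rise again towards the end of the year, which matches the seasonal pattern described for Figure 2.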

Conclusion and Discussion
This paper analyzes the electricity demand of Erzurum province in 2011 by spline regression methods. For this purpose, the one-year data set was obtained from the Turkish Electricity Transmission Company Provincial Directorate of Erzurum. As the distribution of the data shows, electricity consumption decreases in the spring and summer and increases in autumn and winter.
The linear spline regression analysis explains 85.9% of the variation in electricity consumption through the time variable and indicates that the fitted model is statistically significant, with a p-value reported as zero. Furthermore, the coefficients a0, b0, b2 and b3 are statistically significant at the 0.05 significance level, that is, all of them differ from zero. The analysis also suggests that the second and third knots are statistically significant for the restricted linear spline regression. The differences in electricity consumption at these knots are therefore statistically significant during the transitions from spring to summer and from summer to autumn.
Likewise, the quadratic spline regression analysis explains 86.4% of the variation in electricity consumption through the time variable and indicates that the fitted model is statistically significant, with a p-value reported as zero; the coefficients a0, b1, b2, b3 and b4 are statistically significant, that is, all of them differ from zero. The first, second and third knots are statistically significant for the quadratic spline regression, so the differences in electricity consumption at these knots are statistically significant during the transitions from winter to spring, from spring to summer and from summer to autumn.
Finally, the cubic spline regression analysis explains 88% of the variation in electricity consumption through the time variable. The analysis indicates that the fitted model is statistically significant, with a p-value reported as zero. The coefficients a0, b0, b1, b2, b3, b4 and b5 are statistically significant at the 0.05 significance level, that is, all of them differ from zero. In other words, the differences in electricity consumption at the knots are statistically significant during the transitions from spring to summer and from summer to autumn.
In conclusion, this paper computes the values of R², adjusted R², the standard error of the model, the mean square error (MSE), and the Akaike and Bayesian information criteria for all three fitted spline regression models in order to decide on the most efficient model. Because the cubic spline regression model offers higher values of R² and adjusted R² and lower values of the standard error of the model, the MSE, and the Akaike and Bayesian information criteria, it is selected as the most efficient model.
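The reported fit statistics can be cross-checked against each other. The sketch below recomputes the overall F statistic implied by each reported R² and compares it with the reported F value. It assumes 365 observations and 4, 5 and 6 slope coefficients for the linear, quadratic and cubic models (inferred from the coefficient lists in the text); because the R² values are rounded, only approximate agreement should be expected.

```python
def f_from_r2(r2, n_obs, n_predictors):
    """Overall F statistic implied by R^2 for a model with an intercept."""
    return (r2 / n_predictors) / ((1.0 - r2) / (n_obs - n_predictors - 1))


N_OBS = 365
# name: (reported R^2, assumed number of slope coefficients, reported F)
models = {
    "linear": (0.859, 4, 549.54),
    "quadratic": (0.864, 5, 458.02),
    "cubic": (0.880, 6, 435.93),
}
implied_f = {name: f_from_r2(r2, N_OBS, p) for name, (r2, p, _) in models.items()}
for name, (_, _, reported_f) in models.items():
    print(name, round(implied_f[name], 1), reported_f)
```

Under these assumptions the implied and reported F values agree to within about one percent for all three models, which suggests that the R² and F figures quoted in the text are mutually consistent.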

Figure 1. Point distribution of the $(x_i, y_i)$ observation pairs

The electricity consumption of Erzurum province in 2011 is modelled by linear, quadratic and cubic spline regression models. Figure 2 and Appendix 1 show the distribution of daily electricity consumption in Erzurum province. The data set comprises 365 observations. In the study, the date 01.01.2011 represents the first observation, 02.01.2011 the second observation, 03.01.2011 the third observation, …, 30.12.2011 the 364th observation and 31.12.2011 the 365th observation. The knots of the study were placed at quarterly intervals. As Figure 2 shows, the electricity consumption of Erzurum province decreases in the spring and summer, while it increases in the autumn and winter. This indicates that the electricity consumption of Erzurum province differs with respect to the seasons.

Figure 2. The distribution of daily electricity consumption of Erzurum province in 2011

Finally, Figure 3 compares the distributions of observed and estimated electricity consumption of Erzurum province in 2011 under the linear spline regression model.

Figure 3. The distributions of observed and estimated electricity consumption by linear spline regression

Figure 4 shows the distributions of observed and estimated electricity consumption of Erzurum province in 2011 under the quadratic spline regression model.

Figure 4. The distributions of observed and estimated daily electricity consumption by quadratic spline regression

The coefficients a0, b0, b1, b2, b3, b4 and b5 are statistically significant at the 0.05 significance level, that is, all of them differ from zero. The table also suggests that the first, second and third knots are statistically significant for the restricted cubic spline regression, and that the values of b0, b1, b3, b4 and b5 are negative.

Figure 5. The distributions of observed and estimated daily electricity consumption by cubic spline regression

Equation (13) gives the estimated regression of daily electricity consumption of Erzurum province in 2011 by the linear spline regression method. Table 1 summarizes the results of the analysis of variance and confirms the statistical significance of the model as a whole, with p < 0.05, F = 549.54 and a mean square error (MSE) of 5668.

Table 1. Analysis of variance results for linear spline regression

Table 2. Linear spline regression model output

Table 3. Analysis of variance results for quadratic spline regression

Table 4. Quadratic spline regression model output

Table 5 presents the results of the analysis of variance and confirms the statistical significance of the model as a whole, with p < 0.05, F = 435.93 and a mean square error (MSE) of 4876.

Table 5. Analysis of variance results for cubic spline regression model

Table 6. Cubic spline regression model output

Table 7. Model comparison

As Table 7 shows, the cubic spline regression model is selected as the most efficient model, since it gives higher values of R² and adjusted R² and lower values of the standard error of the model, the mean square error, and the Akaike and Bayesian information criteria.
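The ranking by information criteria can be illustrated from the reported mean square errors alone. The sketch below is not a reproduction of Table 7: it assumes the common Gaussian forms AIC = n ln(SSE/n) + 2k and BIC = n ln(SSE/n) + k ln(n), which may differ from the definitions used in the study by an additive constant, and it assumes 365 observations with 4, 5 and 6 slope coefficients for the three models.

```python
import math

N_OBS = 365
# name: (reported mean square error, assumed number of slope coefficients)
reported = {
    "linear": (5668.0, 4),
    "quadratic": (5474.0, 5),
    "cubic": (4876.0, 6),
}


def information_criteria(mse, n_predictors, n_obs=N_OBS):
    """Gaussian AIC and BIC, up to an additive constant shared by all models."""
    k = n_predictors + 1            # slope coefficients plus the constant
    sse = mse * (n_obs - k)         # residual sum of squares from the MSE
    fit = n_obs * math.log(sse / n_obs)
    return fit + 2 * k, fit + k * math.log(n_obs)


criteria = {name: information_criteria(*vals) for name, vals in reported.items()}
best_aic = min(criteria, key=lambda name: criteria[name][0])
best_bic = min(criteria, key=lambda name: criteria[name][1])
print(best_aic, best_bic)
```

Under these assumptions the cubic model has the lowest value on both criteria, so the extra parameters are more than offset by the reduction in residual variance, in line with the selection stated above.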