Time-Series Forecasting Models for Gasoline Prices in China



Introduction
Gasoline prices are closely connected to inflation expectations and are an important factor in the daily lives of the public. They affect people's living expenditures, their decisions on automobile purchases, and their choice of travel mode. As Molloy and Shan (2013) pointed out, gasoline prices can also influence the prices of residential properties.
For automobile manufacturers, the gasoline price is directly linked to sales (McManus, 2007; Li, Timmins, & von Haefen, 2009) and is one of the determining factors in designing and marketing products such as the energy-efficient vehicles promoted in recent years. In addition, the price level is important for government agencies in establishing effective macroeconomic policy tools and environmental protection regulations (Busse, Knittel, & Zettelmeyer, 2013; Allcott & Wozny, 2014). Because the gasoline price is one of the essential components used to measure the inflation level, its distortion has been found to have a negative impact on the economy (Shi & Sun, 2017).
China has experienced rapid economic development over the last three decades. As most people in China saw significant improvement in their living conditions, the demand for private automobiles and other durable goods increased rapidly. As a result, by the end of 2009, China had become the largest auto market in the world (Ho, 2010; Guillaume, 2010). The civil car parc in China reached more than 217 million at the end of 2017 (National Bureau of Statistics of China [NBSC], 2018), and the Chinese auto market has become one of the most important for automakers all over the world. For example, in 2016, more than 28 million cars were sold in China (Kwong, 2017), and General Motors alone sold 3.87 million cars there, which accounted for around one third of its total global sales (Burden, 2017). In 2017, vehicle sales in China increased by 3% (Anonymous, 2018) and GM sales in China reached 4 million (GM, 2018).
Following the dramatic increase in automobile sales, the gasoline price has become a popular topic in China among those who have purchased or plan to purchase cars. In particular, it is now a key factor for manufacturers to consider in designing their products and planning new locations for production facilities. For government agencies, accurate forecasts of the gasoline price are important because of its increasing influence on the public's living costs. As the Chinese government has invested substantially in the expressway system during the last decade, the prices of gasoline products have a significant impact on resource allocation for local and state government agencies and for businesses in the logistics and transportation industry. For environmental protection agencies, the gasoline price also influences the effectiveness of policy implementation, especially in resolving the severe air pollution problems that have recently occurred in many regions of China.
Given the importance of obtaining reliable predictions of the gasoline price, this paper proposes appropriate prediction models for gasoline prices in China. Data-driven time-series forecasting methods are applied in this study, owing to the difficulty of obtaining accurate estimates of influential external factors. In what follows, Section 2 briefly discusses the research that has been conducted on predicting gasoline prices. Section 3 illustrates different forecasting models based on price time-series data. Empirical analysis is carried out in Section 4 using historical gasoline price data collected in the Chinese markets. Section 5 concludes this study and provides suggestions for future research.

Research on Predictions of the Gasoline Price
Limited research has been conducted on predicting gasoline prices. As Baumeister, Kilian, and Lee (2017) found, an increasing number of studies in recent years have focused on crude oil prices. With regard to gasoline products, many extant studies focus on the relationship between crude oil and gasoline prices, and between different economic factors and the price level as well as other characteristics of gasoline products. Cheung and Thomson (2004) applied the cointegration method to study the demand for gasoline products. Ma, Oxley, and Gibson (2009) investigated the convergence of prices of gasoline and other energy products in major cities in China. Zhang, Lohr, Escalante, and Wetzstein (2010) studied the long-run cointegration of prices of food and fuel commodities. McManus (2007), Li et al. (2009), and Allcott and Wozny (2014) discussed the influence of gasoline prices on vehicle sales and the fleet business, while Shi and Sun (2017) claimed that the distortion in gasoline prices had a negative impact on China's economic growth. On the other hand, Borenstein, Cameron, and Gilbert (1997), Bachmeier and Griffin (2003), and Radchenko (2005) analyzed the asymmetric response of gasoline prices to variations in crude oil prices.
To predict the future price level of gasoline products, Anderson, Kellogg, Sallee, and Curtin (2011, 2013) used MSC (Michigan Survey of Consumers) survey data on consumers' beliefs about future retail gasoline prices. Xu, Valentine, and Wang (2014) presented AR (autoregressive) and ARCH (autoregressive conditional heteroscedasticity) structures for the prediction model, after controlling for the impacts of external time-dependent latent and observable covariates. Recently, Baumeister et al. (2017) conducted a thorough study of regression-based methods for predicting gasoline prices in the U.S., based on which they proposed using pooled forecasts from five forecasting models with equal weights.
Unlike the works conducted by others, and because of the unknown information and dynamic characteristics of the external factors, this paper focuses on predicting gasoline prices using data-driven forecasting models, which rely on analyzing only the gasoline price time series. Although data-driven models lack economic meaning and do not explain the determining factors, they do provide convenience and practicability for prediction purposes. In addition, this paper uses gasoline price data collected from the wholesale market in China, with the purpose of providing reliable statistical methods for predicting future gasoline price levels.

Time-Series Prediction Models for Gasoline Prices
When dealing with time-series data for prediction purposes, the ARIMA (autoregressive integrated moving average) and smoothing models are two popular tools applied by researchers and practitioners. For example, Zou, Yu, and He (2015) estimated the mean of the log difference of the WTI crude oil price using an ARMA process based on wavelet entropy. Let $Y_t$ be the gasoline price at time $t$. According to Box, Jenkins, and Reinsel (1994), the general form of an ARIMA(p,d,q) model applied to the time series $Y_t$ can be stated in the following linear form,
$$(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - B)^d Y_t = (1 - \theta_1 B - \cdots - \theta_q B^q)\,\varepsilon_t,$$
where p and q are the orders of the corresponding AR (autoregressive) and MA (moving average) processes, B is the backshift operator, d is the degree of differencing needed to obtain a stationary time series, and $\varepsilon_t$ is the white noise error term. A more concise form of the model is written as
$$\phi(B)(1 - B)^d Y_t = \theta(B)\,\varepsilon_t.$$
In the presence of seasonal variations, the model can be modified and denoted as an ARIMA(p,d,q)(P,D,Q)$_s$ model, written in the form
$$\phi(B)\,\Phi(B^s)(1 - B)^d (1 - B^s)^D Y_t = \theta(B)\,\Theta(B^s)\,\varepsilon_t,$$
in which s is the number of seasons and P, D, and Q are the seasonal counterparts of p, d, and q. The estimates of the orders and then of the relevant coefficients can be obtained using the Box-Jenkins methods (see, for example, Box et al., 1994; Commandeur & Koopman, 2007; Hyndman & Athanasopoulos, 2013).
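As an illustration of the differencing and autoregressive steps in an ARIMA(1,1,0) model, the following is a minimal sketch in Python. It is not the estimation routine used in this study: it assumes a constant term c, fits the AR(1) coefficient on the first differences by ordinary least squares instead of the full Box-Jenkins procedure, and the function names and the demonstration prices are hypothetical.

```python
def fit_arima_110(prices):
    """Fit diff(Y)_t = c + phi * diff(Y)_{t-1} + e_t by ordinary least squares.

    Assumption: least squares on the differenced series stands in for a
    full Box-Jenkins (maximum likelihood) estimation.
    """
    diffs = [b - a for a, b in zip(prices, prices[1:])]  # first difference (d = 1)
    x, y = diffs[:-1], diffs[1:]                         # lagged and current differences
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(prices, c, phi, h):
    """Iterate Y_hat = Y_prev + c + phi * (Y_prev - Y_prev2) for h steps."""
    prev, last = prices[-2], prices[-1]
    forecasts = []
    for _ in range(h):
        nxt = last + c + phi * (last - prev)
        forecasts.append(nxt)
        prev, last = last, nxt
    return forecasts


if __name__ == "__main__":
    # Hypothetical monthly prices (yuan per ton), for illustration only.
    demo = [6500.0, 6560.0, 6610.0, 6590.0, 6640.0, 6700.0, 6730.0]
    c_hat, phi_hat = fit_arima_110(demo)
    print(forecast_arima_110(demo, c_hat, phi_hat, 3))
```

Multi-step forecasts are produced recursively, so each forecast is fed back in as if it were an observation; this is why forecast errors of such a model tend to accumulate over longer horizons.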
In terms of smoothing models, the Holt-Winters exponential smoothing model (Winters, 1960; Holt, 2004) is proposed in this study for predicting the gasoline prices. Besides having specific components for capturing potential trend and seasonality, this model continuously revises forecasts using new observations and includes all observed data by assigning weights that decline exponentially as the observations recede further into the past. Regular moving average and weighted moving average models can be treated as special cases of the Holt-Winters model. The general form of the model, in its multiplicative version with s seasons, is stated as follows,
$$\begin{aligned} L_t &= \alpha\,\frac{Y_t}{S_{t-s}} + (1 - \alpha)\,(L_{t-1} + \mathrm{Trend}_{t-1}),\\ \mathrm{Trend}_t &= \beta\,(L_t - L_{t-1}) + (1 - \beta)\,\mathrm{Trend}_{t-1},\\ S_t &= \gamma\,\frac{Y_t}{L_t} + (1 - \gamma)\,S_{t-s}, \end{aligned}$$
where $L_t$ is the level estimate of $Y_t$ for period t, $\mathrm{Trend}_t$ is the trend estimate calculated at period t, and $S_t$ is the seasonal factor estimate. The three smoothing parameters ($\alpha$, $\beta$, and $\gamma$) take values between 0 and 1. The h-step-ahead forecast of the series (for $h \le s$) can then be obtained using the formula
$$\hat{Y}_{t+h} = (L_t + h\,\mathrm{Trend}_t)\,S_{t+h-s}.$$
Another forecasting method, the grey forecasting model, has become popular over the last two decades among researchers in Asia. It is based on the grey systems theory established by Deng (1982). According to the grey theory (see, for example, Liu, Forest, & Yang, 2012; Deng, 1982; Kayacan, Ulutas, & Kaynak, 2010; Zhou & He, 2013), for a non-negative sequence $Y^{(0)} = \{y^{(0)}(1), y^{(0)}(2), \ldots, y^{(0)}(n)\}$, where $y^{(0)}(t) = Y_t$ stands for the gasoline price at time t, a monotonically increasing accumulating generation operation (AGO) series can be created as $Y^{(1)} = \{y^{(1)}(1), y^{(1)}(2), \ldots, y^{(1)}(n)\}$, where $y^{(1)}(k) = \sum_{i=1}^{k} y^{(0)}(i)$. By defining a mean sequence $z^{(1)}(k) = 0.5\,y^{(1)}(k) + 0.5\,y^{(1)}(k-1)$, $\forall k = 2, 3, \ldots, n$, a GM(1,1) grey model can be stated in the following common form,
$$y^{(0)}(k) + a\,z^{(1)}(k) = b. \tag{6}$$
The parameters a and b, which are called the development coefficient and the grey action quantity, can be obtained using ordinary least squares estimates. The whitenization differential equation of the model is
$$\frac{dy^{(1)}(t)}{dt} + a\,y^{(1)}(t) = b. \tag{7}$$
Therefore, the time response function of the model can be derived as
$$\hat{y}^{(1)}(k) = \left(y^{(0)}(1) - \frac{b}{a}\right)e^{-a(k-1)} + \frac{b}{a}, \tag{8}$$
from which the forecast of the original series at time period k can be obtained using the following equation,
$$\hat{y}^{(0)}(k) = \hat{y}^{(1)}(k) - \hat{y}^{(1)}(k-1). \tag{9}$$
To deal with the non-continuous character of the time series records, Zhou and He (2013) suggested using a discrete-type time response function.
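The GM(1,1) steps described above (accumulation, mean sequence, least squares estimation of a and b, and the continuous time response) can be sketched in Python as follows. This is a minimal sketch under the standard GM(1,1) conventions, not the authors' code; the function names and the demonstration series are hypothetical, and the discrete-type response of Zhou and He (2013) is not implemented.

```python
import math


def fit_gm11(y0):
    """Estimate (a, b) in y0(k) + a * z1(k) = b by ordinary least squares."""
    # Accumulating generation operation (AGO): running sums of the original series.
    y1, total = [], 0.0
    for value in y0:
        total += value
        y1.append(total)
    # Mean sequence z1(k) = 0.5 * y1(k) + 0.5 * y1(k - 1), for k = 2..n.
    z1 = [0.5 * y1[k] + 0.5 * y1[k - 1] for k in range(1, len(y1))]
    target = y0[1:]
    # Simple regression of y0(k) on z1(k): the slope equals -a, the intercept equals b.
    m = len(z1)
    mean_z, mean_t = sum(z1) / m, sum(target) / m
    szz = sum((z - mean_z) ** 2 for z in z1)
    szt = sum((z - mean_z) * (t - mean_t) for z, t in zip(z1, target))
    slope = szt / szz
    return -slope, mean_t - slope * mean_z


def gm11_forecast(y0, a, b, k):
    """Forecast of the original series at period k (1-indexed, k >= 2)."""
    def ago_hat(j):
        # Continuous time response of the whitenization equation.
        return (y0[0] - b / a) * math.exp(-a * (j - 1)) + b / a
    return ago_hat(k) - ago_hat(k - 1)


if __name__ == "__main__":
    # Hypothetical prices (yuan per ton), for illustration only.
    demo = [6600.0, 6650.0, 6720.0, 6760.0, 6830.0, 6900.0]
    a_hat, b_hat = fit_gm11(demo)
    print(a_hat, b_hat, gm11_forecast(demo, a_hat, b_hat, len(demo) + 1))
```

Because the time response is a single exponential curve, the model can only reproduce monotone growth or decay, which is worth keeping in mind when the price series changes direction.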
Recently, owing to the dramatic development of information technology, machine learning models have become popular and have been applied extensively in different areas by industry, academia, and government agencies. Two that are gaining popularity for classification and forecasting are artificial neural networks (ANNs) and support vector regression (SVR) models.
According to Zhang, Patuwo, and Yu (1998) and Huang, Lai, Nakamori, Wang, and Yu (2007), ANNs could significantly improve forecasting results. This paper adopts one of the most widely applied ANN models, the feed-forward network (FNN) model, which has only one hidden layer (Haykin, 2009). The FNN maps the previous periods' observations, via a nonlinear activation function g(·), to the gasoline price in the future time period. The model expresses the price $Y_t$ as a weighted combination of the outputs of q hidden nodes, each obtained by applying g(·) to a weighted sum of the p lagged prices $Y_{t-1}, \ldots, Y_{t-p}$, plus an error term, and the activation function takes the common logistic form $g(x) = 1/(1 + e^{-x})$. Estimates of the weight parameters are obtained through a least squares process that minimizes the mean squared error (MSE). The values of the lag p and the size q can be determined using cross-validation or using the AIC and BIC criteria.
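To make the mapping from lagged prices to the predicted price concrete, the following is a minimal sketch of the forward pass of a single-hidden-layer network with a logistic activation and a linear output. All names (hidden_weights, hidden_bias, output_weights, output_bias) and the numerical values are hypothetical; the weights would normally come from training, which is not shown here, and inputs are assumed to be scaled to a range suitable for the logistic function.

```python
import math


def logistic(x):
    """Logistic activation g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def fnn_predict(lags, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass: p lagged inputs -> q logistic hidden nodes -> one linear output.

    hidden_weights is a list of q rows, each holding p weights (one per lag).
    """
    hidden = [
        logistic(bias + sum(w * x for w, x in zip(row, lags)))
        for row, bias in zip(hidden_weights, hidden_bias)
    ]
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))


if __name__ == "__main__":
    # Hypothetical scaled lags (p = 3) and weights for q = 2 hidden nodes.
    lags = [0.52, 0.50, 0.47]
    print(fnn_predict(lags,
                      hidden_weights=[[0.8, -0.3, 0.1], [-0.5, 0.6, 0.2]],
                      hidden_bias=[0.1, -0.2],
                      output_weights=[0.7, 0.4],
                      output_bias=0.05))
```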
The SVR model, according to Müller et al. (1997), applies a nonlinear mapping $\phi(\cdot)$ of the data x to a high-dimensional feature space in which a linear regression can be constructed as
$$f(x) = \sum_{i} \omega_i\,\phi_i(x) + b,$$
where x is a vector of lagged values of the time series $Y_t$, and the $\omega_i$'s and b are parameters which can be estimated by minimizing an empirical risk function. Inside the risk function, $L_\varepsilon$ is Vapnik's $\varepsilon$-insensitive loss function,
$$L_\varepsilon(y, f(x)) = \max\{0,\ |y - f(x)| - \varepsilon\}.$$
Vapnik (1995; 1998) shows that the optimal decision hyperplane is obtained as
$$f(x) = \sum_{i} (\alpha_i - \alpha_i^{*})\,K(x_i, x) + b^{*},$$
where $\alpha_i$ and $\alpha_i^{*}$ are Lagrange multipliers which, together with $b^{*}$, solve the maximization problem in a convex quadratic form, and $K(x_i, x)$ is the kernel function, which is often chosen to be the Gaussian radial basis function (RBF) with a tuning parameter (Müller et al., 1997; Cao & Tay, 2003).
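The two building blocks of the SVR model, the ε-insensitive loss and the kernel expansion of the fitted function, can be illustrated with the short Python sketch below. It assumes the RBF kernel is parameterized as exp(-gamma * ||u - v||^2), which is one common convention; the names dual_coefs (standing for the differences of the Lagrange multipliers), intercept, and gamma are hypothetical, and solving the quadratic program that produces these quantities is not shown.

```python
import math


def eps_insensitive_loss(y, f, eps):
    """Vapnik's loss: errors smaller than eps cost nothing, larger ones cost linearly."""
    return max(0.0, abs(y - f) - eps)


def rbf_kernel(u, v, gamma):
    """Gaussian RBF kernel, assuming the form exp(-gamma * squared distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, intercept, gamma):
    """Kernel expansion: intercept + sum of coef_i * K(sv_i, x) over the support vectors."""
    return intercept + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


if __name__ == "__main__":
    # Hypothetical support vectors (two lagged, scaled prices each) and coefficients.
    svs = [[0.50, 0.48], [0.61, 0.59], [0.40, 0.43]]
    coefs = [0.9, -0.4, 0.2]
    print(svr_predict([0.55, 0.52], svs, coefs, intercept=0.5, gamma=2.0))
```

Only observations with a nonzero coefficient enter the sum, which is why a fitted SVR model is summarized by its support vectors.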
With the availability of alternative prediction models, some researchers have argued that, empirically, ensemble learning methods can often result in better forecasts (Kuncheva & Whitaker, 2003; Brown, Wyatt, Harris, & Yao, 2005; Zhou, 2012). For example, Baumeister et al. (2017) pointed out that different models outperformed others over different forecast horizons, and they claimed that using a simple average of the predictions from all models yielded the best forecasts. The simple average method can be treated as the simplest form of ensemble learning, whose popular methods include boosting (Freund & Schapire, 1997; Mason, Baxter, Bartlett, & Frean, 2000), bagging (Breiman, 1996), stacking (Smyth & Wolpert, 1999; Clarke, 2003), and Bayesian model averaging (Hoeting, Madigan, Raftery, & Volinsky, 1999; Amini & Parmeter, 2011). Taking the Bayesian model averaging (BMA) method as an example, and assuming that the frequencies with which the k proposed models are selected for prediction follow a multinomial distribution with parameters $p_i$, a conjugate Dirichlet prior distribution can be assigned to those $p_i$'s, denoted as $(p_1, \ldots, p_k) \sim \mathrm{Dirichlet}(\alpha_1, \ldots, \alpha_k)$, where $\alpha_1, \ldots, \alpha_k$ are specified parameters. Given the different models' prediction performances over the sample data, the posterior distribution of the $p_i$'s also follows a Dirichlet distribution with updated parameters $\alpha_i^{*} = \alpha_i + n_i$, in which $n_i$ is the frequency count of model i providing better forecasts than the other models. The posterior mean values of the $p_i$'s, $\alpha_i^{*} / \sum_{j=1}^{k} \alpha_j^{*}$, can then be used as weights for calculating the BMA forecasts.
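The Dirichlet updating described above reduces to simple arithmetic, as the following minimal Python sketch shows. The prior parameters, the win counts, and the individual forecasts are hypothetical values chosen only for illustration.

```python
def dirichlet_posterior_weights(prior_alpha, win_counts):
    """Posterior mean of a Dirichlet: (alpha_i + n_i) / sum_j (alpha_j + n_j)."""
    posterior = [a + n for a, n in zip(prior_alpha, win_counts)]
    total = sum(posterior)
    return [p / total for p in posterior]


def weighted_forecast(forecasts, weights):
    """Combine the individual model forecasts using the posterior mean weights."""
    return sum(w * f for w, f in zip(weights, forecasts))


if __name__ == "__main__":
    # Hypothetical setup: three models, a uniform prior, and counts of how often
    # each model produced the best forecast on the sample data.
    weights = dirichlet_posterior_weights([1.0, 1.0, 1.0], [14, 7, 3])
    print(weights, weighted_forecast([6800.0, 6750.0, 6900.0], weights))
```

With a uniform prior and no win counts, the weights are equal and the combination reduces to the simple average of the forecasts.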

Empirical Analysis of Predicting the Gasoline Prices in China
In what follows, the data-driven prediction models stated in the previous section are applied to analyze the gasoline price time series obtained from the Chinese market. The price data analyzed are the monthly records (from Jan. 2006 to Dec. 2017, measured in RMB yuan per ton) of #93 gasoline wholesale prices in the Shanghai region, provided by Sinopec Group. As Ma et al. (2009) pointed out, owing to the integrated character of the energy markets in China, the data can be used to represent the general gasoline price level in China. To compare the performances of the different models, the mean square error criterion, $MSE = \frac{1}{n}\sum_{t}(Y_t - \hat{Y}_t)^2$, is applied because of its consistency and convenience of application.
A check of the time-series plot (Figure 1) of the gasoline prices reveals obvious trends, increasing in the first six years and decreasing since the summer of 2014, which implies non-stationarity of the time series.
Applying the Augmented Dickey-Fuller test to the series suggests the existence of a unit root; therefore, the first difference of the gasoline price series, Diff($Y_t$), is obtained for further analysis to estimate the ARIMA model. In addition, the White test (White, 1980) shows that the homoscedasticity assumption is not violated within the differenced time series. Based on the autocorrelation (ACF) and partial autocorrelation (PACF) functions shown in Figure 2, and using the Box-Jenkins methods, an ARMA(1,0) process with white noise error term $\varepsilon_t$ can be identified for the differenced series (Equation (17)). From Equation (17), the gasoline price in month t can be estimated using the function $\hat{Y}_t = 13.03\ \ldots$ Applying the exponential smoothing model to the gasoline price series, the smoothing parameters are estimated to be $\alpha = 0.98$, $\beta = 0.01$, and $\gamma = 0$, based on minimizing the MSE value.
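To illustrate how smoothing parameters such as α = 0.98 and β = 0.01 enter the forecast, the sketch below runs the level and trend recursions of the smoothing model in Python. It assumes the seasonal factors are fixed at 1, which is an assumption made here for simplicity; the initialization rule and the demonstration prices are also hypothetical and are not taken from the study.

```python
def holt_forecast(prices, alpha, beta, h):
    """Level-and-trend exponential smoothing with seasonal factors assumed equal to 1.

    Assumed initialization: level = first price, trend = first observed change.
    Returns the 1- to h-step-ahead forecasts made at the end of the sample.
    """
    level = prices[0]
    trend = prices[1] - prices[0]
    for y in prices[1:]:
        prev_level = level
        # The new level blends the latest observation with the previous projection.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # The new trend blends the latest level change with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, h + 1)]


if __name__ == "__main__":
    # Hypothetical monthly prices (yuan per ton), for illustration only.
    demo = [6500.0, 6560.0, 6610.0, 6590.0, 6640.0, 6700.0]
    print(holt_forecast(demo, alpha=0.98, beta=0.01, h=3))
```

With α close to 1 the level tracks the latest observation almost exactly, and with β close to 0 the trend is revised very slowly, so the forecasts behave much like the last observed price plus a nearly constant drift.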
The seasonality component within the model can then be dropped, which is also apparent from the time-series plot and is supported by the insignificance of the monthly indicator variables when the gasoline price is regressed on eleven monthly indicators (the p-values of the eleven t statistics corresponding to the coefficients range from 0.42 to 0.98). Therefore, the Holt-Winters model is estimated with the level and trend components only. For the grey forecasting model established above, based on the least squares method, the development coefficient and the grey action quantity are estimated to be a = -0.00191 and b = 6651.96, which can then be used to obtain the forecasts of the gasoline prices using Eq. (9).
Estimation of the FNN and SVR models is conducted using the nnet (Ripley, 2016) and e1071 (Meyer et al., 2017) packages within the R programming environment. For the FNN model, based on minimizing the MSE of the forecasting results, the lag and size parameters are estimated to be p = 5 and q = 5. With the help of the clusterGeneration (Qiu & Joe, 2015) R package, Figure 3 presents the model estimate graphically; the black and grey lines stand for the positive and negative weights, and the thickness of each line is proportional to the magnitude of the corresponding weight. In the first layer, "I1" to "I5" are the five lagged input values, and the target variable, the future period's gasoline price, is labeled "O1", the only output value in the rightmost layer. The hidden layer in the middle has its nodes labeled "H1" to "H5", and "B1" and "B2" are bias nodes that supply constant values to the nodes within the neural network. Estimation of the SVR model results in 85 support vectors, whose detailed information is excluded from this study as the focus here is the prediction performance of the model.
Comparing the goodness of fit of all five models constructed above, the ARIMA(1,1,0) model provides the best results in terms of the MSE and the $R^2$ measurement. The $R^2$ value here is calculated by squaring the correlation coefficient between the observed and predicted gasoline prices, i.e., $R^2 = [\mathrm{Corr}(Y_t, \hat{Y}_t)]^2$. Table 1 lists the relevant measurements from these models, from which it is noted that the grey forecasting model explains only about one fifth of the variation of the gasoline prices, much less than the other models. On the other hand, although the ARIMA model has the smallest MSE measurement, its performance is not significantly better than that of the smoothing and FNN models. It does, however, provide a more parsimonious form, which is favorable for conducting the analysis here.
Better performance in fitting the given historical price data often does not guarantee that the same model will outperform others in predicting future gasoline prices. Since the purpose of the forecasting models is to obtain accurate predictions of future prices, it is necessary to conduct out-of-sample cross-validation to compare the prediction performances of the models developed above. In doing so, and also to check the robustness of the different models' performances, first the last observation and then, period by period, up to the last 24 months' observations are taken out and used as the test data, with the rest serving as the training data. Therefore, 24 out-of-sample prediction results are generated for each model. Besides the MSE criterion, the mean absolute percentage error (MAPE) measurement is also used to compare the out-of-sample prediction accuracies of the different models.
An interesting finding from the prediction results is that for a short to medium time horizon, i.e., from the next month up to the next five months, the ARIMA(1,1,0) model performs the best among all five data-driven forecasting models. For a horizon of medium length (6 to 12 months), the support vector regression (SVR) model outperforms the others, and for a horizon of 13 to 24 months, the feed-forward network (FNN) model consistently provides better prediction results under both the MSE and MAPE criteria. Table 2 lists those measurements for each model based on their out-of-sample predictions for the next one to 24 months.
The comparison above reveals the robustness of the prediction performance of the ARIMA model for the short to medium forecast horizon, of the SVR model for the medium horizon, and of the FNN model for the long run. Because of that, applying the BMA ensemble method here does not provide additional improvement, with the posterior mean estimates assigning large weights to the three models identified above. Considering the significant increase in complexity, the BMA model can be dropped from predicting the gasoline price in the Chinese market, based on the current gasoline price time series. Besides, given the goodness-of-fit and out-of-sample prediction results, the grey forecasting model does not seem as suitable as the other models for predicting gasoline prices in the Chinese market; therefore, it can also be excluded from the group of candidate forecasting models.
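The holdout scheme described above, in which the last h observations are withheld for h = 1, 2, ..., 24 and each model is scored by MSE and MAPE, can be sketched in Python as follows. The forecaster shown is a placeholder (it simply repeats the last observed price) and is not one of the five models in this study; the function names and the demonstration series are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train, h):
    """Placeholder model: repeat the last training observation h times."""
    return [train[-1]] * h


def holdout_scores(series, forecaster, max_horizon):
    """For each h, train on all but the last h points and score the h-step forecasts."""
    scores = {}
    for h in range(1, max_horizon + 1):
        train, test = series[:-h], series[-h:]
        predicted = forecaster(train, h)
        scores[h] = (mse(test, predicted), mape(test, predicted))
    return scores


if __name__ == "__main__":
    # Hypothetical monthly prices (yuan per ton), for illustration only.
    demo = [6500.0, 6560.0, 6610.0, 6590.0, 6640.0, 6700.0, 6730.0, 6690.0]
    for horizon, (m, p) in holdout_scores(demo, naive_forecast, 3).items():
        print(horizon, round(m, 2), round(p, 3))
```

Any function with the same (train, h) signature can be passed in place of the placeholder, which makes it straightforward to compare several models horizon by horizon.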
In addition, it should be noted that the above discussion does not mean that the smoothing model should not be considered in the future for predicting the prices of gasoline products in China. Following changes in the energy market and the development of socioeconomic conditions (e.g., the new government policy of offering preferential tax credits for cars using alternative energy), the behavior of the gasoline price time series will change accordingly. Thus, as new patterns enter the time series, different models may outperform others over different time horizons, in which case the ensemble method is more likely to produce better forecasting results.

Conclusions
The gasoline price influences people's living expenditures and their decisions on auto purchases and travel mode. Accurate forecasts of the price level are important for automakers because of the direct impact on their total sales. They are also critical for government agencies in implementing effective policies, especially for the Chinese government in controlling inflation, protecting the environment, and alleviating traffic congestion in China's big cities.
Considering the difficulty of obtaining accurate estimates of the external factors which influence the gasoline price, the data-driven forecasting models introduced in this paper provide convenience and practicability for predicting the price levels in future time periods. Five specific models were established and applied to the monthly gasoline price time series obtained in the Chinese market. Based on the empirical analysis, a parsimonious ARIMA model is identified as suitable for conducting predictions within a time horizon of less than six months. For a forecast horizon of six months or more, the SVR and FNN models are proposed because of their ability to generate more accurate forecasting results than the other models.
The models introduced here can be applied to analyze time-series data on the prices of other gasoline products in China, or on gasoline prices in other markets. In such cases, different models may outperform others over different forecast horizons, and the ensemble method, which is based on the prediction results of alternative data-driven models, may be more likely to provide more accurate forecasts.

Figure 1. Time-series plot of #93 gasoline prices

Figure 2. Autocorrelation function and partial autocorrelation function for Diff(Yt) (with 5% significance limits)

Figure 3. Network diagram of the FNN model
Note. The black lines stand for the positive weights, with the grey ones representing the negative weights. The thickness of a line is in proportion to the absolute value of the corresponding weight.

Table 1. Goodness of fit comparison of the forecasting models
From the table, it is noted that the ARIMA model carries the smallest MSE and the largest R² measurements. Considering its parsimonious structure, the ARIMA model fits the given data the best, compared to the other models constructed.

Table 2. Performances of out-of-sample predictions