Forecasting the Gold Returns with Artificial Neural Network and Time Series

Gold is an important investment tool, especially in developing countries. The return on gold and its prediction have attracted the attention of investors and have been studied intensively in recent years. For this reason, different methods are used to predict the return on gold, and the effectiveness of these methods is compared. The purpose of this study is to predict the return on gold using artificial neural networks (ANN) and GARCH and its derivatives, which are conventional time series methods, based on the return series obtained from the gold prices provided by the Turkish Gold Exchange for the period between February 2010 and June 2014. Contrary to expectations and to the majority of similar studies, ANN provided less successful outcomes than the GJR GARCH method.


Introduction
Predicting the future on the basis of past data is an indispensable tool, in financial markets in particular. When these markets are examined, no single method stands out as able to model the direction of movements precisely, owing to the multiplicity of variables and the volatility of the elements that form these markets. For this reason, financial markets carry more risk than other fields of investment, and prices are exposed to both negative and positive changes.
Artificial neural networks (ANN) are one of the artificial intelligence methods that have recently been used for prediction. ANN is preferred for prediction because of its multivariate and non-parametric structure and its suitability for non-linear time series. ANNs are especially suitable for problems that cannot easily be modelled mathematically; they can learn new conditions by changing their weights, they can be trained with different learning algorithms, and what they have learned can be reused to solve similar problems. A trained ANN can act as a specialist, analyse the data that it is given and make projections.
The price of gold is one of the important market indicators; both investors and decision makers wish to determine its future value. Investing in gold usually aims at protection from inflation, political risk and crises in the long term, and at benefiting from price fluctuations in the short term. Its long history and wide acceptance by economic units have made gold and its volatility, which is an indicator of the risk in gold, an essential topic. Most studies describe gold as a safe investment tool for investors in unsteady periods. Different methods are used for predicting gold prices, and they are compared in order to determine the best-performing one.
The purpose of this study is to compare the performance of ANN in predicting gold returns with that of a time-series analysis method, the GARCH family of models.

Literature Review
Those who have to make investment decisions, especially economists and decision-makers, make predictions about the future. To this end, much effort goes into the prediction of economic data such as stock markets, exchange rates, inflation and gold.
Prediction methods for time series, which are especially favoured by econometricians, have become more effective by being combined with new techniques made possible by advances in computer technology.
Several studies have been carried out on gold prices, especially recently. In some of them ANN stands out for its superior performance; in others it is compared with conventional methods (Özdemir et al., 2013; Chamzini and Yakchali, 2013; Parisi et al., 2008; Shafiee and Topal, 2010; Lineesh et al., 2010; Roy and Singh, 2014). Tully and Lucey (2007) examined the movements in gold prices with ARCH-GARCH analysis and found that the returns on stocks decreased when the price of gold increased. Shafiee and Topal (2010) examined 40 years of gold prices and found a high correlation between gold and oil prices. Capie et al. (2005) investigated, over 33 years, the relation between weekly gold prices and the pound-USD and yen-USD exchange rates, and the behaviour of gold as a hedge against exchange rate risk, using GARCH, threshold GARCH and exponential GARCH methods. They concluded that the GARCH model was the most suitable one and that it showed statistically that gold provided protection from exchange rate risk. Do et al. (2009) examined the relation between the volatility of stock markets in some ASEAN countries, namely Indonesia, Malaysia, Thailand and Vietnam, and the returns of the international gold market with GARCH and GJR models. They found that, with the exception of Vietnam, GJR (1,1) was the most suitable model.

Artificial Neural Networks
Artificial neural networks (ANNs) is the general term for computer systems whose working principles resemble those of the human nervous system. The field involves studies directed at training computers to learn. ANNs are also computer programs that simulate the neural networks of the human brain. Because of their non-linear structure and probabilistic behaviour, ANNs can reach better solutions than linear methods.
ANNs are data-based systems formed by connecting artificial neural cells in layers; through simplified models they simulate some abilities of the human brain, such as learning and fast decision-making under various conditions, in order to solve complex problems.
The most generic ANN model is shown in Figure 1.
Figure 1. The most generic Artificial Neural Network model

Inputs $(x_1, x_2, \dots, x_n)$ are the information that enters the cell from other cells or from the environment. They are determined by the examples that the network is asked to learn. Weights $(w_1, w_2, \dots, w_n)$ are the values that express the effect of an element in the input set, or of a processing element in a previous layer, on the current processing element. Each input is multiplied by the weight that connects it to the processing element, and the products are combined by the sum function to give the net information entering the cell. The net input is calculated as follows:

$$\Theta = \sum_{i=1}^{n} w_i x_i + b$$

where $\Theta$ is the total input entering the neuron and $b$ is the threshold value, a negative or positive value imported from the environment. Another factor that determines the behaviour of a neural cell is the activation (transfer) function, which processes the net input of the artificial neural cell and determines the output produced in response to it. The most widely used transfer functions are the threshold, sigmoid and hyperbolic tangent functions (Kaastra, 1996, p. 227). The messages that a cell sends to the exterior, to itself or to other cells are called "outputs". A neural cell can have multiple inputs but only one output, which can act as the input to several cells. The output can be expressed as follows (Tosunoğlu & Benli, 2012, p. 543):

$$y = f(\Theta) = f\left(\sum_{i=1}^{n} w_i x_i + b\right), \qquad x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$$

where $y$ is the output of the system or of a single neuron, $f$ is the activation function, $\Theta$ is the net input and $b$ is the threshold value.

ANNs have been used intensively for time series analysis in recent years. Obtaining predictions for a time series with artificial neural networks can be summarised in the seven steps listed below (Benli & Yildiz, 2012, p. 160):

Step 1: Preliminary processing of data: first, the data are converted into the [0, 1] interval. If the logistic activation function is to be used, the input (observation) values $x_i$ are converted into the [0, 1] interval as follows:

$$x_i^{*} = \frac{x_i - \min(x)}{\max(x) - \min(x)}$$

Step 2: The shares of the training and testing sets in the data set are decided. Usually 10% to 20% of the data is reserved for the testing set.
Step 3: The number of inputs, the number of hidden layers, the number of units in each hidden layer, the number of units in the output layer, the activation function used in these units, the learning algorithm and its parameters, and the performance criteria are determined, and the ANN model to be used is constructed.
Step 4: Construction of input values: the input values of the ANN are lagged time series. For a time series $x_t$, $m$ lagged series (where $m$ is the number of units in the input layer) are constructed as $x_{t-1}, x_{t-2}, \dots, x_{t-m}$.
Step 5: Calculation of the best weight values: the best weight values are found on the training set with the selected learning algorithm. Using these weights, the output values of the constructed ANN model are calculated.
Step 6: Calculation of performance criteria: the ANN estimates for the testing set are obtained. The inverse of the conversion applied in step 1 is applied to the output values obtained in step 5 and to the values obtained in this step. The resulting values constitute the predictions for the training set and the testing set, respectively. The selected performance criterion is calculated from the difference between the testing-set predictions and the testing-set data. Two of the most widely used performance criteria in the literature are the Mean Square Error (MSE) and the Mean Absolute Error (MAE), whose formulas are given below (Zhang, 1998, p. 51):

$$\mathrm{MSE} = \frac{1}{T}\sum_{t=1}^{T} (y_t - y_t')^2, \qquad \mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T} \left| y_t - y_t' \right|$$

where $T$ is the number of predictions, $y_t$ is the real value at time $t$ and $y_t'$ is the predicted value. The importance of the MSE criterion is that it can be decomposed into the variance components of the prediction errors. This feature shows that the MSE depends only on the second moments of the joint distribution of realisations and predictions. Nevertheless, it must be noted that it cannot provide full information on the actual distribution (Zhang, 1998, p. 51).
Step 7: Prediction: finally, the best weight values found in step 5 are used to obtain predictions for the times after the testing set, that is, for the future, using either the iterative or the direct prediction method.
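The data-handling parts of the steps above (steps 1, 2, 4 and 6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy price list and the 20% test share are hypothetical, and a naive "previous value" forecast stands in for the trained network of step 5.

```python
def min_max_scale(series):
    """Step 1: map observations into [0, 1]; also return (lo, hi) for the inverse."""
    lo, hi = min(series), max(series)
    return [(x - lo) / (hi - lo) for x in series], (lo, hi)

def inverse_scale(scaled, bounds):
    """Step 6: undo the min-max conversion."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]

def make_lagged(series, m):
    """Step 4: build input rows (x_{t-1}, ..., x_{t-m}) with target x_t."""
    inputs, targets = [], []
    for t in range(m, len(series)):
        inputs.append([series[t - k] for k in range(1, m + 1)])
        targets.append(series[t])
    return inputs, targets

def split(inputs, targets, test_share=0.2):
    """Step 2: keep the last test_share of the rows for testing (time order preserved)."""
    n_test = max(1, round(len(targets) * test_share))
    cut = len(targets) - n_test
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])

def mse(actual, predicted):
    """Mean square error of the predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error of the predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy series, not the gold data used in the study.
prices = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0, 17.0, 20.0]
scaled, bounds = min_max_scale(prices)
X, y = make_lagged(scaled, m=2)
(train_X, train_y), (test_X, test_y) = split(X, y, test_share=0.2)

# Naive stand-in for a trained network: predict x_t by x_{t-1}.
naive_pred = [row[0] for row in test_X]
actual = inverse_scale(test_y, bounds)
predicted = inverse_scale(naive_pred, bounds)
print(mse(actual, predicted), mae(actual, predicted))
```

The split keeps the observations in time order, so the testing set always lies after the training set, which matches the forecasting use described in step 7.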
ANNs accommodate several network structures and models. An artificial neural network consists of a series of interconnected neural cells arranged in feed-forward and back-propagation connection patterns. Today a number of artificial neural networks have been developed for different purposes and fields of use (Perceptron, Adaline, MLP, LVQ, Hopfield, Recurrent, SOM, ART, etc.). Among these, multi-layer feed-forward networks (Multiple Layer Perceptron, MLP) are the most widely used; they are also the networks that this study employs.
The back propagation algorithm is the training algorithm most widely used for Multiple Layer Perceptron networks.
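As an illustration of how back propagation trains a small MLP, the sketch below implements one hidden layer of sigmoid units and a sigmoid output unit in plain Python. The network size, learning rate, number of passes and the four toy samples are assumptions chosen for brevity; they are not the configuration used in this study.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, seed=0):
    """Small random weights; the last entry of each weight row is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(net, x):
    """Forward stage: propagate activations from the input to the output layer."""
    hidden, output = net
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h + [1.0])))
    return h, y

def train_step(net, x, target, lr=0.5):
    """Backward stage: propagate the output error back and adjust weights and biases."""
    hidden, output = net
    h, y = forward(net, x)
    delta_out = (y - target) * y * (1.0 - y)  # error signal at the output unit
    # Hidden error signals use the output weights before they are updated.
    delta_hid = [delta_out * output[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    for j, v in enumerate(h + [1.0]):
        output[j] -= lr * delta_out * v
    for j, row in enumerate(hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] -= lr * delta_hid[j] * v

def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)

# Hypothetical scaled samples: two lagged inputs and a target in [0, 1].
data = [([0.1, 0.2], 0.3), ([0.2, 0.3], 0.4), ([0.3, 0.4], 0.5), ([0.4, 0.5], 0.6)]
net = init_network(n_in=2, n_hidden=3)
error_before = mean_squared_error(net, data)
for _ in range(2000):
    for x, t in data:
        train_step(net, x, t)
error_after = mean_squared_error(net, data)
print(error_before, error_after)
```

The weights are updated after every sample (online gradient descent on the squared error); batch updates or a momentum term would be equally valid design choices.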
Figure 2. The algorithm of multiple layer back propagation Artificial Neural Networks

The BP (Back Propagation) algorithm is basically carried out in two stages: (i) the forward stage, in which activations are propagated from the input layer to the output layer, and (ii) the backward stage, in which the error between the value observed at the output layer and the desired target value is propagated back in order to change the weights and bias values. Before training can begin, the inputs and outputs of the training and test sets have to be fed to the network. The model used for time series in ANN is as follows (Ozkan, 2011, p. 187):

$$y_t = \sum_{j=1}^{q} \beta_j\, G(\gamma_j' z_t) + \varepsilon_t$$

where $y_t$ is the prediction equation, $\beta = (\beta_1, \dots, \beta_q)'$, $\gamma_j = (\gamma_{j0}, \gamma_{j1}, \dots, \gamma_{j,p-1}, \gamma_{jp})'$ and $j = 1, 2, \dots, q$ indexes the neurons; $z_t = (z_1, z_2, \dots, z_p, 1)' = (y_{t-1}, y_{t-2}, \dots, y_{t-p}, 1)'$ is the input vector, $\varepsilon_t$ is the error term and $G(\gamma_j' z_t)$ is the sigmoid activation function:

$$G(\gamma_j' z_t) = \frac{1}{1 + e^{-\gamma_j' z_t}}$$

The ARCH and GARCH models, proposed by Engle (1982) and Bollerslev (1986) respectively, are successful in capturing the fat-tailed structure of the distribution and volatility that changes over time. The following section gives the definitions of the models used in the application.

1). ARCH Model
Heteroskedasticity, the problem of non-constant variance, occurs in regression analyses that use financial data in particular. The ARCH model specifies the conditional variance at time t as a function of the squares of past error terms (Engle, 1982, p. 1002). In the ARCH model, it is assumed that the characteristic behaviour of the prediction errors depends on the regression residuals, which are autocorrelated (Gökce, 1998, p. 57).

2). GARCH Model
In the GARCH model, the conditional variance is a linear function of past squared errors and past conditional variances (Bollerslev, 1986, p. 44). In GARCH (p, q) models the variance of the error terms is affected both by past squared errors and by past conditional variance values. Fewer parameters are needed than in ARCH models.
The model has to meet the following conditions so that a successful variance prediction can be made (Bollerslev, 1987, p. 542): the constant term of the conditional variance equation must be positive, the remaining parameters must be non-negative, and their sum must be smaller than one. The last condition is important for obtaining a finite variance for the model (Green, 1993, p. 570).
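For reference, the two models can be written in their standard textbook form as below. The notation ($\omega$ for the constant, $\alpha_i$ for the ARCH terms, $\beta_j$ for the GARCH terms, $\varepsilon_t$ for the error term and $\sigma_t^2$ for the conditional variance) is an assumption of this sketch and may differ from the symbols used in the cited sources.

```latex
% ARCH(q): the conditional variance depends on q lags of the squared error
\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2

% GARCH(p, q): adds p lags of the conditional variance itself
\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2
           + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2

% Positivity and finite-variance conditions
\omega > 0, \quad \alpha_i \ge 0, \quad \beta_j \ge 0, \quad
\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1
```

In the GARCH (1,1) case the last condition reduces to $\alpha_1 + \beta_1 < 1$, and the unconditional variance is then $\omega / (1 - \alpha_1 - \beta_1)$.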
Financial series usually show features such as excess kurtosis, volatility clustering and the leverage effect. Predicting the volatility of gold prices with symmetric changing-variance models can therefore be incomplete, and asymmetric conditional variance models, which take into account the different impact of negative and positive shocks on volatility, may be needed. For this reason, such models are also included in the following sections of the study.

3). EGARCH Model
EGARCH models, also known as exponential GARCH models, were proposed by Nelson (1991). Unlike GARCH models, which take into account only the size of shocks, they also capture the difference between the effects of negative and positive shocks. Because the model is written for the logarithm of the conditional variance, the conditional variance is positive without any requirement that the parameters be non-negative (Teravista & Timo, 2009, pp. 34-35).
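A common way of writing the EGARCH (1,1) model is given below as an illustration. The symbols are assumptions of this sketch: $z_t = \varepsilon_t / \sigma_t$ is the standardised shock, $\alpha$ measures the effect of the size of the shock and $\gamma$ the effect of its sign.

```latex
% EGARCH(1,1): the equation is written for the log of the conditional variance
\ln \sigma_t^2 = \omega + \beta \ln \sigma_{t-1}^2
               + \alpha \left( |z_{t-1}| - \mathrm{E}|z_{t-1}| \right)
               + \gamma z_{t-1},
\qquad z_t = \frac{\varepsilon_t}{\sigma_t}
```

Since $\sigma_t^2 = \exp(\ln \sigma_t^2)$ is positive for any parameter values, no sign restrictions are needed; with $\gamma < 0$, a negative shock raises volatility more than a positive shock of the same size.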

4). GJR GARCH Model
The threshold model, which considers the different and asymmetric effects of positive and negative shocks, was proposed by Glosten, Jagannathan and Runkle (1993). An unexpected increase in the series is treated as good news, which affects the conditional variance through the parameter $\alpha_i$. An unexpected fall, on the other hand, is treated as bad news, and it affects the conditional variance through $\alpha_i + \gamma_k$ (Chen, 2005, p. 4).
In addition, the leverage effect is quadratic in the GJR GARCH model, whereas it is exponential in EGARCH (Ozden, 2008, p. 325).
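The asymmetric response can be illustrated with a short recursion for a GJR GARCH (1,1) conditional variance. This is a sketch under stated assumptions: it uses the standard form in which the squared shock carries the weight alpha after good news and alpha + gamma after bad news, and all parameter values are hypothetical rather than the estimates reported in this study.

```python
def gjr_garch_variance(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances of a GJR GARCH (1,1) model.

    sigma2[t] = omega + (alpha + gamma * I[eps[t-1] < 0]) * eps[t-1]**2
                + beta * sigma2[t-1]
    """
    variances = [initial_variance]
    for eps in residuals[:-1]:
        bad_news = 1.0 if eps < 0 else 0.0  # indicator of an unexpected fall
        variances.append(omega + (alpha + gamma * bad_news) * eps ** 2
                         + beta * variances[-1])
    return variances

# Hypothetical parameters, chosen only to show the asymmetry.
params = dict(omega=0.01, alpha=0.05, gamma=0.10, beta=0.85, initial_variance=0.2)

# Same shock size, opposite signs: compare the variance one step later.
after_good = gjr_garch_variance([0.5, 0.0], **params)[1]
after_bad = gjr_garch_variance([-0.5, 0.0], **params)[1]
print(after_good, after_bad)
```

The gap between the two values equals gamma times the squared shock, which is the quadratic leverage effect mentioned above.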

Data
The fundamental purpose of this study is to decide which method can be used to predict gold markets most precisely. The methods evaluated to this end are artificial neural networks and time series prediction methods. The study uses daily closing prices (USD/ounce) between 12.02.2010, when the Turkish Gold Stock Market became a member of the Federation of Euro-Asian Stock Exchanges (FEAS), and 30.06.2014.
In order to represent the return on gold, the closing prices were converted into a return series. Gold closing prices data and gold return data are given in figures 3 and 4.
Figure 3. Gold closing prices data
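A return conversion of the kind mentioned above can be sketched as follows. The exact definition used in the study is not recoverable from the text, so this sketch assumes the common logarithmic return, r_t = ln(P_t / P_{t-1}); the closing prices in the example are hypothetical.

```python
import math

def log_returns(prices):
    """Logarithmic returns r_t = ln(P_t / P_{t-1}) for consecutive closing prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices in USD/ounce, not the study's data.
closing = [1100.0, 1111.0, 1105.0, 1120.0]
returns = log_returns(closing)
print(returns)
```

Log returns add up over time, so the sum of the daily returns equals the log return over the whole period.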

Methodology
In this paper, a multi-layer ANN model of gold prices was first constructed, and back-propagation was chosen as the learning algorithm. Back-propagation networks have a structure in which the outputs of the output and intermediate layers are propagated back to the input units. The output of the neurons in the output layer depends not only on the current input values but also on previous ones. This characteristic is the reason why back-propagation is preferred in prediction models (Makay, 1992, p. 420).
When the ANN predictions for the gold price data are compared with the real data after training, it can be seen that the network performs well at both the learning stage and the prediction stage.
Non-linear ARCH models are used to model the conditional variances of series that are not normally distributed. When table 1 is examined, it can be seen that, according to the Jarque-Bera test (Jarque & Bera, 1980, p. 255), the Gold Return series is not normally distributed. The existence of an ARCH effect in the return series was tested for 2 lags with the ARCH LM test. An ARCH effect is observed in the series.

Gold Return Series Unit Root Test
In order to investigate whether the return series has a unit root, the time series and correlogram graphs were examined. It was observed that (i) the series oscillated around a certain mean value, (ii) the autocorrelation coefficients (ACF) did not take very high values and (iii) they decreased rapidly as the lag increased. This preliminary information indicates that the return series is stationary. ADF and PP unit root tests were applied in order to determine the order of integration with test statistics. The Akaike (AIC) and Schwarz (SC) criteria and Lagrange multiplier (LM) test statistics were used for decision-making. The results of these tests are given in tables 2 and 3. The ADF and PP test results show that there is no unit root in the Gold Return series. The return series is stationary.
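The correlogram check described above rests on the sample autocorrelation function, which can be computed as in the sketch below. The function name and the short return series are hypothetical; the sketch covers only the ACF and does not implement the ADF or PP tests.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# Hypothetical return series, not the study's data.
returns = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.4, -0.3]
acf = [autocorrelation(returns, k) for k in range(4)]
print(acf)
```

For a stationary series these coefficients stay well inside the interval from -1 to 1 and shrink towards zero as the lag grows, whereas a series with a unit root typically shows coefficients that stay close to 1 over many lags.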

Determination of Gold Return Series AR(P) and MA(Q) Levels
A time series cannot always be described correctly by a purely autoregressive (AR) or a purely moving average (MA) process. It can, however, be described by the ARMA (p,q) model, which was introduced to the literature by Wold (1938), combines these two structures and is called the autoregressive moving average model. In this section the suitable ARMA (p,q) structure of the return series was studied within the scope of the Box-Jenkins approach. When table 4 is examined, it can be seen that there is a strong ARCH effect at all lags in the ARMA (1,1) residuals. The null hypothesis that there is no ARCH effect is rejected.
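The idea behind the ARCH LM test can be sketched as follows: the squared residuals are regressed on their own lags, and the number of observations times the R² of that regression is compared with a chi-square critical value. For brevity the sketch uses a single lag, an assumption of this example; the function name and the residual series are hypothetical.

```python
def arch_lm_statistic(residuals):
    """ARCH LM statistic with one lag: n * R^2 from regressing e_t^2 on e_{t-1}^2."""
    squared = [e * e for e in residuals]
    y, x = squared[1:], squared[:-1]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # With one regressor and a constant, R^2 is the squared correlation.
    r_squared = sxy * sxy / (sxx * syy)
    return n * r_squared

# Hypothetical residuals with a calm period followed by a turbulent one.
residuals = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
statistic = arch_lm_statistic(residuals)
CHI2_1_CRITICAL_5PCT = 3.841  # 5% critical value of the chi-square distribution with 1 df
print(statistic, statistic > CHI2_1_CRITICAL_5PCT)
```

A statistic above the critical value leads to rejection of the null hypothesis of no ARCH effect. With more lags the same regression includes more lagged squared residuals, and the degrees of freedom of the chi-square distribution equal the number of lags.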

Determination of GARCH Model
Estimates were obtained for the GARCH, GJR GARCH and EGARCH models whose structures were described above. In order to choose correctly among these models, the statistical significance of the coefficients, the validity of the model restriction conditions, the AIC and SIC selection criteria, the R² value, the DW-d statistic and the LL value were taken into consideration. The cost-benefit analysis for choosing the right model is usually ignored. In such cases, statistical measures of prediction error are considered (Poon & Granger, 2003, p. 478). Among the symmetric prediction criteria, the mean squared error (MSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and Theil inequality coefficient (TIC) were used for selection. Low values of these criteria point to the model to be selected. The results are given in table 6. An examination of table 7 reveals that the ARCH effect in the GJR GARCH (1,1) residuals has disappeared and that there is no autocorrelation in the residuals.
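Two of the prediction criteria named above, MAPE and the Theil inequality coefficient, can be computed as in the sketch below. The Theil coefficient is assumed here in its common bounded form (root mean squared error divided by the sum of the root mean squares of the predictions and of the actual values); the function names and the example numbers are hypothetical.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def theil_inequality(actual, predicted):
    """Theil inequality coefficient: 0 for a perfect fit, at most 1."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    scale = (math.sqrt(sum(p * p for p in predicted) / n)
             + math.sqrt(sum(a * a for a in actual) / n))
    return rmse / scale

# Hypothetical actual values and predictions.
actual = [2.0, 4.0]
predicted = [1.0, 5.0]
print(mape(actual, predicted), theil_inequality(actual, predicted))
```

Because MAPE divides by the actual values, it becomes unstable for return series that pass close to zero, which is one reason to read it together with the other criteria.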
The risk of gold as a financial instrument can be measured with volatility analysis. Volatility can be defined as the momentary changes and movements in the prices of financial assets. In addition to knowing future prices, it is essential that the investor can foresee the risk as well. In this sense, modelling volatility and using it in prediction is highly important. The basic question in the volatility literature is whether the volatility of financial returns is predictable and, if it is, which model should be used to make the best prediction. In order to answer this question, this study set out to choose the model that best captures the volatility of gold prices in Turkey. The closing prices (USD/ounce) for the period 12.02.2010-30.06.2014 were used for this purpose, and the data were converted into a return series.
A variety of GARCH models, which can be classified as conditional volatility models, were evaluated in order to obtain the best prediction. The study concluded that the most successful model in explaining the volatility of gold prices is the GJR GARCH model. In this model, the difference between the impact of bad news and of good news on the conditional variance of gold returns can be observed, since the model does not treat positive and negative shocks as symmetrical. It was found that the impact of news on gold returns in Turkey is asymmetrical: the coefficient that captures the leverage effect is not zero. Good news is an unexpected rise in the series. In the case of bad news, the model responds with the parameter 0.07096. The positive coefficient obtained in the GJR GARCH model indicates the existence of a leverage effect, which shows that negative news has a greater effect on volatility than positive news. When the two prediction approaches are compared, the GJR GARCH model is the more effective prediction model in terms of both MSE and MAE values.

Conclusion
For centuries, gold has been an indispensable instrument in the financial system. Central banks demand gold as reserves, speculators demand it to determine their strategy, industry demands it as an input, and individuals and investors demand it for saving. The facts that it has been in the market for a very long time and that it is accepted by economic units have made gold and its return an essential issue.
Researchers have used a variety of methods for predicting the return on gold. Conventional prediction methods take the lead. However, artificial neural networks have recently come into use as an effective method.
In this paper, a conventional prediction model, GARCH and its derivatives, and ANN were examined on the gold return data of the Turkish Gold Stock Market for the period between February 2010 and June 2014. The study concluded that, contrary to expectations and in disagreement with several studies in the literature, the GJR GARCH model made slightly more successful predictions than ANN.

Figure 8. GJR GARCH (1,1) results. The model allows for good or bad news to create different impacts on volatility.

Table 1. Descriptive statistics

Figure 4. Gold return data

Descriptive Statistics
Before discussing the predictions of ARCH class models, a test has to be made in order to see whether daily returns on financial markets have a changing variance structure. Table 1 provides some descriptive statistics which are important for determining the conditional variance model of the return variable.

Table 3. PP unit root test (level data)

In line with the parsimony principle, models up to the ARMA (4,4) level were estimated. The selection among models with statistically significant coefficients took into consideration the size of the AIC and SC selection criteria, a high coefficient of determination (R²), a high log-likelihood (LL) value and a low Theil inequality coefficient. The results are given in table 4.

Table 6. Predictions of GARCH Models

When table 6 is examined, it can be seen that the most appropriate GARCH model is the GJR GARCH (1,1) model. The coefficients of this model are statistically significant, and it has the lowest AIC value and the highest R² and LL values. As in all the models, there is no autocorrelation. The GJR GARCH model also has lower values for the prediction criteria. Following the selection of the best GARCH model, the ARCH LM test statistic was applied to the GJR GARCH residuals in order to test whether an ARCH effect remained in the model. The autocorrelation test is based on Q test values. The results are given in table 7.