Forecasting Financial Time Series Using Multiple Regression, Multi Layer Perception, Radial Basis Function and Adaptive Neuro Fuzzy Inference System Models: a Comparative Analysis

In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems have been used to develop predictive models for estimating required parameters. More recently, Soft Computing techniques have been used as an alternative to statistical tools. Determining the nature of financial time series data is difficult, expensive and time consuming, and involves complex tests. In this paper, we use two Artificial Neural Network models, Multi Layer Perceptron and Radial Basis Function, together with an Adaptive Neuro Fuzzy Inference System to predict S% (Financial Stress percent) of financial time series data, and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are found to be relatively similar. The Radial Basis Function model exhibits higher performance than Multi Layer Perceptron, Adaptive Neuro Fuzzy Inference System and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Soft Computing also minimizes the potential inconsistency of correlations.


Introduction
Forecasting (Armstrong, 2001; Chen, 2002) and predicting financial time series (Box, 2008; Brockwell et al., 2009; Chatfield, 2000; Fuller et al., 1999) have been topics of active research for the past few decades. Forecasting is a key element of financial and managerial decision making, and is widely used to predict economic and business trends for improved decisions and investments. The inherent challenge lies in the accuracy with which financial market data can be predicted (Altman, 1993). The central requirement for improving prediction accuracy is a good and efficient forecasting technique. The problem was initially handled using different Statistical techniques. However, the emergence of Computational Intelligence techniques such as Artificial Neural Networks (ANN), Fuzzy Sets, Evolutionary Algorithms and Rough Sets (Altun et al., 2004; Benardos et al., 2007; Chaudhuri et al., 2009; Dash et al., 2008; Jang et al., 1995; Kosko, 2008; Simpson, 1990; Zadeh, 1994; Zhang et al., 2005; Zimmermann, 2001) as better performing alternatives to conventional Statistical techniques has paved the way for their increased use in financial time series forecasting. Stock traders have come to rely upon various types of Intelligent Systems to make trading decisions. Several Information Systems have been developed in recent years for modeling expertise, decision support and complicated automation tasks.
Every financial time series is characterized by a unique financial cycle. Despite this apparent uniqueness, from the conditions that lead to boom times to the triggers that result in reversals, historical narratives (Kindleberger, 2005) suggest that most cycles display common features. Boom times are associated with periods of credit expansion and persistent increases in asset prices, often followed by rapid reversals. These commonalities, confirmed by different empirical work (Bordo et al., 2001), suggest that developments in the credit and asset markets of individual countries provide an early warning indicator of vulnerability in the financial system that would be useful in assessing the current situation and in discussions of possible policy actions. In light of this, it is somewhat surprising that empirical work in this area is scarce. Whatever the reasons may be at a general level, the problem in doing this type of analysis for developed countries is compounded by the scarcity of events that would qualify as financial crises resulting in financial stress. The absence of financial crises does not, however, mean that the financial systems of developed countries have not come or cannot come under stress, but it does raise the issue of the best way to proceed. Financial stress is characterized as a situation in which large parts of the financial sector face the prospect of large financial losses. These situations are usually accompanied by an increased degree of perceived risk, i.e., a widening of the distribution of probable losses, and of uncertainty, i.e., decreased confidence in the shape of that distribution.
In this work, we use a set of methodologies that can be used to assess the role of credit and asset prices as early warning indicators of vulnerability in the financial system of countries that have experienced very few or no financial crises over the sample period of interest. A typical example is Canada, which is the basis of the empirical work in this research. Bordo et al. (2001) show that Canada has not experienced twin crises, viz. banking and currency crises, since the beginning of 1883 and has experienced only four currency crises since 1945. These features of the sample preclude a meaningful country level analysis based on binary indicators of crises. Instead, we suggest that in such circumstances one should focus on incidences of financial stress. Here we use the Financial Stress Index (SI), a continuous measure of financial stress developed by Illing and Liu (2006). The measure was originally developed for Canada, but the underlying approach can be applied to any country. In our examination of the role of credit and asset prices in episodes of financial stress we consider both linear and nonlinear models, since the latter may be more suitable for capturing any behavioral asymmetries of financial market participants. The working hypothesis is that movements in credit and asset prices are indicators of the health of the system and its ability to withstand various types of shocks. Since the impact of a shock depends not only on the state of the system but also on the magnitude of the shock, one would expect that, everything else being the same, excessive growth of credit and persistent increases in asset prices reduce the ability of the system to withstand shocks.
The study aims to determine comparative empirical relationships for estimating the Financial Stress percent of time series data by using Multiple Regression (Fuller et al., 1999; Lund et al., 2002), Artificial Neural Network (ANN) models such as Multi Layer Perceptron (MLP) and Radial Basis Functions (RBF) (Kenneth et al., 2001), and the Adaptive Neuro Fuzzy Inference System (ANFIS) (Jang, 1993). A total of 239 time series data samples are tested for determination of Financial Stress percent (S%) (Bordo et al., 2001; Kindleberger, 2005) in terms of four major explanatory variables, viz. Credit Measures (CM), Asset Prices (AP), Macroeconomic Variables (MV) and Foreign Variables (FV), to establish predictive models using Statistical, Machine Learning and Soft Computing techniques (Kosko, 2008; Zadeh, 1994). It is found that the relationships developed here allow CM, AP, MV and FV to be used as a rapid, easy to determine, low cost means of estimating the stress potential with sufficient accuracy to allow for adequate design in situations where financial crises can be prevented. Moreover, the comparison of performance indices and correlation coefficients for predicting Financial Stress percent revealed that the prediction performance of RBF is higher than that of Multiple Regression, MLP and ANFIS.
This paper is organized as follows. In the next section, the experimental framework is highlighted. In section 3, data analysis is presented using Multiple Regression, the ANN models MLP and RBF, and ANFIS. This is followed by the experimental results in section 4. Finally, conclusions are given in section 5.

Experimental Framework
In this study, the data were taken from the extensive studies of financial stress by Illing and Liu (2006). They constructed a weighted average of various indicators of expected loss, risk and uncertainty in the financial sector. The resulting SI is a continuous, broad based measure that includes indicators from equity, bond and foreign exchange markets such as: (a) the spread between yields on bonds issued by Canadian financial institutions and yields on government bonds of comparable duration; (b) the spread between yields on Canadian nonfinancial corporate bonds and government bonds; (c) the inverted term spread; (d) beta derived from the total return index for Canadian financial institutions; (e) Canadian trade weighted dollar GARCH volatility; (f) Canadian stock market GARCH volatility; (g) the difference between Canadian and U.S. government short term borrowing rates; (h) the average bid-ask spread on Canadian Treasury bills; (i) the spread between Canadian commercial paper rates and Treasury bill rates of comparable duration. In constructing SI, Illing and Liu (2006) considered several weighting options that reflect the relative shares of credit for particular sectors in the economy. The resulting index is shown in Figure 1. In the following subsections the financial time series data are analysed using Multiple Regression, the MLP and RBF ANN models, and ANFIS.

Multiple Regression
Multiple Regression is a commonly used statistical technique dating back to 1908 (Fuller et al., 1999). The multiple regression equation takes the form

$$y = b_1 x_1 + b_2 x_2 + \dots + b_n x_n + c$$

Here $b_1, b_2, \dots, b_n$ are regression coefficients representing the amount by which the dependent variable y changes when the corresponding independent variable changes by one unit; c is a constant, the point where the regression line intercepts the y axis, representing the value of the dependent variable y when all independent variables are zero. The standardized versions of the coefficients are beta weights, and the ratio of beta coefficients is the ratio of the relative predictive power of the independent variables. The major drawback of all regression techniques is that only relationships are ascertained; it is difficult to validate the underlying mechanism.
Multiple regression analysis is conducted to correlate the measured Financial Stress percent with four financial indices, viz. Credit Measures (CM), Asset Prices (AP), Macroeconomic Variables (MV) and Foreign Variables (FV), as given in Table 3. The multiple regression equation to predict S% (Equation 1) contains the terms 9.226 × 10^(…), 2.500 × 10^(…), 5.339 × 10^(…) and 6.600 × 10^(…), followed by the term 0.155. The correlation coefficient between measured and predicted values is a good measure for verifying the prediction performance of a model. Figure 2 shows the relationship between measured and predicted values obtained from the multiple regression equation for S%, with a good correlation coefficient. In this work, the indices Variance Accounted For (VAF) and Root Mean Square Error (RMSE), given by Equations 2 and 3 respectively, are computed from the measured and predicted values. These indices were calculated to check the prediction capacity of the predictive model developed (Fuller et al., 1999; Lund et al., 2002).

$$\mathrm{VAF} = \left[1 - \frac{\mathrm{var}(y - y')}{\mathrm{var}(y)}\right] \times 100 \qquad (2)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - y_i'\right)^2} \qquad (3)$$

Here y and y' are the measured and predicted values respectively, and N is the number of samples. The calculated indices are given in Table 4. In the Mean Absolute Percentage Error (MAPE), $A_i$ and $P_i$ denote the actual and predicted values respectively. High prediction performance is indicated by the calculated values of RMSE, VAF and MAPE, as illustrated in Table 4.
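The regression fit and the performance indices described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, the coefficient values are invented for the example, and the function names (`fit_multiple_regression`, `rmse`, `vaf`, `mape`) are hypothetical. MAPE is assumed to take its usual form, the mean of |A_i - P_i| / |A_i| expressed as a percentage.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Least-squares fit of y = b1*x1 + ... + bn*xn + c; returns (b, c)."""
    A = np.column_stack([X, np.ones(len(X))])   # extra column of ones for the constant c
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def rmse(y, y_pred):
    """Root Mean Square Error between measured and predicted values."""
    return float(np.sqrt(np.mean((y - y_pred) ** 2)))

def vaf(y, y_pred):
    """Variance Accounted For, in percent (100 means a perfect model)."""
    return float((1.0 - np.var(y - y_pred) / np.var(y)) * 100.0)

def mape(y, y_pred):
    """Mean Absolute Percentage Error (assumes no measured value is zero)."""
    return float(np.mean(np.abs((y - y_pred) / y)) * 100.0)

# Synthetic stand-in for 239 samples of four explanatory variables (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(239, 4))
true_b = np.array([0.5, -0.2, 0.3, 0.1])        # invented coefficients
y = X @ true_b + 1.0 + rng.normal(0.0, 0.01, size=239)

b, c = fit_multiple_regression(X, y)
y_pred = X @ b + c
print("RMSE:", rmse(y, y_pred), "VAF:", vaf(y, y_pred), "MAPE:", mape(y, y_pred))
```

A perfect model gives RMSE of zero and VAF of 100, which is why these two indices are read together.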

Artificial Neural Networks
ANN is a mathematical model (Kosko, 2008; Zadeh, 1994) inspired by the structure and functional aspects of biological neural systems. An ANN consists of an interconnected group of artificial neurons which process information using a connectionist approach. It is in fact an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are nonlinear statistical data modeling tools. They usually model complicated relationships between inputs and outputs in data to extract patterns or trends that are too complex to be observed by humans or other computational techniques. A trained ANN can be considered an expert in the category of information it has been given to analyze. This expert can then be used to provide projections for new situations of interest. A particular network can be defined by three fundamental components, viz. transfer function, network architecture and learning law (Simpson, 1990). It is essential to define these components in order to solve the problem satisfactorily. Neural networks comprise a large class of different architectures. Consequently, ANN is often used as a direct substitute for auto correlation, multivariable regression, linear regression, trigonometric and other statistical analysis techniques.
In this work, MLP and RBF (Kenneth et al., 2001; Kosko, 2008) are used as two ANN models to estimate the Financial Stress percent of the financial time series data. MLP and RBF are the two most widely used ANNs for classification and regression problems, as they produce appreciable results in pattern classification (Loh et al., 2000). They are robust classifiers with generalization ability for imprecise data. The main difference between MLP and RBF is that the latter is based on a localist type of learning which is responsive to a limited section of the input space. MLP, on the other hand, relies on a more distributed approach. The output of MLP is produced by linear combinations of the outputs of hidden layer nodes, in which every neuron maps a weighted average of the inputs through a sigmoid function. In a one hidden layer RBF network, hidden nodes map distances between input vectors and center vectors to outputs through a nonlinear kernel or radial function.
The entire data set is first normalized and divided into three sets in the proportions of: (a) 60% for training; (b) 20% for testing and (c) 20% for verification. The implementation is performed using MATLAB. The ANNs used are three layer feed-forward networks that consist of: (a) one input layer having 3 neurons; (b) one hidden layer consisting of 2 neurons for MLP and 16 neurons for RBF and (c) one output layer with one neuron, as represented schematically in Figure 3. The optimum number of neurons in the hidden layer was decided after a series of trial runs, by selecting the networks having minimum error. In this process the network parameters of learning rate and momentum were set at 0.01 and 0.10 respectively. Variable learning rate with momentum, with trainlm as the network training function and tansig as the activation or transfer function, is used for all layers here.
Figure 3. MLP and RBF ANN used in this work
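The normalization and the 60/20/20 partition described above can be sketched as follows. The paper does not state which normalization was used or whether the split was random or chronological, so min-max scaling and a seeded random permutation are assumptions here, and the function names are hypothetical.

```python
import numpy as np

def min_max_normalize(X):
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # avoid division by zero
    return (X - lo) / span

def split_60_20_20(n_samples, seed=0):
    """Return index arrays for training (60%), testing (20%) and verification (rest)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(round(0.6 * n_samples))
    n_test = int(round(0.2 * n_samples))
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]

# Synthetic stand-in for the 239 samples with three inputs (assumption).
rng = np.random.default_rng(1)
X_raw = rng.normal(5.0, 2.0, size=(239, 3))
X_norm = min_max_normalize(X_raw)
train_idx, test_idx, verify_idx = split_60_20_20(len(X_norm))
print(len(train_idx), len(test_idx), len(verify_idx))
```

For a time series, a chronological split (first 60% for training, later samples for testing and verification) may be preferable to a random one, since it avoids evaluating on samples that precede the training data.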

Multi Layer Perceptron
MLP is an ANN with feed forward topology (Kenneth et al., 2001; Kosko, 2008) that maps a set of weighted sums of input data and bias terms into a set of desired outputs. MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron with a nonlinear activation function. MLP utilizes a supervised learning technique called back propagation for training the network. MLP is a modification of the standard linear perceptron that can distinguish data that are not linearly separable. MLP networks consist of an input layer, one or more hidden layers and an output layer. Each layer has a number of processing units, and each unit is fully interconnected with weighted connections to the units in the subsequent layer. MLP transforms n inputs to l outputs through some nonlinear functions. The output of the network is determined by the activation of the units in the output layer as follows:

$$x_n = f\left(\sum_{p} x_p \, \tilde{w}_{pn}\right)$$

Here f() is the activation function; $x_p$ is the activation of the p-th hidden layer node and $\tilde{w}_{pn}$ is the interconnection weight between the p-th hidden layer node and the n-th output layer node, which is modeled using a fuzzy membership function (Zadeh, 1994; Zimmermann, 2001), referred to below as Equation (6). The activation level of the nodes in the hidden layer is determined in a similar fashion. Based on the differences between the calculated output $x_n$ and the target value $v_n$, a squared error function is defined over the M patterns in the data set and the N output nodes. The objective is to reduce this error by adjusting the interconnections between layers. The weights are adjusted using the gradient descent Back Propagation (BP) algorithm. The algorithm requires training data consisting of a set of corresponding input and target pattern values $v_n$. During training, MLP starts with a random set of initial weights distributed through the fuzzy membership function given in Equation (6). The process is continued until the set of values $\tilde{w}_{ip}$ and $\tilde{w}_{pn}$ is optimized so that a predefined error threshold is met between $x_n$ and $v_n$ (Chaudhuri et al., 2009; Cohen et al., 2003; Lee, 1990). Each interconnection between the nodes is adjusted by the weight update value according to the BP algorithm, which uses the node error terms and the derivative of the activation function; here the bipolar sigmoid activation function is used. The cross correlation between predicted and observed values indicates that MLP is highly favorable for prediction of S%, as shown in Figure 4. The RMSE, VAF, MAPE and R values are presented in Table 4.
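The forward pass and the gradient descent back propagation update described above can be sketched for a 3-2-1 network as follows. This is an illustrative sketch under stated assumptions: it uses plain gradient descent with momentum (learning rate 0.01, momentum 0.10, as quoted in the text) rather than MATLAB's `trainlm`, it uses ordinary random initial weights rather than the fuzzy membership function of the paper, and the data and function names are hypothetical. The `tanh` activation plays the role of the bipolar sigmoid.

```python
import numpy as np

def train_mlp(X, y, n_hidden=2, lr=0.01, momentum=0.10, epochs=2000, seed=0):
    """Train a one-hidden-layer MLP with tanh units by gradient descent with momentum."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    target = y.reshape(-1, 1)
    history = []
    for _ in range(epochs):
        # Forward pass: hidden activations, then output activation.
        h = np.tanh(X @ W1 + b1)
        out = np.tanh(h @ W2 + b2)
        err = out - target
        history.append(0.5 * float(np.mean(err ** 2)))
        # Backward pass: the derivative of tanh(z) is 1 - tanh(z)^2.
        d_out = err * (1.0 - out ** 2) / len(X)
        d_hid = (d_out @ W2.T) * (1.0 - h ** 2)
        grads = [X.T @ d_hid, d_hid.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        # Momentum update, applied in place to every weight array.
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v
    return params, history

def mlp_predict(X, params):
    W1, b1, W2, b2 = params
    return np.tanh(np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Synthetic training set of 143 samples with three normalized inputs (assumption).
rng = np.random.default_rng(1)
X_demo = rng.uniform(-1.0, 1.0, size=(143, 3))
y_demo = np.tanh(0.8 * X_demo[:, 0] - 0.5 * X_demo[:, 1] + 0.3 * X_demo[:, 2])
params, history = train_mlp(X_demo, y_demo)
print("initial error:", history[0], "final error:", history[-1])
```

In practice training would stop once a predefined error threshold is met on held-out data rather than after a fixed number of epochs.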

Radial Basis Function
RBF networks, based on supervised learning (Cohen et al., 2002; Looney, 2002), emerged as a variant of ANN in the late 1980s. Their roots lie in much older Pattern Recognition techniques such as potential functions, clustering, functional approximation and spline interpolation. They act as a good alternative to MLP. RBF networks model nonlinear data effectively and can be trained in one stage rather than by an iterative process (Pao, 1994). RBF is similar in structure to MLP. It has a hidden layer which contains nodes called RBF units. Each RBF has two key parameters that describe the location of the function's center and its deviation or width. The hidden unit measures the distance between an input data vector and the center of its RBF. The RBF has its peak when the distance between its center and the input data vector is zero, and declines gradually as this distance increases. There is a single hidden layer in an RBF network and there are only two sets of weights, one connecting the hidden layer to the input layer and the other connecting the hidden layer to the output layer. The weights connecting to the input layer contain the parameters of the basis functions. The weights connecting the hidden layer to the output layer are used to form linear combinations of the activations of the basis functions or hidden units to generate the network outputs. Since the hidden units are nonlinear, the outputs of the hidden layer may be combined linearly, and so processing is fast.
The RBF used here is a variant known as radial basis functional link net, functional link net or radial basis function ANN (Cohen et al., 2002; Jang, 1995; Kenneth, 2001; Looney, 2002). It is more general than RBF because it consists of both non-linear and linear links. The RBF shown in Figure 3 has an input layer of N nodes, a hidden layer of M units and an output layer of P units. The M units represent the RBF network architecture and are connected to the output units by the weight vector $\tilde{w}_{mp}$, which is modeled using the following gaussian fuzzy membership function (Zadeh, 1994; Zimmermann, 2001):

$$\phi_m(x) = \exp\left(-\frac{\lVert x - v_m \rVert^2}{2\sigma_m^2}\right)$$

Here $x$ and $v_m$ are the input and the center of the RBF unit respectively; $\sigma_m$ is the spread of the gaussian basis function. The total output of one of the output units of the RBF network combines the hidden unit activations, weighted by $\tilde{w}_{mp}$, with the inputs passed through the linear links, where $v_m$ is the centre of a specific RBF, $\sigma_m$ is the spread parameter and $\tilde{w}_{np}$ is the weight vector from the input layer to the output layer. The gaussian function centers $v_m$ are initialized using the fuzzy c-means clustering algorithm (Zadeh, 1994). The weights are updated by steepest descent through fuzzy membership functions (Zadeh, 1994; Zimmermann, 2001), with $\eta_1$ and $\eta_2$ as the learning rates. As is evident from Table 4 and from the cross correlation between predicted and observed values in Figure 5, the RBF model is highly acceptable for prediction of S%.
Figure 5. Cross correlation of predicted and observed values of S% for RBF
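The behavior of the gaussian RBF units and the one-stage training of the output weights can be sketched as follows. This is a simplified stand-in and not the network of the paper: centers are drawn at random from the training inputs instead of by fuzzy c-means, a single fixed spread is shared by all units, the linear input-to-output links are omitted, and the output weights are found by ordinary least squares. The data and all function names are hypothetical.

```python
import numpy as np

def rbf_design_matrix(X, centers, spread):
    """Gaussian activation of every RBF unit for every input vector."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # squared distances
    return np.exp(-d2 / (2.0 * spread ** 2))

def fit_rbf(X, y, n_units=16, spread=0.5, seed=0):
    """Pick centers from the data and solve for output weights in one stage."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_units, replace=False)]
    Phi = np.column_stack([rbf_design_matrix(X, centers, spread), np.ones(len(X))])
    weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, weights

def predict_rbf(X, centers, weights, spread=0.5):
    Phi = np.column_stack([rbf_design_matrix(X, centers, spread), np.ones(len(X))])
    return Phi @ weights

# Synthetic nonlinear target on three normalized inputs (assumption).
rng = np.random.default_rng(2)
X_demo = rng.uniform(0.0, 1.0, size=(143, 3))
y_demo = np.sin(2.0 * X_demo[:, 0]) + X_demo[:, 1] * X_demo[:, 2]
centers, weights = fit_rbf(X_demo, y_demo)
fit_rmse = float(np.sqrt(np.mean((y_demo - predict_rbf(X_demo, centers, weights)) ** 2)))
print("training RMSE:", fit_rmse)
```

Each unit responds most strongly to inputs near its own center, which is the localist behavior that distinguishes RBF from the more distributed MLP.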

Adaptive Neuro Fuzzy Inference System
Jang (1993) proposed the Adaptive Neuro Fuzzy Inference System (ANFIS), in which the learning capabilities of an ANN and the reasoning capabilities of Fuzzy Logic are combined in order to give enhanced prediction capabilities. The objective of ANFIS is to find a model that will correctly map input values to target values. ANFIS can serve as a basis for constructing a set of fuzzy if-then rules with appropriate membership functions to generate the stipulated input output pairs. The membership functions are tuned to the input output data so that good results are obtained. ANFIS thus takes an initial Fuzzy Inference System (FIS) and tunes it with the Back Propagation algorithm based on a collection of input output data. The FIS acts as a knowledge representation system in which each fuzzy rule describes the local behavior of the system. The basic structure of a FIS consists of three conceptual components, viz. (i) a rule base containing a selection of fuzzy rules; (ii) a database defining the membership functions used in the fuzzy rules and (iii) a reasoning mechanism which performs the inference procedure upon the rules and given facts to derive a reasonable output. The fuzzy component of ANFIS takes care of the inherent vagueness and imprecision present in real life data. The bell membership function is used in ANFIS (Zadeh, 1994; Zimmermann, 2001):

$$\mu(x) = \frac{1}{1 + \left|\dfrac{x - c}{a}\right|^{2b}}$$

where the parameter set {a, b, c} is adjusted during the learning phase to minimize the error between the target output values and the calculated ones (Lee, 1990). The ANFIS model used here is a multilayer ANN based Fuzzy system. Its topology is represented in Figure 6, and the system has a total of five layers, viz. fuzzification layer, product layer, normalized layer, defuzzification layer and total output layer.
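The bell membership function can be illustrated with a short sketch. It assumes the usual generalized bell form with parameter set {a, b, c}, where c is the center, a controls the width and b the steepness of the shoulders; the function name and the parameter values are hypothetical.

```python
import numpy as np

def bell_mf(x, a, b, c):
    """Generalized bell membership function 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

# Illustrative parameters (assumption): width a = 2, steepness b = 3, center c = 5.
x = np.linspace(0.0, 10.0, 11)
membership = bell_mf(x, a=2.0, b=3.0, c=5.0)
print(membership)
```

The membership equals 1 at the center c and falls to 0.5 at a distance a from it, so tuning {a, b, c} moves, widens and sharpens the fuzzy set.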
In the directed graph structure, input and output nodes represent training values and predicted values respectively. In the hidden layers, there are nodes which take care of the membership functions and the corresponding rules. This architecture eliminates the disadvantage of a normal feed forward multilayer network, in which it is difficult for an observer to understand or modify the network. We assume that the FIS has two inputs x and y and one output. For a first order Sugeno Fuzzy model, a common rule set with two fuzzy if-then rules is defined as follows:

Rule 1: If x is $A_1$ and y is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$
Rule 2: If x is $A_2$ and y is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$

Here $p_1, q_1, r_1, p_2, q_2, r_2$ are linear parameters and $A_1, B_1, A_2, B_2$ are non-linear parameters. In ANFIS, layer 1 is the fuzzification layer, in which x and y are the inputs of nodes $A_1, A_2$ and $B_1, B_2$ respectively. $A_1, B_1, A_2, B_2$ are linguistic labels used in Fuzzy Set Theory for dividing the membership functions. The membership relationship between the output and input functions of this layer is expressed as follows:

$$O_{1,i} = \mu_{A_i}(x), \qquad O_{1,j} = \mu_{B_j}(y), \qquad i, j = 1, 2$$

The product, normalized, defuzzification and total output layers then combine these memberships into the overall output, where $O_5$ denotes the output of layer 5.
ANFIS uses a hybrid learning algorithm which is a combination of gradient descent and the least squares method. In the forward pass of the hybrid learning algorithm, node outputs go forward until layer 4 and the consequent parameters are identified by the least squares method (Jang, 1993). In the backward pass, error signals propagate backwards and the premise parameters are updated by gradient descent. The consequent parameters are optimized under the condition that the premise parameters are fixed. The major advantage of the hybrid approach is that it converges much faster, since it reduces the search space dimensions of the original back propagation method. The overall output is expressed as a linear combination of the consequent parameters. The error measure used to train ANFIS is defined as follows (Jang, 1993):

$$E = \sum_{i=1}^{n} \left(f_i - f_i'\right)^2$$

Here $f_i$ and $f_i'$ are the i-th desired and estimated output respectively, and n is the total number of input output pairs of data in the training set. In this work ANFIS is trained with the help of MATLAB. The RMSE and statistical calculations were performed using Excel. The different parameters and their values used in ANFIS are presented in Table 5. According to the RMSE, VAF, MAPE and R² values in Table 4 and the cross correlation between predicted and observed values in Figure 7, the ANFIS constructed to predict S% has a high prediction performance.
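The five layers and the least squares half of the hybrid learning algorithm can be sketched for the two-input, two-rule first order Sugeno model. This is an illustration under stated assumptions and not the MATLAB model of the paper: the premise parameters are fixed at hypothetical values, only the forward pass that identifies the consequent parameters is shown (the gradient descent update of the premise parameters is omitted), and all names and data are invented for the example.

```python
import numpy as np

def bell_mf(x, a, b, c):
    """Generalized bell membership function."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

def anfis_forward(x, y, premise, consequent):
    """Forward pass; returns the layer 5 output and the normalized firing strengths."""
    mu_a = np.stack([bell_mf(x, *premise["A"][i]) for i in range(2)])  # layer 1
    mu_b = np.stack([bell_mf(y, *premise["B"][i]) for i in range(2)])  # layer 1
    w = mu_a * mu_b                                                    # layer 2: product
    w_bar = w / w.sum(axis=0)                                          # layer 3: normalize
    f = consequent[:, 0:1] * x + consequent[:, 1:2] * y + consequent[:, 2:3]
    return (w_bar * f).sum(axis=0), w_bar                              # layers 4 and 5

def fit_consequent(x, y, target, premise):
    """Least squares estimate of (p_i, q_i, r_i) with the premise parameters fixed."""
    _, w_bar = anfis_forward(x, y, premise, np.zeros((2, 3)))
    ones = np.ones_like(x)
    A = np.column_stack([w_bar[i] * v for i in range(2) for v in (x, y, ones)])
    theta, *_ = np.linalg.lstsq(A, target, rcond=None)
    return theta.reshape(2, 3)

# Hypothetical premise parameters (a, b, c) for A1, A2 and B1, B2.
premise = {"A": [(0.3, 2.0, 0.2), (0.3, 2.0, 0.8)],
           "B": [(0.4, 2.0, 0.1), (0.4, 2.0, 0.9)]}
true_consequent = np.array([[1.0, -2.0, 0.5], [0.3, 0.7, -1.0]])  # invented values

rng = np.random.default_rng(3)
x_in = rng.uniform(0.0, 1.0, size=143)
y_in = rng.uniform(0.0, 1.0, size=143)
target, _ = anfis_forward(x_in, y_in, premise, true_consequent)
estimated = fit_consequent(x_in, y_in, target, premise)
prediction, _ = anfis_forward(x_in, y_in, premise, estimated)
print("squared error E:", float(np.sum((target - prediction) ** 2)))
```

Because the overall output is linear in the consequent parameters once the premise parameters are fixed, this step reduces to a single linear least squares problem, which is the source of the faster convergence of the hybrid approach.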

Experimental Results
This section presents the results obtained for prediction of the Financial Stress percent of financial time series data using the Multiple Regression, MLP, RBF and ANFIS models. A comparative analysis of the different methods is also given. According to the results of simple regression analyses, there are statistically meaningful relationships between Financial Stress percent and Credit Measures, Asset Prices, Macroeconomic Variables and Foreign Variables. The Multiple Regression, MLP, RBF and ANFIS models for prediction of Financial Stress percent were then constructed using three inputs and one output. Based on the experiments performed, we have the following results:
(a) The result obtained from prediction of Financial Stress percent showed that Multiple Regression has high prediction performance.
(b) ANFIS for prediction of Financial Stress percent revealed a more reliable prediction when compared with Multiple Regression.
(c) In order to predict Financial Stress percent, RBF having three inputs and one output was applied successfully and exhibited more reliable predictions than Multiple Regression and ANFIS.
From the comparison of the VAF, RMSE and MAPE indices and R² for predicting Financial Stress percent, it was found that the prediction performance of RBF is higher than that of MLP, ANFIS and Multiple Regression. In order to show the deviations from the observed values of Financial Stress percent, the distances of the values predicted by the Multiple Regression, MLP, RBF and ANFIS models from the observed values were also calculated, and are represented schematically in Figure 8. Figure 8 indicates that the deviation interval of the values predicted by RBF (-1.239 to +1.302) is smaller than the deviation intervals of MLP (-1.536 to +2.116), ANFIS (-2.136 to +2.106) and Multiple Regression (-2.721 to +1.752).
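The comparison of deviation intervals can be made concrete by computing the width of each interval reported above. In the sketch below the interval endpoints are the values quoted in the text; the helper `deviation_interval` is hypothetical, and its sign convention (predicted minus observed) is an assumption, since the text does not state it.

```python
import numpy as np

def deviation_interval(observed, predicted):
    """Smallest and largest deviation of the predictions from the observations."""
    d = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(d.min()), float(d.max())

# Deviation intervals as reported in the text for each model.
reported = {
    "RBF": (-1.239, 1.302),
    "MLP": (-1.536, 2.116),
    "ANFIS": (-2.136, 2.106),
    "Multiple Regression": (-2.721, 1.752),
}
widths = {name: round(hi - lo, 3) for name, (lo, hi) in reported.items()}
ranking = sorted(widths, key=widths.get)   # narrowest interval first
print(widths)
print(ranking)
```

The widths are 2.541 for RBF, 3.652 for MLP, 4.242 for ANFIS and 4.473 for Multiple Regression, so the ordering by interval width matches the ordering by performance indices.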

Figure 2. Cross correlation of predicted and observed values of S % for Multiple Regression

Figure 4. Cross correlation of predicted and observed values of S% for MLP

Figure 6. Five layered ANFIS architecture

Parameter b is usually positive. The desired bell membership function is obtained by a proper selection of the parameter set {a, b, c}. During the learning phase these parameters change continuously in order to minimize the error function between the target output values and the calculated ones (Lee, 1990).

Figure 8. Variation of values predicted by Multiple Regression, MLP, RBF and ANFIS from observed values

Table 2. Predictive models for assessing S%

... is most effective in correctly signaling events that are widely associated with high financial stress.

... due to Pearson. It is employed to predict the variance in an interval dependent variable, based on linear combinations of interval, dichotomous or dummy independent variables. However, real life data are subject to different distortions, and therefore it is not possible to produce totally accurate predictions. Keeping this in view, multiple regression allows the identification of a set of predictor variables which together provide a useful estimate of the likely value of a criterion variable. The multiple regression equation takes the form . . .

Table 3. Summary of Multiple Regression for predicting S%

The prediction model is excellent if RMSE is zero and VAF is 100.

Table 4. Performance indices such as RMSE, VAF, MAPE and R² for different models

Another measure of the accuracy of fitted series values in statistics, viz. the Mean Absolute Percentage Error (MAPE), is also used for comparison of the prediction performances of the models. MAPE, which usually expresses accuracy as a percentage, is given by the following equation:

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{A_i - P_i}{A_i}\right| \times 100$$

... and standard deviation widths $\sigma_m$ ...

The output of the product layer (layer 2) is the product of the input signals, which is defined as follows:

$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \qquad i = 1, 2$$

Here, $O_{2,i}$ denotes the output of layer 2. The 3rd layer is the normalized layer, whose nodes are labeled N. Its function is to normalize the weight function as follows:

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \qquad i = 1, 2$$

where $O_{3,i}$ denotes the output of layer 3. Layer 4 is the defuzzification layer, whose nodes are adaptive. The output is $\bar{w}_i (p_i x + q_i y + r_i)$, where the linear parameters $p_i, q_i, r_i$ are also called the consequent parameters of the node. The defuzzification relationship between the input and output of this layer is defined as follows:

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right), \qquad i = 1, 2$$

Here, $O_{4,i}$ denotes the output of layer 4. The 5th layer is the total output layer, whose node is labeled Σ. The output of this layer is the total of the input signals, which represents the result for Financial Stress percent, and is given as follows:

$$O_5 = \sum_{i} \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

Table 5. Different parameter values for training ANFIS

Figure 7. Cross correlation of predicted and observed values of S% for ANFIS