Performance Evaluation of ANFIS and GA-ANFIS for Predicting Stock Market Indices

This paper presents an Adaptive Neuro-Fuzzy Inference System (ANFIS) whose parameters are trained with an evolutionary algorithm, the Genetic Algorithm (GA). The model is tested on the NASDAQ stock market index, which is among the most widely followed indices in the United States. Empirical results show that determining the premise and consequent parameters of ANFIS with GA improves performance in terms of Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and coefficient of determination (R-Squared) compared with ANFIS alone.


Introduction
The stock market is among the most comprehensive and important pillars of any country's economy, as it plays a crucial role in the growth of industries and, over time, helps the economic workforce flourish. In the continuing effort to profit from investing in stock market indices, the machine learning, forecasting, and time-series literature has devoted considerable attention to applying particular algorithms to the forecasting of stock price indices. Consequently, over the last decades, academic work on stock market forecasting has gained additional attention. Gourav et al. (2020) discuss the challenges involved in predicting stock market indices. Their arguments generally concern issues resulting from the chaotic nature of stock price time series: the data is non-parametric, volatile, highly noisy, non-linear, complex, and dynamic. Further, Gholamiangonabadi et al. (2014) elaborate on how the behavior of stock markets depends on numerous factors, such as the economic, political, and business situation, unrevealed preferences of institutional investors, other stock markets, and investor expectations. Such factors affect the stock market permanently or temporarily.
For these reasons, among others, the scientific literature relies heavily on various types of intelligent systems to forecast stock indices, on which trading decisions are then based. More accurate forecasts of the stock index would potentially offer investors additional opportunities for profit.
Based upon ongoing research, the adaptive neuro-fuzzy inference system (ANFIS) handles uncertainty in prediction and has been widely applied in various research contexts, including medical systems, image processing, and electrical systems. We chose this methodology because it has generally performed well in modeling complex datasets. In this paper, we first apply ANFIS, seeking to minimize the Mean Squared Error (MSE) and Root Mean Squared Error (RMSE), and report the Coefficient of Determination (R-Squared) for different ANFIS structures.
Although adaptive neuro-fuzzy inference systems have been widely used in different studies, they have a few drawbacks. First, the learning process of ANFIS is gradient-based, so it may become trapped in local minima. Second, when the number of inputs is large, the number of rules and their tunable parameters increases exponentially, which raises the computational complexity (Salleh et al., 2018). Extensive research has been conducted to reduce the complexity of ANFIS and to overcome its drawbacks using different evolutionary algorithms. In this paper, we attempt to improve its performance while avoiding the mentioned drawbacks (i.e., getting trapped in local optima) by determining the parameters of ANFIS using an evolutionary algorithm, the Genetic Algorithm, and we compare the results based on error terms, R-Squared, and running time. Despite the substantial ongoing research on ANFIS methodology and applications, our method and algorithm are novel in this context. The paper is organized as follows. The second section is dedicated to the literature review. The third section presents the methodology, data description, and experiment in detail. The fourth section presents the results and discussion. Finally, the fifth section discusses the conclusion and possibilities for future work.

Literature Review
Few research papers have approached this question in line with our methodology. Salleh and Hussain (2016) review applications of ANFIS (excluding GA), noting that such methods require effective parameter training in order to perform efficiently as the number of inputs increases. In terms of applications, they describe financial applications, namely the prediction of financial crises, bankruptcy, credit risk, currency, stock prices, and gold prices. Similar ANFIS-hybrid methods in the time-series forecasting literature include the autoregressive adaptive network fuzzy inference system (AR-ANFIS) by Sarıca et al. (2016), a hybrid model of ARIMA (Auto Regressive Integrated Moving Average) and ANFIS by Barak and Sadegh (2016), and a hybrid of Quantum-behaved PSO (Particle Swarm Optimization) and ANFIS by Bagheri et al. (2016).
In similar contexts, Chen's (2013) hybrid ANFIS model predicts business failure using PSO and subtractive clustering. Another notable paper on forecasting stock prices is Wei et al. (2014), which applies a hybrid ANFIS-MATI method (moving average technical index model) based on an n-period moving average model to forecast TAIEX stock prices.

Adaptive Neuro-Fuzzy Inference System (ANFIS)
Neural networks are appropriate tools for pattern matching and can have their weights adjusted automatically to optimize their behavior, although they cannot explain how they reach a given decision. Fuzzy logic, on the other hand, fills this gap by making the reasoning behind a decision explicit, yet fuzzy systems are unable to "train" themselves and learn automatically. As a result, hybrid networks that use the two methods together let each complement the other and overcome a number of these drawbacks.
Among them, the Adaptive Neuro-Fuzzy Inference System (ANFIS) is one of the most successful combinations and is broadly applied in various areas of research. ANFIS was introduced by Jang in 1993 (Salleh et al., 2018). It links neural networks and fuzzy logic for modeling complicated and dynamic systems (Aghbashlo et al., 2016). In ANFIS, "IF-THEN" rules generate a mapping between the input and output variables. The architecture of ANFIS is composed of five layers. Each layer contains several simple processing elements called neurons, and every neuron in a layer is connected to another layer by directed links. Each neuron produces its value by applying a particular function to its incoming signals. Figure 1 illustrates the ANFIS model (Salleh et al., 2018). The five layers of the ANFIS model are as follows.
Layer 1 (Fuzzification): Let x and y be the inputs and $O_{1,i}$ be the output of the ith node of layer one, computed as follows:
$$O_{1,i} = \mu_{A_i}(x), \quad i = 1,2; \qquad O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3,4$$
Here $\mu_{A_i}$ and $\mu_{B_i}$ are Gaussian membership functions, each specified by two parameters, the center c and the width σ, and $A_i$ and $B_i$ are the membership values. We use the Gaussian membership function because it achieved better results.
Layer 2 (Rule): The output of Layer 2 is the firing strength $w_i$ of the ith rule, computed as the product of the incoming membership values:
$$O_{2,i} = w_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y), \quad i = 1,2$$
Layer 3 (Normalization): The output of Layer 3 is computed by normalizing the firing strength of a rule from the previous layer, that is, by taking the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1,2$$ (4)
where $\bar{w}_i$ denotes the normalized firing strength of rule i.
Layer 4 (Defuzzification): The output of Layer 4 is calculated as follows:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i), \quad i = 1,2$$
where $p_i$, $q_i$, and $r_i$ are the consequent parameters of node i.
Layer 5 (Overall output): Layer 5 contains only one node, and its output is the sum of all incoming signals:
$$O_5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

Figure 1. Adaptive Neuro-Fuzzy inference system
ANFIS trains its parameters to minimize the error between the actual and predicted output. The procedure is as follows: during the forward pass, the consequent parameters ($p_i$, $q_i$, and $r_i$) are updated by the least squares estimator (LSE); during the backward pass, the premise parameters (c and σ) are updated by gradient descent, as in the training of neural networks.
ANFIS uses three methods, namely grid partitioning, subtractive clustering, and Fuzzy C-Means, to partition the input and output values into rule patches. Grid partitioning is appropriate only for problems with fewer than six input variables. The Fuzzy C-Means algorithm allows one piece of data to belong to two or more clusters (Samat & Salleh, 2016). Considering each method's objective, we applied subtractive clustering. Subtractive clustering is appropriate for problems with more than six input variables, and it reduces computation time by finding the cluster centers from the data itself (Le & Altman, 2011).
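The five layers can be traced in a few lines of code. The following is a minimal sketch of the forward pass only, assuming two inputs, two rules, and a Gaussian membership function of the form exp(-(v - c)^2 / (2σ^2)). That exact form, the function names, and the parameter layout are illustrative assumptions, not the implementation used in this study; the LSE and gradient-descent updates are not shown.

```python
import math


def gaussian_mf(v, c, sigma):
    """Gaussian membership value with center c and width sigma (assumed form)."""
    return math.exp(-((v - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise:    {"A": [(c, sigma), (c, sigma)], "B": [(c, sigma), (c, sigma)]}
    consequent: [(p, q, r), (p, q, r)], one tuple per rule
    Returns the overall output and the normalized firing strengths.
    """
    # Layer 1: fuzzify each input with its membership functions.
    mu_a = [gaussian_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2: firing strength of rule i is the product of its memberships.
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalize the firing strengths (assumes they do not all underflow to 0).
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weight each rule's linear consequent by its normalized strength.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: the single output node sums all incoming signals.
    return sum(rule_out), w_bar


# Hypothetical parameter values, chosen only to exercise the function.
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
output, strengths = anfis_forward(0.5, 0.5, premise, consequent)
print(output, strengths)
```

In a hybrid training loop, the normalized strengths returned here would form the regressors for the least squares estimate of the consequent parameters.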
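To make the idea of finding cluster centers "from the data itself" concrete, here is a simplified sketch of subtractive clustering. It assumes the inputs are already scaled to the unit hypercube, and it replaces the usual accept/reject test with a single stopping ratio; the function name and the default values of `radius`, `squash`, and `stop_ratio` are illustrative assumptions, not the settings used in this study.

```python
import math


def subtractive_clustering(points, radius=0.5, squash=1.5, stop_ratio=0.15):
    """Return cluster centers chosen from the data points themselves.

    Each point receives a potential that grows with the density of its
    neighbors. The point with the highest potential becomes a center, and
    the potential around it is then reduced before the next pick.
    """
    alpha = 4.0 / radius ** 2               # neighborhood used for the potential
    beta = 4.0 / (squash * radius) ** 2     # wider neighborhood used for the reduction

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    potential = [sum(math.exp(-alpha * sqdist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        # Simplified stopping rule: quit once the remaining peak is small.
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Reduce the potential of every point according to its distance to the new center.
        potential = [pot - peak * math.exp(-beta * sqdist(p, points[best]))
                     for p, pot in zip(points, potential)]
    return centers


# Two well-separated groups of hypothetical 2-D points.
data = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05),
        (1.0, 1.0), (0.95, 1.0), (1.0, 0.95)]
print(subtractive_clustering(data))
```

Each returned center would seed one fuzzy rule, so the number of rules follows from the data rather than from a grid over all input variables.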

Genetic Algorithm
Initially developed by John Holland, the Genetic Algorithm (GA) is a metaheuristic inspired by the processes of natural selection, mutation, and crossover. It can be applied to multidimensional problems with a large search space and a great number of variables (Haznedar & Kalinli, 2016). The Genetic Algorithm starts with a finite population of solutions and evolves it from one generation to the next. The basic steps of the Genetic Algorithm follow: 6) Population Selection: by considering their fitness, a new population is selected, replacing some or all of the original population with an identical number of offspring.
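The generational loop described above can be sketched as follows. This is a minimal real-coded genetic algorithm that minimizes a fitness function; the tournament selection, blend crossover, Gaussian mutation, elitism, and all default values are illustrative assumptions, since the operators used in this study are not specified here.

```python
import random


def genetic_minimize(fitness, n_genes, pop_size=30, generations=50,
                     crossover_rate=0.8, mutation_rate=0.1,
                     bounds=(-5.0, 5.0), seed=0):
    """Evolve a population of real-valued vectors toward lower fitness.

    Returns the best individual and the best fitness seen per generation.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    population = [[rng.uniform(lo, hi) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        history.append(fitness(ranked[0]))
        next_pop = [ranked[0][:]]            # elitism: keep the best individual
        while len(next_pop) < pop_size:
            # Selection: the fitter of three random individuals becomes a parent.
            p1 = min(rng.sample(ranked, 3), key=fitness)
            p2 = min(rng.sample(ranked, 3), key=fitness)
            # Crossover: blend the two parents with a random weight.
            if rng.random() < crossover_rate:
                a = rng.random()
                child = [a * g1 + (1.0 - a) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = p1[:]
            # Mutation: perturb some genes and clip them to the bounds.
            for k in range(n_genes):
                if rng.random() < mutation_rate:
                    child[k] = min(hi, max(lo, child[k] + rng.gauss(0.0, 0.3)))
            next_pop.append(child)
        population = next_pop                # offspring replace the old population
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history


def sphere(v):
    """Toy fitness function with its minimum at the origin."""
    return sum(g * g for g in v)


best, history = genetic_minimize(sphere, n_genes=3)
print(history[0], history[-1])
```

Because the best individual is always carried over, the best fitness never gets worse from one generation to the next.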

ANFIS Trained by GA
In this paper, binary and continuous genetic algorithms were implemented to train the parameters of the adaptive neuro-fuzzy inference system. During training, the genetic algorithm builds a binary vector of seven digits (one per input variable), which gives a total of 127 possible non-empty combinations ($2^7 - 1$). This vector defines the inputs used throughout the training process of the model. ANFIS has two types of parameters that require updating: the premise parameters (c, σ) and the consequent parameters ($p_i$, $q_i$, $r_i$), as explained in the methodology. The genetic algorithm is used to update these parameters. To assess the performance of the model, we use MSE and RMSE, which are the values we want to minimize. The MSE is computed as follows:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$ (7)
and the RMSE is its square root, $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$. In the equation above, $y_i$ is the actual output, $\hat{y}_i$ is the predicted output, and n is the number of observations.
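The error measures and the binary input vector can be illustrated with a short sketch. The function names are hypothetical, and R-Squared is computed with the common definition 1 - SS_res / SS_tot, which is an assumption because the definition is not stated here.

```python
import math
from itertools import product


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, and R-Squared for paired actual and predicted values."""
    n = len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    mean_y = sum(y_true) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1.0 - ss_res / ss_tot,   # assumed definition; requires non-constant y_true
    }


def nonempty_masks(n_inputs):
    """Enumerate every binary input-selection vector except the all-zero one."""
    return [m for m in product((0, 1), repeat=n_inputs) if any(m)]


def apply_mask(row, mask):
    """Keep only the inputs whose mask digit is 1."""
    return [value for value, keep in zip(row, mask) if keep]


print(len(nonempty_masks(7)))   # 127
print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

A genetic algorithm would score each candidate mask, together with the ANFIS parameters, by the MSE or RMSE that the resulting model achieves.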

Data Description and Empirical Analysis
We used NASDAQ-100 index data sourced from Yahoo! Finance. The NASDAQ-100 index is arguably the most widely followed index in the United States. More precisely, it is a modified capitalization-weighted index consisting of 103 equity securities issued by 100 of the most significant non-financial companies listed on the NASDAQ stock market (Yahoo! Finance). The data points are of daily frequency and extend from 21 October 2020 through 17 March 2021. We used the short-term historical stock prices (the seven prior days) as the input dataset. Taking the output to be y(t), the stock price at time t, with n being the number of historical data points, the model takes the form:
$$y(t) = f\big(y(t-1), y(t-2), \ldots, y(t-n)\big)$$
In the next step, ANFIS is developed for the dataset, and the best results in terms of R-Squared, MSE, RMSE, running time, and parameter settings are shown in Table 1. In the following step, different architectures of the hybrid genetic algorithm and adaptive neuro-fuzzy inference system (GA-ANFIS) are developed for the dataset, and the best results in terms of R-Squared, MSE, RMSE, running time, and parameter settings are shown in Table 2.
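Building the input dataset from the seven prior days amounts to sliding a window over the price series. The sketch below assumes the prices are already loaded into a plain list in chronological order; the function name and the example values are hypothetical and are not taken from the NASDAQ-100 data.

```python
def make_lagged_dataset(series, n_lags=7):
    """Turn a price series into (inputs, target) pairs.

    Each input row holds the n_lags values before time t, ordered from
    oldest to newest, and the target is the value at time t.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


# Hypothetical closing values, used only to show the shape of the result.
prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
X, y = make_lagged_dataset(prices, n_lags=7)
print(X[0], y[0])   # [1, 2, 3, 4, 5, 6, 7] 8
```

A series of N daily values yields N - 7 training pairs, since the first seven days have no complete history.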

Discussion
Table 3 compares the results of the proposed hybrid GA-ANFIS method (genetic algorithm and adaptive neuro-fuzzy inference system) with those of the standard adaptive neuro-fuzzy inference system (ANFIS) for predicting stock market indices. Based on the results, the proposed method outperforms the standard ANFIS. For example, the proposed model achieves slightly smaller values of Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). It also achieves a better R-Squared than the standard ANFIS. Although GA-ANFIS requires slightly more computation time than ANFIS, it performs better in R-Squared, MSE, and RMSE, which indicates the better quality of the proposed method (GA-ANFIS).