Predicting Budget Revenues of the Republic of Congo: Multiple Linear Regression Approach

Alongside its political, legal and financial dimensions, forecasting remains an essential part of a state's budget planning process. The state budget is an act that provides for and authorizes what the State expects to collect as revenue and what it intends to incur as expenditure for a calendar year, and it may be revised during its execution in response to the current economic situation. It remains strategic for the existence, quality, functioning and organization of the Administration. As an oil-producing country that must, among other things, have the financial means to meet its ambitions, the Republic of Congo does not escape, like other hydrocarbon-exporting countries, the preponderance of revenues from this sector in its budget forecasts. Given the unpredictability of the international oil markets on which government revenues largely depend, artificial intelligence can reveal patterns in large volumes of data and model interdependent systems, producing results that support more efficient decision-making and better value for money. In this article, we use Machine Learning to build a prediction model from secondary data published by international organizations and in official records of the Government of the Republic of Congo, namely the annual price of a barrel of oil and the budget forecasts of state revenue and expenditure entered in the various initial and amending finance laws between 1980 and 2019. The model is based on the multiple linear regression algorithm, which determines the linear relationship between a dependent variable and independent (explanatory) variables. It allows us to predict the value of "state revenues" from the values of "oil prices" and "state expenses". On evaluation, the coefficient of determination (R²) of the model on the test data is 99%. Finally, the model is deployed on a web interface that lets users enter new values of the independent variables and click a button to display the predicted value of the dependent variable.


Introduction
To carry out its tasks, the State must have the means to pursue its policy. In keeping with the sub-regional directives of the Economic and Monetary Community of Central Africa (CEMAC) relating to finance laws (CEMAC, 2011) and to the financial operations of the State (CEMAC, 2008), the Government of the Republic of Congo draws up a medium-term budgetary framework that sets out, on the basis of realistic economic assumptions, the evolution over a minimum period of three years of all expenditures and revenues of the State, including contributions from international donors, the financing requirement or capacity of public administrations, and the overall level of financial indebtedness of the State. Each year, on these same principles, the State budget gives financial expression to the Government's economic, social and environmental development orientations, which are broken down by main category of public expenditure, by nature, by function and by ministry in the National Development Plan (Ministry of Plan, 2018). Each ministry prepares a draft estimate and sends it to the Ministry of Finance for review and finalization. The forecasting of state revenue, on the other hand, remains the exclusive prerogative of the Ministry of Finance (Official Journal, 2017). After observing the financial situation, and on the basis of calculations and statistical estimates (Ministry of Economy, 2019), the experts draw up a forecast of state revenues. The budget thus set out in the finance bill (the other name for the budget), reflecting the Government's economic assumptions in terms of growth, inflation, revenue and the like, is put to a vote in Parliament. Parliament may propose amendments before adopting this initial finance law, which is then signed by the President of the Republic and published in the official journal before the end of the year.
This law can be revised during the fiscal year by an amending finance law, also called a "budget collective", to take into account changes in outlook or unforeseen events, or to change the fiscal policy of the State without waiting for the following year. Finally, once the year has ended, the budget is described as "executed" because the actual expenditure and revenue figures have been recorded (Cameroon, Ministry of Finance).
In the finance laws, budget revenues are classified under four headings: (i) tax revenues, including taxes, duties and other mandatory transfers other than social security contributions; (ii) donations, bequests and contribution funds, including donations from international cooperation; (iii) social contributions, including contributions to pension and social protection funds; and (iv) other income, including income from property, sales of goods and services, fines, etc. Budgetary expenditures are classified under six headings: (i) financial charges on the state debt; (ii) personnel expenses; (iii) expenditure on goods and services; (iv) transfer expenses; (v) capital expenses; and (vi) other expenses. The revenue estimates are voted together for the general budget, the supplementary budgets and the special accounts of the Treasury (CEMAC, 2011) (Official Journal, 2017). The budget, as a management tool, thus gives the State a global view of the sources of its revenues and of their allocation to the various expenses. Although the budget preparation procedure is well designed and clearly defined in the organic texts, the economic situation of the Republic of Congo has deteriorated considerably over the past six years. The unorthodox execution of the state budget, the complexity of administrative procedures, the volatility of oil prices on international markets, arbitrary decisions and poor strategic choices all bear some responsibility for the economic crisis that the Republic of Congo is currently going through. This situation has forced the country to seek loans from international creditors (International Monetary Fund) to support its public finances, whose expenditure increasingly outstrips domestic revenue (Jean, 2016). The accuracy of forecasts is therefore a major problem to be solved in order to improve, at least in part, the work of budget planning.
This field, long the preserve of standard statistical models and methods, has for some years been undergoing a real revolution, like most sectors, with the advent of predictive analysis enabled by artificial intelligence.
In this article, we attempt to predict state revenues using a supervised Machine Learning algorithm. The field of Machine Learning offers many algorithms that meet various requirements, each with its own mathematical and algorithmic specificities. What was still science fiction a few years ago is now reality. We talk with computers, our phones direct us and tell us the shortest path, and our watches know whether we have moved enough during the day. Technology is increasingly intelligent, and scientists, engineers and programmers become teachers: they "train" computers to learn autonomously (Digital Guide). Where a traditional computer program performs a task by following precise instructions, machine learning does not follow instructions but learns from experience. Arthur Samuel defined machine learning as "the set of techniques that allow a machine to learn to perform a task without having to explicitly program for it." (Antoine, 2011). Machine learning algorithms are divided into three broad categories: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning can be divided into two: regression analysis, used to predict numerical values, and classification analysis, a series of techniques used to predict categorical values (Eric Biernat, 2015). Researchers have used Machine Learning algorithms to address comparable issues in many sectors. Some studied the predictive performance of artificial neural networks and nonparametric regression models against the more conventional Box-Jenkins and structural econometric modeling approaches used in economic time series forecasting. Their study showed that the former approaches worked better than the latter in predicting GDP growth in African economies (Chuku et al., 2019).
In other work, researchers have shown the equivalence of the multiple linear regression and thermodynamic fluctuation approaches for calculating thermodynamic derivatives from molecular simulation (Ahmadreza et al., 2020). Others worked on a normal least squares support vector machine (NLS-SVM) and its learning algorithm, showing through simulations on artificial and real data that NLS-SVM outperformed LS-SVM (Xinjun et al., 2009). One study proposed a kernel extreme learning machine model coupled with K-means clustering and firefly algorithms (K-means-FFA-KELM), using several subsets of data, to estimate monthly reference evapotranspiration in the Poyang Lake basin in China (Lifeng et al., 2020). Aware of the impact of Machine Learning on the manufacturing industry, another study uses neural networks to estimate the credibility of decisions driving the growth of Chinese companies in this sector worldwide (Da et al., 2020). Other scholars argue that, to understand in depth the behaviour of waste generation in the city of Bogotá, an evaluation should combine machine learning based on decision trees, support vector machines and recurrent neural networks (Johanna et al., 2019). Finally, researchers using their own algorithm (BNII) sought to mitigate the significant loss of information by using a Bayesian network in intelligent credit rating systems (Qiujun et al., 2020).
ijef.ccsenet.org International Journal of Economics and Finance Vol. 13, No.6
The Republic of Congo experienced nearly a dozen budget collectives in the period studied, three of them between 2014 and 2017. This budgetary instability is sufficient evidence that the models and methods used to draw up and forecast the state budget need to be revised. This work would allow policymakers, administrators and public finance specialists to take account of the paradigm shifts brought by artificial intelligence in the public finance sector. Using a combination of Mixed Data Sampling (MIDAS) regression models, the author of one study finds that an approach based on mixed-frequency data, composed of budget series and macroeconomic indicators, to forecast in real time the annual federal budget, expenditure and revenue of the US yields better results than traditional models.

Methodology
Limited by the number of sources of information, we use the documentary technique to collect secondary data published on the websites of the Government of the Republic of Congo (Ministry of Finance and Budget, 2020) (Official Journal, 2019). Among the documents relating to the state budget, we are interested in the various initial and/or amending finance laws between 1980 and 2019, and specifically in the revenue and expenditure estimates entered under the "general budget" chapter. Given the significant weight of revenues from the hydrocarbon sector in the state budget (EITI Executive Committee, 2017), we add to these data the annual price of a barrel of oil over the same period, also collected online (Statista Research, 2020). Excel is used to sort and extract the useful data from this mass of information.
In machine learning, there are two types of supervised learning methods: classification and regression. In general, regression is a statistical method that estimates relationships between variables. Classification also attempts to find relationships between variables; the main difference between the two is the output of the model (Eric, 2015). In a regression task, the output variable is numeric or continuous, while in a classification task it is categorical or discrete. Linear regression is one of the most commonly used algorithms. There are two types of linear regression analysis: simple and multiple. It is called simple when it involves one dependent variable and one independent variable, and multiple when it involves one dependent variable and two or more independent variables.
The multiple linear regression model can be presented as follows:

$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_m X_{mt} + \varepsilon_t$$

And for the estimated model we write:

$$\hat{Y}_t = \hat{\beta}_0 + \hat{\beta}_1 X_{1t} + \hat{\beta}_2 X_{2t} + \dots + \hat{\beta}_m X_{mt}$$

Our case uses a model with two independent variables:

$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \varepsilon_t$$

With: $Y_t$ is the dependent variable at date $t$, called endogenous; $\hat{Y}_t$ is the predicted value of $Y_t$; $X_{1t}$ and $X_{2t}$ are the independent variables, called exogenous; $\beta_1$ is the coefficient associated with the variable $X_1$; $\varepsilon_t$ is the error term that captures all the shortcomings of the model. The least squares method estimates the relationship between the variables, and how the dependent variable changes as the independent variables change, by minimizing the sum of the squares of the deviations $\sum_t (Y_t - \hat{Y}_t)^2$. In the root mean squared error (RMSE), errors are squared before being averaged, so greater weight is assigned to larger errors.
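As an illustration of the least squares estimation described above, the following minimal sketch fits a model with two independent variables by solving the normal equations in plain Python. The function names (`solve_linear_system`, `fit_ols`, `predict`) and the small dataset are hypothetical and chosen only for illustration; they are not the authors' implementation or data, and in practice a statistics or machine learning library would normally be used.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(x1, x2, y):
    """Return (b0, b1, b2) minimizing sum((y - b0 - b1*x1 - b2*x2) ** 2)."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]  # column of ones = intercept
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


def predict(beta, x1, x2):
    """Evaluate the estimated model for one pair of explanatory values."""
    b0, b1, b2 = beta
    return b0 + b1 * x1 + b2 * x2


# Made-up data generated exactly from y = 2 + 0.5 * x1 + 3 * x2.
beta = fit_ols([10, 20, 30, 40, 50], [5, 3, 8, 1, 7], [22, 21, 41, 25, 48])
print(beta)
```

Because the illustrative data contain no noise, the fitted coefficients recover the generating values up to floating-point rounding; with real budget data the residuals would not vanish.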
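The coefficient of determination ($R^2$) is the percentage of the total variation of the dependent variable Y (state revenue) explained by the model. This coefficient is given by:

$$R^2 = 1 - \frac{\sum_t (Y_t - \hat{Y}_t)^2}{\sum_t (Y_t - \bar{Y})^2}$$

where $\bar{Y}$ is the mean of the observed values. The mean absolute error (MAE) is one of the most commonly used metrics. It gives a linear score in which all individual differences between actual and predicted values are weighted equally.
The mean squared error (MSE) averages the squared individual differences. For these error metrics, the lower the value, the better the performance of the model.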
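The evaluation metrics discussed above can be computed directly from their standard definitions. The sketch below is a plain-Python illustration; the function name `regression_metrics` and the sample values are assumptions made for the example and do not come from the study.

```python
import math


def regression_metrics(actual, predicted):
    """Return R^2, MAE, MSE and RMSE for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in residuals)               # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    mse = ss_res / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in residuals) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }


# Illustrative values only: three exact predictions and one error of 2.
print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 6]))
```

In this example the single large error weighs more heavily in the MSE and RMSE than in the MAE, which is the property of squared-error metrics noted above.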
We use the Python language in Jupyter Notebook, with its libraries, to build our model before deploying it on a web page using Flask in PyCharm. This interface handles the data entered by the user and displays the response returned.
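To make the deployment step more concrete, here is a minimal sketch of how a fitted linear model could be exposed through Flask. The route name `/predict`, the form field names `oil_price` and `expenditure`, and the coefficient values are all hypothetical placeholders; the article does not give the fitted coefficients or the structure of the actual application.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder coefficients, NOT the values estimated in the study.
COEFFICIENTS = {"intercept": 0.0, "oil_price": 1.0, "expenditure": 1.0}


def predict_revenue(oil_price, expenditure):
    """Apply the linear model: b0 + b1 * oil price + b2 * expenditure."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["oil_price"] * oil_price
        + COEFFICIENTS["expenditure"] * expenditure
    )


@app.route("/predict", methods=["POST"])
def predict():
    """Read the two explanatory values from the form and return a prediction."""
    try:
        oil_price = float(request.form.get("oil_price", ""))
        expenditure = float(request.form.get("expenditure", ""))
    except ValueError:
        # Missing or non-numeric input: report a client error.
        return jsonify(error="oil_price and expenditure must be numbers"), 400
    return jsonify(predicted_revenue=predict_revenue(oil_price, expenditure))
```

Assuming the file is saved as `app.py`, the development server can be started with `flask --app app run`, and an HTML form with a "Predict" button can post its two fields to `/predict`.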

Objective
The broad objective of our work is to propose a web interface associated with a multiple linear regression algorithm allowing users to make predictions of state budget revenues.

The specific objectives are:
1) Preparation of the data collection;
2) Preprocessing and statistical exploration of the data;
3) Building the multiple linear regression model;
4) Implementation and testing of the model on a web interface.

Budget
The word comes from the Old French bougette, a small bag serving as a purse. A budget is an estimate of the expenditure and revenue of an organization, a state, a territorial authority, an institution, an association, any financial agent, or an individual. The budget is the translation, expressed in monetary terms, of the various assumptions concerning the environment, activity, operations, investments and action plans envisaged over a given period, usually a year, corresponding to an accounting year (La Toupie). The state budget consists of the general budget, auxiliary budgets and special accounts. Once passed by Parliament, it becomes an initial finance law and presents all the revenues and expenses of the various ministries for a calendar year. By its amount, its orientations and its implications, the state budget constitutes a real political issue. The Initial Finance Law (LFI) sets and authorizes all the resources and expenses of the State before the financial year; the Amending Finance Law (LFR), or "budget collective", modifies the provisions of the initial finance law during the financial year; and finally, the Regulation Law (LR) sets the final amount of revenue and expenditure of the budget, as well as the budgetary result (deficit or surplus). Table 1 below presents the general budget estimates in revenue and expenditure for the 2019 fiscal year of the Republic of Congo.

Revenue
Revenue is an increase in net worth resulting from a transaction. Revenue transactions have counterpart inflows, in the form of either an increase in assets or a decrease in liabilities, which has the effect of increasing net worth (International Monetary Fund, 2014). Revenue can be tax or non-tax. Tax revenue covers all sums paid to the State as taxes. Non-tax revenue covers all proceeds from sources other than taxes.

Expenditure
Expenses are a decrease in net worth resulting from a transaction. They have counterpart entries, in the form of either a decrease in assets or an increase in liabilities, which has the effect of reducing net worth (International Monetary Fund, 2014). Expenses are authorized only by a finance law. They can cover state investments, the repayment of public debt, staff costs, etc.

Dependence of the Budget on the Oil Sector
The current model, based on exploiting oil rent, is no longer viable, not only because of the current crisis but also in view of the major upheavals observed in new information and communication technologies, in transport with electric vehicles, and in green energy and biofuels. Regional and organizational strategies must urgently be implemented both to meet external needs and to correct internal constraints, in terms of the economy (low revenue, high debt, large pro-cyclical expenditure) as well as public administration management (International Bank for reconstruction, 2018).

Artificial Intelligence
Artificial intelligence (AI) research is trying to create machines capable of acting like human beings: indeed, computers and robots are supposed to analyze their environment and thus make the best possible decision. Robots must therefore behave intelligently according to our standards. But this opens up a problem: what criteria should we use to judge our own intelligence? Today, AI cannot simulate the whole human being (especially emotional intelligence). Instead, partial aspects are isolated in order to cope with specific, precise tasks. This is commonly referred to as weak artificial intelligence (weak AI).
As human analytical capabilities become insufficient in the face of the growing volume of data produced by humans and machines, people need to rely on machine learning systems to perform detailed analyses. This new paradigm also makes it possible to discover long-unknown and unsuspected connections between different types of data, either by applying known models or by searching for new models of analysis. It can thus offer predictions that are finer and closer to reality.

Data
We set out to provide users with a web interface associated with a multiple linear regression algorithm to predict state revenues from the overall expenditure projections and the current price of a barrel of oil. To carry out this work, we used historical data for the period 1980-2019, collected from Government websites (Ministry of Finance and Budget, 2020) (Official Journal, 2019) and sub-regional ones (CEMAC). In our dataset, the revenue and expenditure figures for 1991, 1992 and 1993, which were missing or unavailable, were replaced by the figures for 1994.
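The replacement of the missing 1991-1993 figures by those of 1994 can be sketched as a simple backfill. The function name `backfill_missing_years`, the dictionary layout and the amounts below are hypothetical placeholders used only to show the mechanics; they are not the study's data.

```python
def backfill_missing_years(records, missing_years, source_year):
    """Return a copy of records where each missing year reuses source_year's values."""
    if source_year not in records:
        raise KeyError(f"no data for source year {source_year}")
    # Copy every entry so the original dataset is left untouched.
    filled = {year: dict(values) for year, values in records.items()}
    for year in missing_years:
        filled[year] = dict(records[source_year])
    return dict(sorted(filled.items()))


# Placeholder amounts, for illustration only.
records = {
    1990: {"revenue": 100.0, "expenditure": 90.0},
    1994: {"revenue": 120.0, "expenditure": 110.0},
}
filled = backfill_missing_years(records, [1991, 1992, 1993], source_year=1994)
print(list(filled))
```

Repeating one year's values in this way keeps the series complete but adds no new information for those years, which is worth keeping in mind when interpreting the fitted model.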
In general, forecast state revenue and expenditure remained in balance throughout the period under review. As shown in Figure 2, this balance broadly follows the trend of the price of a barrel of oil. This situation exposes the fragility of public finances in the face of international oil markets and reinforces the "rentier economy" label attached to the Republic of Congo, ranked 6th among African countries by volume of oil produced in 2017 (Ecofin, 2018). Table 3 presents the measures of central tendency and dispersion that summarize the characteristics of the data studied.
After building the multiple linear regression model, we trained it on a sample of 80% of the data. It was then tested on the remaining data to produce predictions. Table 4 shows the difference between the actual values and the forecasts made by the model. Superimposing the actual and predicted values in Figure 3 shows that the values predicted by the model stay close to the trend of the actual values. The error rate between actual and forecast values is 86% in 1981, i.e., in the second year of the dataset. In 1988, this rate stood at -37%, before falling to 1.5% in 2014 and finally stabilizing at -0.10% in 2018. In Figure 3, the further we move in time, the smaller the difference between the actual values and those predicted by the model. The model can, however, make comparatively good predictions.
The web interface enhances users' interaction with our model, allowing them to make predictions using new data. Figure 5 presents the home page of the web interface before the user enters the data, and Figure 6 presents it with the data entered as well as the amount of state revenues predicted after a click on the "Predict" button.
Figure 5. User interface home page
Figure 6. Government revenues predicted using new data
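The 80/20 split and the error rate between actual and predicted values described in this section can be sketched as follows. Two points are assumptions, since the article does not state them: the split is done here by random shuffling rather than chronologically, and the error rate is taken as (actual - predicted) / actual, expressed in percent. The function names are hypothetical.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle rows reproducibly and split them into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def percentage_error(actual, predicted):
    """Signed error as a percentage of the actual value (assumed convention)."""
    return (actual - predicted) / actual * 100.0


years = list(range(1980, 2020))  # the 40 annual observations of the period
train_years, test_years = train_test_split(years)
print(len(train_years), len(test_years))
print(percentage_error(200.0, 150.0))
```

With 40 annual observations, an 80% training sample corresponds to 32 years and leaves 8 years for testing. Under the assumed sign convention, a negative rate means the model predicted more than the actual value.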

Conclusion
Today, with ever-growing volumes of data, interdependence between sectors of activity has increased. Assessing the country's economic and development prospects from macroeconomic elements alone, with standard statistical techniques, is no longer sufficient. The continuous improvement of the capabilities of artificial intelligence compels us to consider it in order to improve performance in all sectors. Working from a quantitative dataset, this article, which focuses on forecasting state revenue, chose multiple linear regression to build a forecasting model.
In view of the results obtained, the model should be used with caution because of the limitations of the data collected. Having scored an R² of 99% on the test data, the model is deployed on a web interface to make predictions using new data. Faced with the current situation in the Republic of Congo, in addition to reforms such as those proposed by other researchers (Jean, 2016) to control public spending, break the dependence of the state budget on the hydrocarbon sector (EITI Executive Committee, 2017) and reorient the economy towards other sectors such as agriculture and industry (Ministry of Plan, 2018), our work also emphasizes the need for the State to carry out structural reforms of the budget preparation process that take into account the evolution of the methods and models of analysis offered by Machine Learning. Such an update would allow it to act with greater precision in its decision-making. The evolution of the proposed user interface and a mix of analysis models based on multi-sector and heterogeneous datasets will be the subject of our future research, with the aim of automating the entire process, from data collection to real-time forecasts of government funding capacity and needs.

Recommendations
In view of the situation in this 21st century, it would be wise for the Republic of Congo to adopt new standards taking into account local specificities in order to modernize the management and control framework of its public finances.
Appropriations should be broken down by program and no longer by chapter, as is currently the case, moving from a logic of "means" to a logic of "results" through the application of Results-Based Management (RBM) in order to improve the performance of public management.
In order to publish adequate information on a regular basis, public administrations should be aware of the need to produce statistical data and to strengthen the role of specialized national structures.
The deployment of the state information system should be updated and completed in order to strengthen the mechanism for collecting public revenues, rationalize spending and direct investments towards priority sectors for economic and social development.