Prediction of Evaporation from Shallow Water Table Using Regression and Artificial Neural Networks

The relation between water table depth and evaporation rate from bare soil is of great importance in arid and semi-arid areas. In such areas, over-irrigation often brings the water table very close to the ground surface, which leads to salinization of the soil. In this study a physical water table model was used to estimate the evaporation rate in sandy loam, loam and clay loam soils under greenhouse conditions for water table depths of 40, 60 and 80 cm. Evaporation from bare soil, evaporation from a free water surface, soil surface moisture (measured by TDR) and maximum and minimum daily temperature were measured for 74 days. Several nonlinear models were then developed with the aid of the Gamma Test (GT) to simulate evaporation from the soil, including local linear regression and neural networks trained with two-layer back propagation, conjugate gradient descent and BFGS algorithms. Finally, the models were evaluated using the root mean square error, the mean absolute error and the determination coefficient. The results showed a suitable correlation between the predicted values and the measured test values.


Introduction
In arid and semi-arid regions, evaporation from bare soil is an important component of the water budget. Where the groundwater table is shallow, a considerable amount of water is also lost through capillary rise. Evaporation from the soil surface results in a gradual accumulation of salt in the upper soil profile. Thus, evaporation is not only responsible for water loss but is also a major cause of soil salinization (Gardner, 1958; Konukcu, 1997; Hillel, 1998; Zarei et al., 2010).
Knowledge of the relationship between water table depth and the evaporation rate from bare soil is of great interest in arid and semi-arid areas. Owing to over-irrigation, the water table in these regions is often very close to the ground surface and leads to salinization, e.g. in playa areas and on the fringes of rivers and lakes (Gardner, 1958; Rose et al., 2005; Gowing et al., 2006). In arid and semi-arid regions, the evaporative demand is usually greater than the ability of the soil to conduct water in the liquid phase, and a liquid-vapor phase discontinuity, known as the evaporation front (EF), occurs at an intermediate depth between the soil surface and the water table (Menenti, 1984; Bastiaanssen et al., 1989; Asghar, 1996; Gowing & Asghar, 1996; Rose et al., 2005).
In addition, evaporation from the surface of bare soils is a crucial subject because it allows evaporation and transpiration to be separated from their sum. Hillel (1998) reported that soil surface evaporation is an energy-consuming (endothermic) process that constitutes 10 to 60% of total evaporation and significantly reduces plant transpiration.
Evaporation is a complex and nonlinear process because it depends on several environmental factors that also affect each other. Building a mathematical model of evaporation that accounts for all of these factors is therefore difficult: such a model is either subject to many errors or requires a large amount of information that is difficult and time-consuming to collect (Jain et al., 1999). Because the evaporation process is strongly nonlinear, researchers should pay close attention to the accuracy of the methods used to model it (Lindsey & Farnsworth, 1997; Xu & Singh, 1998; Bruton et al., 2000).
Even when the equations that govern a system are unknown, the Gamma Test can be used as a tool for modelling the behavior of the phenomenon from measured data (Jones et al., 2002).
In the current study a nonlinear data analysis technique called the Gamma Test was used. The Gamma Test examines the relationship between input and output data sets. Given a set of input and output observations, the Gamma Test provides a quick estimate of the best achievable mean squared error (Agalbjörn et al., 1997; Jones et al., 2002).
The test was first briefly presented by Agalbjörn et al. (1997) and Stefansson et al. (1997), and the details of the method were presented from 1998 onwards by various researchers (Chuzhanova et al., 1998; Tsui, 1999; Durrant, 2001; Tsui et al., 2002; Jones et al., 2002). Recent studies have also applied the Gamma Test to field assessment of evaporation and transpiration. Using the Gamma Test, Remesan et al. (2008) identified the factors that affect solar radiation in the United Kingdom. Moghaddamnia et al. (2008) used the methodology to model regional evaporation from the half wells in Sistan and Baloochestan, and Piri et al. (2010) used it to model evaporation in a hot, dry area (Sistan and Baloochestan, Iran).
The main objective of this paper is to assess the efficiency of nonlinear methods, namely local linear regression and neural networks, in estimating evaporation from a shallow water table with the aid of the Gamma Test. In addition, the ability of the Gamma Test to support the construction of nonlinear models is examined on the basis of the root mean square error.

Materials and Methodologies
To measure the evaporation rate for different soil types at various water table depths, three soil textures (loam, sandy loam and clay loam) were used in the study. Their physical properties are presented in Table 1. The experiment was conducted in a greenhouse and lasted 74 days. First, the soils were passed through a 2 mm sieve and then poured into the experimental columns using a soil funnel. The columns were made from 200 mm PVC tubes with lengths of 400, 600 and 800 mm.
To stabilize the water table at different depths, inverted bottles were placed next to each column and connected to it through a pipe, so that water entered the column from the bottom. Following the principle of communicating (U-shaped) tubes, the water table then stabilizes at a fixed depth.
The water table was stabilized at depths of 40, 60 and 80 cm below the soil surface, and the experiment was repeated twice. Before the soil was added, a 2 cm layer of gravel with a diameter of 4 mm was placed in each column to facilitate the entry of water from the bottles into the soil column.
The soil columns were saturated from below. To prevent evaporation during the saturation process, the tops of the tubes were covered with plastic sheets so that saturation could be completed. After saturation was complete, the covers were removed, and the daily evaporation rate was read as the volume of water added to the bottles to keep the water table stable. The soil surface moisture was measured using TDR, and the maximum and minimum temperatures were also recorded daily.
To measure the daily evaporation from a free water surface, two similar closed-end tubes 60 cm in height were placed beside the main tubes. The daily evaporation rate was measured as the volume of water added each day to refill the tubes.
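The daily readings described above are volumes of added water, while evaporation is reported as a depth in mm/day. A minimal sketch of that conversion follows. It assumes that the 200 mm figure is the inner diameter of the column and that the added volume is recorded in millilitres; the function name and both assumptions are illustrative and are not taken from the study.

```python
import math


def evaporation_depth_mm(volume_ml, diameter_mm=200.0):
    """Convert a daily refill volume (mL) into an evaporation depth (mm).

    Assumes a circular column whose inner diameter is `diameter_mm`
    (200 mm is an assumption about the PVC tubes, not a stated fact).
    """
    if volume_ml < 0 or diameter_mm <= 0:
        raise ValueError("volume must be >= 0 and diameter must be > 0")
    area_mm2 = math.pi * (diameter_mm / 2.0) ** 2  # evaporating surface area
    volume_mm3 = volume_ml * 1000.0                # 1 mL = 1000 mm^3
    return volume_mm3 / area_mm2


# Example: depth equivalent of 150 mL added to one column in one day.
daily_depth = evaporation_depth_mm(150.0)
```

Because the same conversion applies to the free-surface tubes, the two series can be compared directly once both are expressed as depths.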

False Water Table Model Based on the Uniform Flow Equations
To estimate evaporation from the soil surface, especially from bare soil, Gowing et al. (2006) proposed a false water table model based on the uniform vertical flow equations of Gardner (1958). Unlike the Gardner model, which considers only liquid-phase flow in the soil, this model considers both liquid and vapor flow. When evaporation takes place from the soil surface, the region between the water table and the soil surface passes through three main stages on the way to equilibrium: a) there is no evaporation front in the soil (the evaporation front is the boundary between the soil surface and the water table below which water moves as liquid and above which it moves as vapor); b) the evaporation front begins to move downward into the soil; c) the evaporation front remains at a constant depth. These stages are shown in Figure 1. In the first stage there is only surface evaporation, and its amount is calculated from the modified Penman equation, Eq. (1), whose terms are: $E$, evaporation from the soil surface (mm/day); $R_h$, relative humidity of the soil water, which under saline conditions has two components, the osmotic potential $h_o$ (cm) and the matric potential (cm); $\Delta$, the slope of the vapor pressure-temperature curve (kPa K$^{-1}$), equal to $de/dT$, where $e$ is the actual vapor pressure (kPa) and $T$ is temperature (K); $R_n$, net radiation (W m$^{-2}$); $G$, soil heat flux (W m$^{-2}$); $\lambda$, latent heat of vaporization (J kg$^{-1}$); $\gamma$, the psychrometric constant (kPa K$^{-1}$); $E_a$, the evaporative power of the atmosphere (mm d$^{-1}$); $u$, the average wind speed at a height of 2 m (m/s); and $e_{sat}$, the saturated vapor pressure (kPa). Evaporation at this stage is approximately equal to evaporation from the free water surface and, depending on soil texture and water table depth, lasts from one to three days.
During the first stage, as time progresses, the soil profile loses an amount of water $\Delta\theta$, which is calculated following Gowing & Asghar (1996) from $q_l$, the vertical flow from the water table. In the first time period ($\Delta t_1$), the moisture content of the surface soil is equal to the initial soil moisture ($\theta_i$), and the surface moisture at the beginning of the second period ($\theta_{t_2}$) is then calculated, with $z$ denoting the depth of the evaporation front. This first stage lasted about one to three days, depending on soil texture.
In the second stage of the evaporation process there are two flows: liquid flow below the evaporation front and vapor flow above it. Under these conditions the evaporative power of the air is high, but the amount of water that moves upward through the soil profile, reaches the evaporation front and evaporates is limited. Gardner (1958) assumed that when the moisture content at the evaporation front falls below the wilting point, the soil has reached air-dryness. In this situation, the maximum liquid flow when $z$ is equal to the depth of the water table ($z_w$) can be calculated as follows:
$$q_{max} = \frac{A}{z_w^{\,n}} \qquad (4)$$
where, for n = 3/2, A = 3.77a; for n = 2, A = 2.46a; for n = 3, A = 1.76a; and for n = 4, A = 1.52a. Here $a$ is a constant coefficient of the equation that differs for each soil texture, and $q_{max}$ is the maximum steady flow of liquid water from the water table to the soil surface. Equation (4) is used to calculate the maximum liquid flow from the water table in steady state when there is no evaporation front or when it is very near the ground surface. However, when the evaporation front moves down, the distance between the water table and the evaporation front is $z_w - z_e$, which should be used instead of $z_w$ in Eq. (4). Equation (4) is therefore modified as follows:
$$q_{max} = \frac{A}{(z_w - z_e)^{n}} \qquad (5)$$
where $z_w$ is the depth of the water table and $z_e$ is the depth of the evaporation front in the first stage. For the vapor flow above the evaporation front, Eq. (6), the relation of Gowing et al. (2006) is used, in which the diffusion coefficient of water vapor in the soil is calculated with the equation introduced by Rose (1963). The depth of the evaporation front increases after the end of the first time period, and its value in the second time period ($z_{et_2}$) is calculated from Eq. (8), in which $\theta_i$ is the initial moisture obtained using TDR and $\Delta\theta$ is the calculated decrease in moisture during this stage.
This process continues until the vapor flow is equal to the liquid flow.
In the third stage (steady state) the evaporation front remains at a constant depth, and the liquid and vapor flows in Eqs. (5) and (6) are equal; in this study this state was reached after 67 days. The data required to simulate evaporation from the water table with the false water table model are given in Table 2.
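As an illustration of the liquid-flow limit used in the second stage, the sketch below evaluates the Gardner (1958) relation in the form $q_{max} = A/(z_w - z_e)^n$, with A given by the multiples of a quoted above. The function name, the coefficient table and the example value of a are hypothetical; a has to be fitted for each soil texture, and the units of the result follow from the units chosen for a and for the depths.

```python
# Multipliers of the soil coefficient a, as quoted in the text for each n.
GARDNER_FACTOR = {1.5: 3.77, 2: 2.46, 3: 1.76, 4: 1.52}


def max_liquid_flux(a, n, z_w, z_e=0.0):
    """Maximum steady upward liquid flux for a water table at depth z_w.

    z_e is the depth of the evaporation front; z_e = 0 reproduces the
    case with no evaporation front (front at the soil surface).
    """
    if n not in GARDNER_FACTOR:
        raise ValueError("n must be one of 3/2, 2, 3 or 4")
    distance = z_w - z_e  # liquid-phase path length
    if distance <= 0:
        raise ValueError("the evaporation front must lie above the water table")
    A = GARDNER_FACTOR[n] * a
    return A / distance ** n


# Hypothetical soil coefficient, used only to show the effect of z_e.
flux_without_front = max_liquid_flux(a=1.0, n=2, z_w=60.0)
flux_with_front = max_liquid_flux(a=1.0, n=2, z_w=60.0, z_e=10.0)
```

With the water table depth fixed, a deeper evaporation front shortens the liquid path and raises the liquid-flow limit, which is why the front stops descending once the vapor flow above it matches that limit.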

Gamma Test
The Gamma Test estimates the minimum mean square error (MSE) that can be achieved when modelling unseen data with any continuous nonlinear model. The Gamma Test was first reported by Končar (1997) and Agalbjörn et al. (1997), and was later modified and discussed in detail by many researchers (Chuzhanova et al., 1998; Tsui, 1999; Tsui et al., 2002; Durrant, 2001; Jones et al., 2002; Moghaddamnia et al., 2008a; Piri et al., 2009).
The basic idea is quite distinct from earlier attempts at nonlinear analysis. Suppose we have a set of data observations of the form
$$\{(x_i, y_i),\ 1 \le i \le M\}$$
where the input vectors $x_i \in \mathbb{R}^m$ are confined to some closed bounded set $C \subset \mathbb{R}^m$ and, without loss of generality, the corresponding outputs $y_i \in \mathbb{R}$ are scalars. The vectors $x$ contain predictively useful factors influencing the output $y$. The only assumption made is that the underlying relationship of the system is of the form
$$y = f(x_1, \ldots, x_m) + r$$
where $f$ is a smooth function and $r$ is a random variable that represents noise. Without loss of generality it can be assumed that the mean of the distribution of $r$ is zero (since any constant bias can be subsumed into the unknown function $f$) and that the variance of the noise, Var($r$), is bounded. The domain of a possible model is now restricted to the class of smooth functions which have bounded first partial derivatives. The Gamma statistic $\Gamma$ is an estimate of the model's output variance that cannot be accounted for by a smooth data model.
The Gamma Test is based on $N[i,k]$, the indices of the $k$th ($1 \le k \le p$) nearest neighbours $x_{N[i,k]}$ of each vector $x_i$ ($1 \le i \le M$). Specifically, the Gamma Test is derived from the Delta function of the input vectors:
$$\delta_M(k) = \frac{1}{M}\sum_{i=1}^{M}\left|x_{N[i,k]} - x_i\right|^2, \quad 1 \le k \le p \qquad (3)$$
where $|\cdot|$ denotes Euclidean distance, and the corresponding Gamma function of the output values:
$$\gamma_M(k) = \frac{1}{2M}\sum_{i=1}^{M}\left|y_{N[i,k]} - y_i\right|^2, \quad 1 \le k \le p$$
where $y_{N[i,k]}$ is the corresponding $y$-value for the $k$th nearest neighbour of $x_i$ in Eq. (3). In order to compute $\Gamma$, a least squares regression line is constructed for the $p$ points $(\delta_M(k), \gamma_M(k))$:
$$\gamma = A\delta + \Gamma$$
The intercept on the vertical axis ($\delta = 0$) is the $\Gamma$ value. Calculating the regression line gradient $A$ can also provide helpful information on the complexity of the system under investigation. A formal mathematical justification of the method can be found in Evans & Jones (2002). We can standardise the result by considering another term, $V_{ratio}$, which returns an estimate between zero and one. The $V_{ratio}$ is defined as
$$V_{ratio} = \frac{\Gamma}{\sigma^2(y)}$$
where $\sigma^2(y)$ is the variance of the output $y$, which allows a judgement to be formed, independent of the output range, as to how well the output can be modelled by a smooth function. A $V_{ratio}$ close to zero indicates that there is a high degree of predictability of the given output $y$ (Monte, 1999).
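The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the WinGamma implementation: it finds neighbours by brute force, and the function name `gamma_test` and the default p = 10 are assumptions made for the example.

```python
def gamma_test(xs, ys, p=10):
    """Return (Gamma, A, V_ratio) for input vectors xs and scalar outputs ys.

    xs is a list of equal-length lists of floats, ys a list of floats.
    Neighbours are found by brute force, which is adequate for small data sets.
    """
    M = len(xs)
    if not 1 < p < M:
        raise ValueError("p must satisfy 1 < p < number of observations")
    delta = [0.0] * p  # delta_M(k): mean squared distance to the k-th neighbour
    gamma = [0.0] * p  # gamma_M(k): half the mean squared output difference
    for i in range(M):
        # Squared Euclidean distances from point i to every other point.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(xs[i], xs[j])), j)
            for j in range(M) if j != i
        )
        for k in range(p):
            dist2, j = neighbours[k]
            delta[k] += dist2 / M
            gamma[k] += (ys[j] - ys[i]) ** 2 / (2 * M)
    # Least squares line gamma = A * delta + Gamma through the p points.
    mean_d = sum(delta) / p
    mean_g = sum(gamma) / p
    sxx = sum((d - mean_d) ** 2 for d in delta)
    sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(delta, gamma))
    A = sxy / sxx
    Gamma = mean_g - A * mean_d  # intercept at delta = 0
    mean_y = sum(ys) / M
    var_y = sum((v - mean_y) ** 2 for v in ys) / M
    return Gamma, A, Gamma / var_y
```

For a noise-free smooth relationship the intercept should be close to zero, and for noisy data it should approach the noise variance as the number of observations grows.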

Nonlinear Models
Nowadays, owing to advances in computer technology, a large number of nonlinear methods are available, such as artificial neural networks, support vector machines, fuzzy logic systems, polynomial functions, local linear regression, Bayesian belief networks and decision trees. In this study, owing to constraints of time and resources, we focused on only two popular model types: Local Linear Regression (LLR) and Artificial Neural Networks (ANN).

Local Linear Regression (LLR)
The LLR technique is a widely studied nonparametric regression method that has been used in many low-dimensional forecasting and smoothing problems. The only problem with LLR is to decide the size of $p_{max}$, the number of near neighbours to be included in the local linear modelling. The method of choosing $p_{max}$ for linear regression is called influence statistics. Given a neighbourhood of $p_{max}$ points, we must solve the linear matrix equation
$$Xm = y \qquad (8)$$
where $X$ is a $p_{max} \times d$ matrix of the $p_{max}$ input points in $d$ dimensions, $x_i$ ($1 \le i \le p_{max}$) are the nearest neighbour points, $y$ is a column vector of length $p_{max}$ of the corresponding outputs, and $m$ is a column vector of parameters that must be determined to provide the optimal mapping from $X$ to $y$, such that
$$\begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{p_{max}1} & x_{p_{max}2} & \cdots & x_{p_{max}d} \end{pmatrix} \begin{pmatrix} m_1 \\ \vdots \\ m_d \end{pmatrix} = \begin{pmatrix} y_1 \\ \vdots \\ y_{p_{max}} \end{pmatrix} \qquad (9)$$
The rank $r$ of the matrix $X$ is the number of linearly independent rows, which affects the existence and uniqueness of the solution for $m$. If the matrix $X$ is square and non-singular, then the unique solution to Eq. (8) is $m = X^{-1}y$. If $X$ is not square or is singular, we modify Eq. (8) and attempt to find a vector $m$ which minimises
$$\left|Xm - y\right|^2 \qquad (10)$$
Penrose (1955) proved that the unique solution to this problem is $m = X^{\#}y$, where $X^{\#}$ is the pseudo-inverse matrix (Penrose, 1955; Penrose, 1956).
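A small sketch of one LLR prediction is given below. It follows Eq. (8) without an intercept term and solves the least squares problem through the normal equations, which give the same answer as the pseudo-inverse only when the neighbour matrix has full column rank; that restriction, and the helper names `solve_linear` and `llr_predict`, are assumptions of this example rather than details of the software used in the study.

```python
def solve_linear(A, b):
    """Solve the square system A m = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("rank-deficient neighbourhood: a pseudo-inverse is needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    m = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[r][c] * m[c] for c in range(r + 1, n))
        m[r] = (aug[r][n] - s) / aug[r][r]
    return m


def llr_predict(X, y, query, p_max):
    """Predict the output at `query` from its p_max nearest neighbours in X."""
    order = sorted(
        range(len(X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)),
    )
    nn = order[:p_max]  # indices of the nearest neighbours
    d = len(query)
    # Normal equations (X^T X) m = X^T y restricted to the neighbourhood.
    xtx = [[sum(X[i][r] * X[i][c] for i in nn) for c in range(d)] for r in range(d)]
    xty = [sum(X[i][r] * y[i] for i in nn) for r in range(d)]
    m = solve_linear(xtx, xty)
    return sum(mi * qi for mi, qi in zip(m, query))
```

A new local model is fitted for every query point, so the cost of a prediction grows with the size of the training set and with $p_{max}$.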

Artificial Neural Networks (ANN)
Artificial neural networks are built on dense internal interconnections, in a manner similar to the human nervous system and brain. They are dynamic systems that process experimental data and transfer the knowledge or rule hidden in the data to the structure of the network. They are called intelligent systems because they learn general rules from computations on numerical data or examples (Jain et al., 1999). There are different learning algorithms; a popular one is the back propagation algorithm, which employs gradient descent. Algorithms such as conjugate gradient, quasi-Newton and Levenberg-Marquardt (LM) are considered to be among the faster algorithms, and all of them make use of standard numerical optimisation techniques. In this study, we used the Broyden-Fletcher-Goldfarb-Shanno (BFGS) neural network training algorithm (Fletcher, 1987) and the conjugate gradient descent (CGD) training algorithm, along with the two layer back propagation (TLBP) architecture, as embedded in the WinGamma software.
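To make the back propagation idea concrete, the sketch below trains a very small feed-forward network by plain gradient descent on the squared error. It is a simplified illustration and not the WinGamma implementation: it uses a single hidden layer with tanh units, per-sample updates and a fixed learning rate, and the function name `train_backprop` and all default values are assumptions of this example. The BFGS and conjugate gradient algorithms replace the fixed-step update with a more elaborate search direction.

```python
import math
import random


def train_backprop(X, y, hidden=3, lr=0.05, epochs=300, seed=0):
    """Train a one-hidden-layer network; return (predict, mse_history)."""
    rng = random.Random(seed)
    d = len(X[0])
    # Each hidden row holds d input weights followed by a bias.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d + 1)] for _ in range(hidden)]
    # Output weights for the hidden units followed by an output bias.
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1]) for row in W1]
        out = sum(w * v for w, v in zip(W2[:-1], h)) + W2[-1]
        return h, out

    def predict(x):
        return forward(x)[1]

    def mse():
        return sum((predict(x) - t) ** 2 for x, t in zip(X, y)) / len(X)

    history = [mse()]
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = forward(x)
            err = out - t
            # Error signal propagated back to the hidden layer (tanh derivative),
            # computed before the output weights are changed.
            grad_h = [err * W2[j] * (1.0 - h[j] ** 2) for j in range(hidden)]
            for j in range(hidden):
                W2[j] -= lr * err * h[j]
            W2[-1] -= lr * err
            for j in range(hidden):
                for i in range(d):
                    W1[j][i] -= lr * grad_h[j] * x[i]
                W1[j][-1] -= lr * grad_h[j]
        history.append(mse())
    return predict, history
```

The returned history records the training error after each epoch, which gives a simple way to compare learning rates or hidden-layer sizes.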

Model Performance Evaluation
The performance of the LLR technique and the neural network based models was compared using three global statistics: root mean squared error (RMSE), mean absolute error (MAE) and determination coefficient ($R^2$). The RMSE represents the deviation between simulated values and observed values. Lower MAE values indicate more accurate estimates. $R^2$ measures the degree to which two variables are linearly related (Willmott, 1982).
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(P_i - O_i)^2}, \qquad MAE = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - O_i\right|, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n}(O_i - P_i)^2}{\sum_{i=1}^{n}(O_i - \bar{O})^2}$$
where $O_i$ and $P_i$ are the observed and predicted values at time $i$, respectively; $\bar{O}$ is the mean of the observed evaporation; and $n$ is the number of data points.
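The three statistics can be computed directly from paired lists of observed and predicted values. The sketch below is illustrative: the function names are hypothetical, and $R^2$ is taken here as one minus the ratio of the residual sum of squares to the total sum of squares about the observed mean, which is one common convention and an assumption about the form used in the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Determination coefficient, 1 - SS_res / SS_tot (assumed convention)."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up values, used only to exercise the functions.
obs = [1.0, 2.0, 3.0]
pred = [1.0, 2.0, 4.0]
scores = (rmse(obs, pred), mae(obs, pred), r_squared(obs, pred))
```

RMSE and MAE carry the units of the data (mm/day for evaporation), while $R^2$ is dimensionless, so the three are best read together.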

Evaporation from the Soil Surface and Free Water Surface
Daily measured evaporation from the sandy loam, loam and clay loam soils at water table depths of 40, 60 and 80 cm is shown in Figure 2. Evaporation from the soil surface decreased with time and finally reached a constant value. The relationship between evaporation, soil texture and water table depth is such that the lighter the soil texture and the shallower the water table, the greater the initial evaporation; this amount then decreases over time until it falls below that of the heavier textures. Maximum and minimum temperatures were also measured daily in this study, with a maximum of 48 and a minimum of 9 degrees Celsius.

Data Analysis Using Gamma Test
The prediction of evaporation from the soil surface was studied using four factors: evaporation from the free water surface, maximum and minimum daily soil temperature, and soil surface moisture. The data were divided randomly, with 45 data points used for training the models and 30 for testing their predictions. The Gamma Test results for the sandy loam soil with the water table at 60 cm are presented in Table 3. The slope (A) is an indicator of the complexity of the model, and the V ratio indicates how well the output can be predicted from the inputs. The model with four inputs (evaporation from the free water surface, maximum temperature, minimum temperature and surface moisture) was the best structure. For this model, the small values of the Gamma statistic, standard error, slope and V ratio demonstrate the capability of the model.
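Identifying the best input structure implies comparing combinations of the four candidate inputs. The sketch below enumerates every non-empty combination and picks the one with the smallest score. The input names, the function names and the idea of passing the scoring rule as a callable are assumptions of this example; in practice the score could be the Gamma statistic computed on the selected columns.

```python
from itertools import combinations

# Hypothetical column names for the four measured inputs.
INPUTS = (
    "free_surface_evaporation",
    "max_temperature",
    "min_temperature",
    "surface_moisture",
)


def input_combinations(names):
    """Yield every non-empty subset of the candidate inputs as a tuple."""
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            yield combo


def best_combination(score, names=INPUTS):
    """Return the subset with the smallest value of score(subset).

    `score` is any callable that maps a tuple of input names to a number.
    """
    return min(input_combinations(names), key=score)


all_subsets = list(input_combinations(INPUTS))
```

With four candidates there are only 15 subsets, so an exhaustive comparison is cheap; for larger input sets a heuristic search would be needed.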

Results of Local Linear Regression and Neural Network Models
In this study two types of model were used to predict evaporation from a shallow water table: local linear regression and neural networks. The neural networks were trained using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, the conjugate gradient descent algorithm and two layer back propagation. The optimal number of nearest neighbours for LLR, found by trial and error on the basis of the error rate, was 16. The performance of the LLR technique and the neural network models was compared using statistical indices (root mean square error, mean absolute error and coefficient of determination). The results of predicting evaporation for the different water table depths and soil textures are shown in Tables 4-12. As can be seen from the tables, the low root mean square error and mean absolute error and the high coefficient of determination indicate the high accuracy of the models. The local linear regression model and the neural network trained with the BFGS algorithm also performed relatively better than the other methods.
Different numbers of hidden layer neurons were tested for the neural network models. A feed-forward 1-9-9-4 network structure was used with the BFGS, conjugate gradient descent and two layer back propagation training algorithms, and their performance was compared with that of the LLR model (Tables 4-12).

Conclusion
This study describes a new method for estimating evaporation from a shallow water table using the Gamma Test in combination with nonlinear modelling methods, and it demonstrates the ability of the Gamma Test to support the construction of nonlinear models for estimating evaporation. First, the amount of evaporation from a shallow water table was measured. Then the performance of the false water table model, the regression model and the neural network models was evaluated at a daily interval for three soil textures (loam, sandy loam and clay loam) and three water table depths (40, 60 and 80 cm), using four inputs: evaporation from the free water surface, maximum temperature, minimum temperature and surface moisture. One reason for using the false water table model was the set of assumptions on which it is based: unlike the Gardner model, the false water table model also calculates evaporation in the vapor phase, which increases its accuracy. To predict the evaporation rate with the regression and neural network models, forty data points were randomly selected for training and thirty for testing. Evaporation from the free water surface had the greatest impact. In the validation phase, based on the determination coefficient, the highest value for the local linear regression model was obtained for the sandy loam soil. A comparison with the results of Gowing et al. (2006) and Rose et al. (2005) indicates that local linear regression and artificial neural network methods perform the simulation with high accuracy. In general, the results showed a good fit between measured and predicted values. Based on the results of the present study, the proper use of regression methods and neural network models can help to analyze evaporation from soil in soil and water systems rapidly.

Figure 2: Evaporation from the soil surface vs. time for the sandy loam (a), loam (b) and clay loam (c) soils for water table depths of 40, 60 and 80 cm.
Figure 3 shows the amount of evaporation from the free water surface.

Figure 3: Mean intensity of evaporation from free water surface over time

Table 1. Physical properties of soils

the thickness of the layer which has lost moisture during the first phase. If $\theta_{t_2}$ is larger than $\theta_e$ (the moisture content at air-dryness), the calculation is repeated for successive time intervals. When the soil surface moisture has decreased to $\theta_e$, the first stage is complete. At the end of this stage, the depth of the evaporation front is:

Table 2. Data required for the false water table evaporation model

Table 3. Results of the Gamma Test

Table 4. Results of prediction statistical analysis for the sandy loam soil and the water table of 40 cm

Table 6. Results of prediction statistical analysis for the clay loam soil and the water table of 40 cm

Table 7. Results of prediction statistical analysis for the sandy loam soil and the water table of 60 cm

Table 8. Results of prediction statistical analysis for the loam soil and the water table of 60 cm

Table 9. Results of prediction statistical analysis for the clay loam soil and the water table of 60 cm

Table 10. Results of prediction statistical analysis for the sandy loam soil and the water table of 80 cm

Table 11. Results of prediction statistical analysis for the loam soil and the water table of 80 cm

Table 12. Results of prediction statistical analysis for the clay loam soil and the water table of 80 cm

table of 40 cm, with a value of 0.97, followed by the BFGS model for the sandy loam soil with the water table at 80 cm; the lowest value, 0.73, was obtained by the two layer back propagation neural network for the sandy loam soil with the water table at 80 cm. The RMSE was lowest (0.129) for the BFGS model in the clay loam soil with the water table at 40 cm and highest (2.484) for the false water table model in the clay loam soil with the water table at 40 cm. The MAE was lowest (0.376) for the regression model in the sandy loam soil with the water table at 80 cm and highest for the false water table model in the loam soil with the water table at 60 cm. These low error values indicate the accuracy of the methods used in this research. By comparing the results of this study with the results of