On the Comparison of Capacitance-Based Tomography Data Normalization Methods for Multilayer Perceptron Recognition of Gas-Oil Flow Patterns

Normalization is important for Electrical Capacitance Tomography (ECT) data because the capacitance values obtained from either physical or simulated ECT systems are very small. Two normalization methods are commonly used for ECT, but their suitability has not been investigated. This paper compares the performance of two Multilayer Perceptron (MLP) neural networks, one trained on ECT data normalized using the conventional equation and the other on data normalized using the improved equation, in recognizing gas-oil flow patterns. The correct pattern recognition percentages for both MLPs were calculated and compared. The results showed that the MLP trained with data from the conventional ECT normalization equation outperformed the one trained with data from the improved normalization equation for the task of gas-oil flow pattern recognition.


Introduction
Recognition of the flow regimes of gas-liquid flow is important in industrial processes such as those of the gas-oil industry. However, this information cannot be determined easily since gas-oil flows are normally concealed within a pipe. One way to obtain such information is by employing the Electrical Capacitance Tomography (ECT) technique. ECT is a technique used to visualize the distribution of two dielectric components (Yang and Byars, 1999). It has been employed for industrial processes containing different dielectric materials, such as gas/oil flows in oil pipelines. In an ECT sensor, several electrodes are mounted around the pipe vessel. It is said to be non-invasive and non-intrusive since the sensing electrodes are not physically in contact with the medium inside the pipe vessel. With N electrodes, the total number M of independent capacitance measurements is given by (Xie et al, 1992)

$$M = \frac{N(N-1)}{2} \qquad (1)$$

The measured capacitances are usually normalized before being used for any application. There are two versions of the equation for normalizing the capacitance values, referred to in this paper as the conventional normalization and improved normalization methods. The conventional normalization approach assumes that the two materials are distributed in parallel and hence the normalized capacitance is a linear function of the measured capacitance (Yang and Byars, 1999). The normalization equation is given by (Xie et al, 1992)

$$\lambda_{ij} = \frac{C_{ij(p)} - C_{ij(e)}}{C_{ij(f)} - C_{ij(e)}} \qquad (2)$$

where $\lambda_{ij}$ is the conventional normalized capacitance between a pair of electrodes i and j, $C_{ij(p)}$ is the measured capacitance, $C_{ij(e)}$ is the capacitance when the pipe is full of gas and $C_{ij(f)}$ is the capacitance when the pipe is full of the higher-permittivity material, such as oil. An improved normalization approach is derived from a series sensor model, in which the sensor capacitance is modeled as two capacitances in series, and is given by the following equation (Yang and Byars, 1999)

$$\lambda^{*}_{ij} = \frac{\dfrac{1}{C_{ij(p)}} - \dfrac{1}{C_{ij(e)}}}{\dfrac{1}{C_{ij(f)}} - \dfrac{1}{C_{ij(e)}}} \qquad (3)$$

where $\lambda^{*}_{ij}$ is the improved normalized capacitance between a pair of electrodes i and j, and $C_{ij(p)}$, $C_{ij(e)}$ and $C_{ij(f)}$ are the measured, gas-filled and higher-permittivity-filled capacitances as defined for equation (2).
For both equations, the normalized capacitance values for the empty and full pipe are 0 and 1, respectively. This can be verified by substituting $C_{ij(p)} = C_{ij(e)}$ for the empty pipe and $C_{ij(p)} = C_{ij(f)}$ for the full pipe. Therefore, ideally, the maximum and minimum values of the normalized capacitances are 1 and 0, respectively. If a normalized capacitance value is higher than 1, the value is said to overshoot, and if it is less than 0, the value is said to undershoot. Until now, there has been no proper investigation of which normalization method is better for flow regime recognition using neural networks. Thus, this paper presents such an investigation using the Multilayer Perceptron (MLP), the most commonly used neural network.
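As a rough illustration of the two normalization equations, the sketch below implements the parallel (conventional) and series (improved) forms for a single electrode pair and checks the empty-pipe and full-pipe limits. The function names and the calibration capacitances are hypothetical and are not taken from the simulator used in this work.

```python
def normalize_conventional(c_p, c_e, c_f):
    """Parallel-model normalization: linear in the measured capacitance."""
    return (c_p - c_e) / (c_f - c_e)


def normalize_improved(c_p, c_e, c_f):
    """Series-model normalization: linear in the reciprocal capacitance."""
    return (1.0 / c_p - 1.0 / c_e) / (1.0 / c_f - 1.0 / c_e)


# Hypothetical calibration values for one electrode pair, in farads.
C_EMPTY = 1.0e-13   # pipe full of gas
C_FULL = 2.5e-13    # pipe full of the higher-permittivity material

for name, fn in [("conventional", normalize_conventional),
                 ("improved", normalize_improved)]:
    at_empty = fn(C_EMPTY, C_EMPTY, C_FULL)   # expected to be 0
    at_full = fn(C_FULL, C_EMPTY, C_FULL)     # expected to be 1
    at_mid = fn(1.75e-13, C_EMPTY, C_FULL)    # an intermediate reading
    print(name, at_empty, at_full, at_mid)
```

Both functions agree at the two calibration points but differ in between, which is why the same measured capacitance can produce different overshoot and undershoot under the two methods.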

Multilayer Perceptron (MLP)
An Artificial Neural Network (ANN), or simply a 'neural network', is an intelligent system composed of simple processing elements which operate in parallel (Haykin, 1999). An MLP is a type of ANN model. MLPs have been used in a variety of applications due to their ability to solve complex problems, including pattern classification (Yan et al, 2004), function approximation (Lee et al, 2004), process control (Ren et al, 2000) and filtering tasks (Parlos, 2001).
Figure 1 shows the basic MLP supervised learning structure, where pairs of input and target output are used to train the network. The term 'supervised' refers to the involvement of target outputs that act as a 'teacher' during the neural network learning process. Typically, an MLP consists of neurons (also called nodes or processing elements) arranged within an input layer, one or more hidden layers and an output layer. The input signal propagates through the network in a forward direction, on a layer-by-layer basis. Figure 2 shows the architecture of an MLP (Haykin, 1999).

Methodology
The first stage of the work involved collection of the raw ECT dataset using an ECT simulator based on the finite element method (Spink, 1996). Simulated data were used for two reasons. Firstly, actual flow regime data are very difficult to obtain since some flow patterns are non-repeatable. Secondly, actual plant data are constantly disturbed by noise and external interference, and hence the data collected are not accurate. The datasets were obtained from various geometrical flow patterns to give a variety of patterns for each flow regime. Since the number of electrodes used for the ECT sensor was 12, each ECT dataset consists of 66 capacitance values corresponding to the difference in capacitances between all possible pairs of electrodes (refer to equation 1). Table 1 shows the reading numbers and their corresponding pairs of electrodes.
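The count of 66 readings for a 12-electrode sensor can be checked by enumerating the unordered electrode pairs, as in the short sketch below. The pair ordering produced here is only an assumption for illustration and may not match the ordering used in Table 1.

```python
from itertools import combinations


def electrode_pairs(n_electrodes):
    """List every unordered pair (i, j) with i < j, electrodes numbered from 1."""
    return list(combinations(range(1, n_electrodes + 1), 2))


pairs = electrode_pairs(12)
print(len(pairs))    # number of independent capacitance measurements
print(pairs[:3])     # first few pairs in this (assumed) ordering
```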
For each flow regime, all capacitance values were normalized before being randomly divided into three sets in the ratio of 8:1:1 for training, validation and testing, respectively. For this investigation, the training, validation and test sets contained 1140, 142 and 142 datasets, respectively. The training set was used for computing the gradient and updating the network weights and biases during ANN learning. The validation set was used to stop the training process. The test set was used to verify the network's generalization performance.
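A minimal sketch of an 8:1:1 random split is given below. It splits one pooled set of indices rather than splitting within each flow regime, and the function name and seed are hypothetical; with the 1424 datasets implied by the set sizes above, it reproduces the 1140/142/142 division.

```python
import random


def split_indices(n_samples, seed=0):
    """Shuffle indices and split them roughly 8:1:1 into train/validation/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = n_samples // 10
    n_test = n_samples // 10
    n_train = n_samples - n_val - n_test   # remainder goes to training
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(1424)
print(len(train), len(val), len(test))
```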
The second stage was data analysis. As already mentioned, the ideal normalized capacitance values lie between 0 and 1. This stage was performed to determine how much the overshoot and undershoot of the normalized capacitance values differ between the conventional and improved normalization methods. Normalized capacitance values for one pattern from each flow regime, for both normalization methods, were plotted on the same graph and their differences were calculated and discussed.
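One possible way to quantify overshoot and undershoot for a single pattern is sketched below. It assumes that the total overshoot is the summed excess above 1 and the total undershoot is the summed deficit below 0; this definition, the function name and the sample values are assumptions made for illustration, not the calculation reported in this work.

```python
def total_shoot(normalized):
    """Return (total overshoot, total undershoot) for a list of normalized values.

    Assumed definition: overshoot sums the excess above 1,
    undershoot sums the deficit below 0.
    """
    overshoot = sum(v - 1.0 for v in normalized if v > 1.0)
    undershoot = sum(-v for v in normalized if v < 0.0)
    return overshoot, undershoot


# Hypothetical normalized readings for a handful of electrode pairs.
sample = [-0.02, 0.0, 0.35, 1.0, 1.05, 1.10]
over, under = total_shoot(sample)
print(over, under)
```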
The third stage was the ANN learning or training process. In this stage, MLPs were trained with three different back-propagation training algorithms: Resilient Back-propagation (RP), Quasi-Newton (QN) and Levenberg-Marquardt (LM). These are the most commonly used training algorithms for classification using MLP neural networks. Each MLP had 66 inputs (corresponding to the 66 normalized capacitance values) and 6 outputs (corresponding to the 6 flow regimes to classify). The output class representations are listed in Table 2. The number of hidden neurons was determined using the network growing approach, adding one neuron at a time to the hidden layer.
Suitable activation functions must be applied to the MLP hidden and output neurons. The logistic sigmoid is the most commonly used activation function for the back-propagation algorithm because it is differentiable (Demuth and Beale, 1998). For this reason, it was applied to the hidden neurons during training. Since the target output of each output neuron is either 0 or 1, the logistic sigmoid was also applied to the output neurons.
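The forward pass of such a network can be sketched in a few lines. The example below assumes a single hidden layer with logistic sigmoid activations in both the hidden and output layers, 66 inputs and 6 outputs; the hidden layer size, the random weights and all function names are hypothetical, and no training is performed.

```python
import math
import random


def logistic(x):
    """Logistic sigmoid, mapping any real input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """One fully connected layer followed by the logistic sigmoid."""
    return [logistic(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Propagate the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


def random_matrix(rows, cols, rng):
    """Small random weights, used here only as untrained placeholders."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
N_IN, N_HIDDEN, N_OUT = 66, 6, 6   # N_HIDDEN is an arbitrary choice
hidden_w = random_matrix(N_HIDDEN, N_IN, rng)
hidden_b = [0.0] * N_HIDDEN
out_w = random_matrix(N_OUT, N_HIDDEN, rng)
out_b = [0.0] * N_OUT

x = [rng.random() for _ in range(N_IN)]   # stand-in for 66 normalized capacitances
outputs = mlp_forward(x, hidden_w, hidden_b, out_w, out_b)
predicted_class = outputs.index(max(outputs))   # index of the strongest output
print(outputs, predicted_class)
```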
To reduce the risk of the MLP becoming stuck at a local minimum, 30 training runs were made for each number of hidden neurons. Each training process was stopped when the validation error showed no improvement for 5 consecutive training iterations. At completion, the MLP weights and biases at the minimum of the validation error were saved.
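The stopping rule can be illustrated on a recorded validation error curve. The sketch below assumes a patience of 5 iterations without improvement and returns both the iteration with the lowest validation error (whose weights would be kept) and the iteration at which training stops; the function name and the error values are hypothetical.

```python
def early_stopping_epoch(val_errors, patience=5):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    Training stops once `patience` consecutive iterations have passed
    without the validation error dropping below its best value so far.
    """
    best_error = float("inf")
    best_epoch = -1
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_errors) - 1


# Hypothetical validation error curve.
curve = [0.50, 0.40, 0.30, 0.25, 0.26, 0.27, 0.25, 0.28, 0.29, 0.30, 0.31]
best, stop = early_stopping_epoch(curve)
print(best, stop)
```

In an actual training loop the validation error would be evaluated after each iteration and the weights restored from the best iteration, rather than read from a precomputed list.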
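The last stage was MLP performance assessment. This stage was carried out to assess the performance of the MLPs based on the correct classification percentage (CCP) of the test data, in order to determine the better normalization method. The CCP is calculated using

$$\mathrm{CCP} = \frac{\text{number of correctly classified test patterns}}{\text{total number of test patterns}} \times 100\% \qquad (4)$$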
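Reading the CCP as the percentage of test patterns that are classified correctly, it can be computed as in the sketch below. The function name and the class labels are hypothetical; the example uses 142 test patterns, the test set size used in this work, with two of them deliberately misclassified.

```python
def ccp(predicted, actual):
    """Correct classification percentage over a set of test patterns."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical labels: 142 test patterns spread over 6 classes.
actual = [i % 6 for i in range(142)]
predicted = list(actual)
predicted[0] = (actual[0] + 1) % 6     # first misclassification
predicted[1] = (actual[1] + 1) % 6     # second misclassification

print(round(ccp(predicted, actual), 1))   # 140 of 142 correct -> 98.6
```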

Results and Discussion
The results of the data analysis of the normalized ECT data and the performances of the MLPs trained with the conventional and improved normalized data are discussed in the following subsections.
Figures 4(a) to 4(d) show plots of capacitance values for a variety of flow pattern examples normalized using the improved and conventional normalization equations. The reading number (i.e. the x-axis) corresponds to the electrode pairs listed in Table 1. Table 3 shows the total overshoot and undershoot of the capacitance values based on the improved and conventional normalization equations. The table shows that the total undershoot for stratified flow is higher when normalized using the improved equation than when normalized using the conventional equation. On the other hand, its overshoot is higher when normalized using the conventional equation than when using the improved equation. For the bubble pattern, the overshoot value for both improved and conventional normalization methods is 0. However, the conventional normalization gives a slightly higher overshoot capacitance value of about 0.4%. The core example pattern gives 0 overshoot for both normalization methods, whilst the undershoot is higher when the improved normalization equation is used. For the annular flow pattern, both the improved and conventional normalization equations give 0 undershoot, whilst the conventional equation produces an undershoot value about 0.2% higher than the improved equation. The table shows that the improved normalization equation results in a 38% lower total overshoot value than the conventional normalization. However, the improved normalization equation results in a 74% higher total undershoot value than the conventional normalization equation. Overall, the data analysis results show that the conventional normalization method leads to less total overshoot and undershoot than the improved normalization method.

Flow Regime Recognition
Figures 5, 6 and 7 show the CCP plots of MLPs trained with the RP, QN and LM algorithms, respectively, based on the improved and conventional normalization methods. From Figure 5, it is clear that the MLP trained with the conventional normalized data produces a higher CCP than the MLP trained with the improved normalized data. It can also be seen that the MLP trained with the improved normalized data gives a rather unstable CCP across different numbers of hidden neurons. Comparing the plots in Figure 6, it can be seen that the MLP trained with the conventional normalized data outperformed the MLP trained with the improved normalized data at 3 or more hidden neurons. The performances of the MLPs trained using the LM algorithm (see Figure 7) are harder to separate for the improved and conventional normalized data. However, it can be seen that the MLP trained with the improved normalized data is rather unstable in its CCP values for different numbers of hidden neurons. Although both MLPs produce the same maximum CCP, the MLP trained with the conventional normalized data reaches the maximum CCP earlier (at 6 hidden neurons) than the MLP trained with the improved normalized data.
Table 4 compares the maximum performance of the MLPs trained with the improved and the conventional normalized data using the training algorithms investigated. In the table, HN is the abbreviation for the number of hidden neurons. For the RP algorithm, the conventional normalization is better than the improved normalization in terms of the maximum CCP value. It also needs fewer hidden neurons than the MLP trained with the improved normalized data. For the QN algorithm, the conventional normalization reaches its highest CCP of 98.6% with 11 hidden neurons, whilst the improved normalization reaches its highest CCP of 96.5% with fewer hidden neurons. However, since the main concern for MLP performance is the CCP value, it can be concluded that the conventional normalization is better than the improved normalization. For the LM algorithm, even though both normalization methods reach the same highest CCP of 99.3%, the conventional normalization achieves it with fewer hidden neurons than the improved normalization. Consequently, the conventional normalization method is better in terms of the number of hidden neurons, which results in a smaller MLP structure and thus faster network execution.
The overall results demonstrate that it is better to use the conventional normalization than the improved normalization for ECT data. However, this does not mean that the improved normalization method is of no value for other applications. It has to be borne in mind that this work focuses on flow regime recognition. The improved ECT data normalization method might be of better use for other applications, such as image reconstruction.

Conclusion
An investigation was carried out to determine the better normalization equation for ECT data for the task of flow regime recognition. For all three training algorithms investigated, the results showed that the conventional equation was the better normalization equation for ECT data when training MLP neural networks.

Figure 3
Schematic diagrams of the flow regimes investigated in this work: empty flow, full flow, stratified, bubble, annular and core flows.

Figures 4
Figures 4(a) to 4(d). Capacitance values for a variety of flow pattern examples, normalized using the improved and conventional normalization equations. The reading number (x-axis) corresponds to the electrode pairs listed in Table 1.

References
Yan H., Liu Y. H. and Liu C. T. (2004). Identification of Flow Regimes Using Back-Propagation Networks Trained on Simulated Data Based on Capacitance Tomography Sensor. Measurement Science and Technology, 15, 432-436.
Yang W. Q. and Byars M. (1999). An Improved Normalisation Approach for Electrical Capacitance Tomography. 1st Congress on Industrial Process Tomography, Buxton, UK, 215-218.

Table 2. Output class representation for each flow


Table 3. Undershoot and overshoot values based on the improved and conventional normalization equations for examples of flow patterns