Satisfying Statistical Constraints in Preparing Edited Variable Amplitude Loading History Using Genetic Algorithm

A major concern in segment-based fatigue data editing is ensuring that two global statistics of the edited load history, the root mean square (rms) and the kurtosis, stay within an acceptance interval while the data reduction rate is maximized and the loss in damage is minimized. The rms quantifies the overall energy of the history, whilst the kurtosis identifies its impulsive character. In this paper, the stochastic Genetic Algorithm (GA) is employed as a post-processing tool that helps the edited history satisfy the statistical requirements at minimum cost, i.e. with a small decrement in the initial reduction rate. Consider an initial edited history composed of the high fatigue damage segments produced by a non-overlapping segmentation method. If this history does not comply with the statistical requirements, importing a subset of low damage segments into it might reverse the outcome. Thus, the GA searches for the smallest subset that makes the history fulfil the rms and kurtosis requirements without affecting the reduction rate too much. Experimental results show that the proposed method can make the edited history fit the statistical constraints without harming the overall fatigue damage value.


Introduction
The time history of load (load history) has great influence in durability testing, which is associated with fatigue failure analysis. Fatigue failure may be described as a product of oscillatory actions under varying loads at one or several stress concentration points. Prior to testing, it is recommended to edit the load history to create a cost-effective environment, e.g. a short operation time and lower cost (Abdullah, Choi, Giacomin, & Yates, 2006; Petracconi, Ferreira, & Palma, 2010). Editing refers to simplifying a load history, and it can be done by removing the small amplitude cycles that make up a large percentage of the cycles in the history. To date, those cycles are found either individually, as in (Stephens, Dindinger, & Gunger, 1997), or through segment-to-segment analysis.
The segment-based fatigue data editing technique can be seen as a series of analyses involving segmentation, labelling and selection. A proper segmentation method partitions the loading history into a number of meaningful segments. Once the segments have been labelled, low damage segments are removed (or high damage segments are kept) to shorten the history. In-depth segment-to-segment analysis provides details that assist editing techniques aimed at preserving cumulative fatigue damage and retaining the load sequence in the edited history. However, most of the techniques mentioned have not dealt in detail with satisfying the statistical constraints, which are further criteria that the edited history must meet. For example, the authors in (Putra, Abdullah, Nuawi, & Nopiah, 2010) simply apply a trial-and-error approach in their algorithm to ensure that the rms and kurtosis values of the edited history fall within the acceptance interval, i.e. within 10% deviation. This approach could be seen as a step backward compared to its predecessor, where an incremental step correction method is applied (Abdullah et al., 2006). Meanwhile, the techniques in (Abdullah, Nidzwan, & Nuawi, 2009; Nopiah, Baharin, Abdullah, Khairir, & Ariffin, 2010) do not explain how the edited history fits the constraints, i.e. the conclusion depends merely on comparative results. In summary, the authors overlook the fact that every data point in a data segment contributes to the statistical calculation. Thus, it might be more meaningful and less complex if the statistical constraints were confronted separately, i.e. not at the same level as the other objectives. A simple way to do so is to identify and remove only the undamaging segments that contribute the least change to the history's statistics. To minimise risk when performing segment selection, the selection is reformulated as a combinatorial optimization problem.
For a combinatorial problem, particularly combination without repetition, an optimization tool searches for the best combination of r objects out of a finite set of possible outcomes that minimizes or maximizes certain objective functions defined on some domain. However, solving a combinatorial problem, i.e. finding the optimal solution(s), is NP-hard, and it becomes worse when the value of the parameter r is unknown, which significantly expands the search space (Ahmed, 2010). Therefore, a heuristic algorithm such as the GA is often used as an approximation algorithm to produce good solutions (so-called approximate solutions) (Shiqiong et al., 2008). The GA implements a search strategy based on the principle of natural evolution to seek an optimal solution. Besides requiring neither an examination of the data structure nor any auxiliary knowledge, the GA can explore solutions in many directions of a large search space simultaneously, which has made it a popular choice for solving combinatorial optimization problems (Fung, Kwong, Siu, & Yu, 2012; Konak, Coit, & Smith, 2006).
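To make the growth of the search space concrete, the following minimal sketch counts combinations without repetition for a fixed r and for an unknown r. The function names and the value n = 10 are illustrative assumptions, not taken from the study.

```python
from math import comb

def count_combinations(n, r):
    """Number of ways to choose r objects out of n without repetition."""
    return comb(n, r)

def count_all_subsets(n):
    """Search-space size when r is unknown: sum over every r from 0 to n."""
    return sum(comb(n, r) for r in range(n + 1))

n = 10  # hypothetical number of candidate objects
print(count_combinations(n, 3))  # fixed r = 3 -> 120
print(count_all_subsets(n))      # unknown r  -> 1024, i.e. 2**n
```

When r is free, the number of candidates doubles with every additional object, which is why exhaustive enumeration quickly becomes impractical.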
In this study, a time series non-overlapping window segmentation method with a fixed window length is applied to the loading history; this is the easiest way to obtain meaningful and inarguable non-overlapping fatigue segments. Each segment is then classified according to its damage level. Once the segments are located and labelled, a preliminary edited history is built by joining together all high damage segments. Here, we assume that this strategy cannot guarantee that the statistical constraints are satisfied. One possible way to counter the problem is to return a subset of low fatigue damage segments to the history sequence. Undoubtedly, this action will reduce the current data reduction rate because more fatigue segments participate in the edited version; however, those extra segments contribute to the total damage, even if the amount is small (Wu, Liou, & Tse, 1997). To avoid unnecessary loss in reduction rate, the GA is applied to the collection of low damage fatigue segments. The GA searches for the best combination of low damage segments to join the set of high damage segments in order to build an applicable edited history at minimum cost. The cost of a combination, i.e. the fitness function, is evaluated based on the size of the segment subset weighted by a unique penalty function. A subset, i.e. chromosome, with a smaller fitness value is worthier of being sent back into the history. Comparison results on strain-stress cycle properties exhibit the effectiveness of the proposed GA-based segment selection.

Overview on Genetic Algorithm
The genetic algorithm is a stochastic search algorithm inspired by Darwin's theory of evolution. The search uses an evolutionary strategy, i.e. survival of the fittest among a population of strings, to solve a problem such as an optimization problem. The GA starts with zero knowledge about the true solution and depends entirely on genetic operators such as selection, crossover and mutation to evolve the initial population towards an optimal solution. In GA terminology, a population consists of a large number of possible solutions called chromosomes or individuals. Each chromosome is evaluated by the fitness function which, in most optimization cases, is defined directly by the objectives of the search. Figure 1 illustrates the core operation of the GA (basic mode), called the reproduction cycle. At the top of the cycle, s chromosomes are randomly selected and compared. The winner, i.e. the fittest chromosome, is passed into a mating pool while the remaining chromosomes are moved back to the current population. The population is then reshuffled and the selection process continues until the pool is full. The size of the pool is usually half of the population size, m. In the second phase, a common single-point crossover joins two random pool members. Exchanging tails between the two chromosomes breeds two offspring that are possibly fitter than their parents. On some occasions, an offspring is produced via mutation rather than crossover: a new chromosome is obtained by mutating one of the parent's genes. The second phase ends after m/2 repetitions. Finally, both populations are combined and sorted in descending order. Only the top m chromosomes then replace the previous population. The whole process is repeated t times, although it may stop earlier once other predefined stopping criteria are met.
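The reproduction cycle described above can be sketched as follows. This is a simplified illustration rather than the configuration used in this study: the symbols s, m and t follow the text, while the mutation probability `p_mut`, the seed, the toy cost function and all function names are assumptions. The sketch also simplifies the tournament by leaving winners in the population, and it treats fitness as a cost to be minimized.

```python
import random

def tournament(population, cost, s, rng):
    """Pick s random chromosomes and return the fittest (lowest cost)."""
    return min(rng.sample(population, s), key=cost)

def single_point_crossover(a, b, rng):
    """Swap the tails of two parents at a random cut point."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(parent, rng):
    """Flip one randomly chosen gene of a binary chromosome."""
    i = rng.randrange(len(parent))
    return parent[:i] + [1 - parent[i]] + parent[i + 1:]

def reproduction_cycle(population, cost, s=3, p_mut=0.1, rng=random):
    """One cycle: selection, crossover or mutation, then elitist merge."""
    m = len(population)
    # Phase 1: fill a mating pool of size m/2 with tournament winners.
    pool = [tournament(population, cost, s, rng) for _ in range(m // 2)]
    # Phase 2: m/2 repetitions, each breeding by crossover or by mutation.
    offspring = []
    for _ in range(m // 2):
        a, b = rng.sample(pool, 2)
        if rng.random() < p_mut:
            offspring.append(mutate(a, rng))
        else:
            offspring.extend(single_point_crossover(a, b, rng))
    # Phase 3: merge both populations and keep the best m chromosomes.
    merged = sorted(population + offspring, key=cost)
    return merged[:m]

def run_ga(cost, n_genes, m=20, t=50, seed=0):
    """Repeat the reproduction cycle t times from a random population."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(m)]
    for _ in range(t):
        population = reproduction_cycle(population, cost, rng=rng)
    return population[0]

best = run_ga(cost=sum, n_genes=12)  # toy cost: number of 1-bits
print(best, sum(best))
```

Because the merge step keeps the best m chromosomes of parents and offspring combined, the best cost in the population can never get worse from one cycle to the next.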

Fatigue Segment Combination Problem
In this study, the loading history H is a time series of N points measured in units of microstrain. The history is transformed into a row of non-overlapping (disjoint) data segments, each consisting of 500 points. This figure corresponds to the sampling rate used in the data acquisition. This is a conservative but reliable way to split the time series into M meaningful (informative) segments ("How fast should a signal be sampled?"). The history is thus defined as a sequence of M segments, where the first segment contains the first 500 points. The segments are then classified as either low or high damage. The fatigue segment combination (FSCo) problem can be described as follows:

$$\min_{S \subseteq L} |S| \qquad (1)$$

subject to

$$(1-\varepsilon)\,\mathrm{RMS}(H) \le \mathrm{RMS}(Edt) \le (1+\varepsilon)\,\mathrm{RMS}(H) \qquad (2)$$

$$(1-\varepsilon)\,\mathrm{kurtosis}(H) \le \mathrm{kurtosis}(Edt) \le (1+\varepsilon)\,\mathrm{kurtosis}(H) \qquad (3)$$

where S is a subset of the collection of low damage segments L, i.e. $S \subseteq L$, and the edited history Edt consists of all high fatigue damage segments together with the segments in S. $\mathrm{RMS}(X)$ returns the rms value of a time series X. Similarly, $\mathrm{kurtosis}(X)$ returns the kurtosis value of a time series X. The objective is to find the optimal set $S^*$ that satisfies both constraints (refer to Eqs. 2-3), each of which is an interval corresponding to the lower and upper limits of the relative statistical parameter at a degree of tolerance $\varepsilon$ equal to 0.1 (Nopiah, Abdullah, et al., 2010), at minimum cost, i.e. with a small size of S.
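As an illustration of the two statistical constraints, the sketch below checks whether an edited history stays within a 10% tolerance of the original rms and kurtosis. It assumes that kurtosis means the fourth standardized moment (not excess kurtosis) and that each acceptance interval is the original value plus or minus the tolerance; the function names and toy data are hypothetical.

```python
from math import sqrt

def rms(x):
    """Root mean square of a time series."""
    return sqrt(sum(v * v for v in x) / len(x))

def kurtosis(x):
    """Fourth standardized moment (assumed definition, not excess kurtosis)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

def within_tolerance(value, reference, eps=0.1):
    """True if value lies between (1 - eps) and (1 + eps) times reference."""
    lo, hi = sorted(((1 - eps) * reference, (1 + eps) * reference))
    return lo <= value <= hi

def satisfies_constraints(edited, original, eps=0.1):
    """Check both the rms and the kurtosis acceptance intervals."""
    return (within_tolerance(rms(edited), rms(original), eps)
            and within_tolerance(kurtosis(edited), kurtosis(original), eps))

original = [0, 0, 0, 0, 5, -5, 0, 0]  # toy history, not real strain data
edited = [5, -5]                      # keeps only the large-amplitude part
print(satisfies_constraints(edited, original))  # False: the rms doubles
```

The toy example shows why keeping only high damage segments can violate the constraints: removing the quiet portions raises the rms of what remains.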

Genetic Algorithm for the FSCo
Having an appropriate fitness function for the problem domain is very important in the GA formulation to avoid distortion in the fitness evaluation of a chromosome, i.e. a solution candidate. For the FSCo problem, Eq. 1 cannot be used directly as a fitness function because it cannot distinguish whether the participation of S causes the edited history to fit the constraints or not. The easiest way to overcome this weakness is to attach a kind of flag to Eq. 1. We therefore propose an FSCo fitness function that combines Eq. 1 with a penalty function p, where p returns |H| if the chromosome violates any of the constraints and 0 otherwise. This forces irrelevant combinations far away from the neighbourhood of the optimal point. The following subsections briefly explain the central GA parameters, such as chromosome representation and the crossover and mutation operators.
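A penalized fitness of this kind might look like the sketch below. It assumes that the penalty is added to the subset size and that the penalty value is the length of the full history; the exact form of the fitness function is not reproduced here, and the names `fitness`, `history_length` and `is_feasible` as well as the toy feasibility rule are hypothetical.

```python
def fitness(chromosome, history_length, is_feasible):
    """Subset size plus a penalty when the constraints are violated.

    chromosome     -- list of 0/1 genes, one per low damage segment
    history_length -- penalty value (assumed: number of points in H)
    is_feasible    -- callable returning True if the constraints hold
    """
    size = sum(chromosome)  # number of low damage segments re-imported
    penalty = 0 if is_feasible(chromosome) else history_length
    return size + penalty

# Toy feasibility rule standing in for the rms and kurtosis checks.
def at_least_two(chromosome):
    return sum(chromosome) >= 2

print(fitness([1, 1, 0, 0], 30000, at_least_two))  # feasible   -> 2
print(fitness([1, 0, 0, 0], 30000, at_least_two))  # infeasible -> 30001
```

Since the penalty exceeds any possible subset size, every feasible chromosome ranks ahead of every infeasible one, while feasible chromosomes are still ordered by how few segments they re-import.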

Chromosome Encoding
The optimal subset $S^*$ separates the collection of low fatigue damage segments into two groups: 1) those chosen to participate in the edited history and 2) those permanently removed. Thus, a standard binary-coded chromosome is suitable for encoding a solution candidate. Here, all chromosomes have a fixed number of N genes, where N is the number of low damage segments resulting from the labelling task. A gene with the binary value 0 indicates that the segment does not appear in the corresponding S; if the value is 1, it does appear. Consider N = 10: a chromosome with the bit string 0110110000 indicates that only the 2nd, 3rd, 5th and 6th segments are included in S.
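The encoding can be illustrated with a short decoding sketch; the function name `decode` is a hypothetical helper, and segment positions are counted from 1 as in the example above.

```python
def decode(bits):
    """Return the 1-based positions of the low damage segments selected by S."""
    return [position for position, gene in enumerate(bits, start=1) if gene == "1"]

print(decode("0110110000"))  # [2, 3, 5, 6]
```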

Genetic Operators
The settings of GA parameters used in the FSCo optimization problem are given in Table 1.

Generation size: Fixed. Set to 1,000.

Experimental Design
Two automotive variable amplitude loading histories used in the literature (Nopiah, Baharin, et al., 2010), labelled S1 and S2, were considered in the performance evaluation of the GA-based fatigue segment combination. For each history, the following activities were applied to generate the inputs to the proposed system.
i. For scaling purposes, the observation points of the history were initially normalized to a range of -999 to 999. The corresponding rms and kurtosis values set the constraints.
ii. The history was then split into $M = |H|/s$ non-overlapping segments, where s is the sampling rate.
iii. Next, the Palmgren-Miner linear damage rule (Baek, Cho, & Joo, 2008) was used to calculate the damage value of each segment, while the strain-life Morrow model (Downing, 2004) was used for fatigue life estimation. For labelling purposes, a fatigue segment is labelled as low damage (high cycle) if its damage value is less than $10^{-3}$; otherwise it is labelled as high damage (low cycle) (Abdullah et al., 2010).
iv. Low damage segments were pulled out of the history and grouped into the set L, while the remaining segments were joined together as the initial edited history.
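The four preparation steps above can be sketched as follows. Several parts are assumptions made for illustration: the normalization is taken to be a linear min-max rescaling, the damage calculation is passed in as a user-supplied function `damage_of` (the Palmgren-Miner and Morrow computations are not implemented here), and the toy damage rule in the usage lines is a stand-in with no physical meaning.

```python
def normalize(history, lo=-999.0, hi=999.0):
    """Step i: rescale the history linearly to the range [lo, hi]."""
    h_min, h_max = min(history), max(history)
    scale = (hi - lo) / (h_max - h_min)
    return [lo + (v - h_min) * scale for v in history]

def split_segments(history, s):
    """Step ii: cut the history into non-overlapping segments of s points.

    Trailing points that do not fill a whole segment are dropped.
    """
    m = len(history) // s
    return [history[i * s:(i + 1) * s] for i in range(m)]

def label_segments(segments, damage_of, threshold=1e-3):
    """Step iii: return the indices of low and high damage segments."""
    low, high = [], []
    for index, segment in enumerate(segments):
        (low if damage_of(segment) < threshold else high).append(index)
    return low, high

def initial_edited_history(segments, high):
    """Step iv: join the high damage segments in their original order."""
    return [v for index in high for v in segments[index]]

# Toy usage with a hypothetical damage rule based on the segment range.
segments = [[0, 1], [0, 50], [2, 3]]
toy_damage = lambda segment: (max(segment) - min(segment)) * 1e-4
low, high = label_segments(segments, toy_damage)
print(low, high)                               # [0, 2] [1]
print(initial_edited_history(segments, high))  # [0, 50]
```

The indices in `low` define the candidate set L whose members the GA may return to the edited history.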
After the FSCo, the edited history suggested by the GA and the full-length history were cycle-counted using the rainflow counting algorithm. Comparison of the distributions of cycle properties, e.g. mean and amplitude, measures the performance of our method. Since the GA is stochastic, the experiment was repeated 10 times, each run starting from a randomly generated population. The best result for each history is shown in the next section.

Results and Discussion
Exactly sixty 500-point non-overlapping segments were located in each of S1 and S2. The corresponding damage level of those segments is shown graphically in Figure 2. The plots show an imbalance in class distribution, where the number of low damage segments is about 2-4 times that of the high damage segments. This finding corresponds to the nature of the loading history, in which the majority of cycles are small amplitude cycles (Stephens et al., 1997). Based on these figures, the chromosome length is set to 47 and 50 for S1 and S2, respectively. This provides an extremely large space for the GA to search for $S^*$. It can be estimated that there would be $2^{47} - 1$ pos area, the so

Table 1. Configuration of GA parameters for the FSCo problem