Hybrid Simulated Annealing with Meta-Heuristic Methods to Solve UCT Problem



Introduction
Simulated Annealing (SA) is a stochastic optimization technique inspired by the Metropolis algorithm from statistical mechanics (Metropolis et al., 1953; Van Laarhoven & Aarts, 1987). Annealing is the cooling process of materials in a heat bath, or of molten crystalline solids. The target of this cooling process is to align the atoms in the most regular possible crystalline structure. Kirkpatrick et al. (1983) showed that the solid annealing model proposed by Metropolis et al. (1953) could be used for optimization problems, where the objective function to be minimized or maximized corresponds to the energy state of the metal. SA starts with an initial solution (Sol) generated randomly or using constructive techniques. At each iteration, SA generates a neighbor solution Sol' of Sol. Sol' is accepted if it does not worsen the cost value f(Sol) of Sol (i.e., $\Delta = f(Sol') - f(Sol) \le 0$); otherwise, Sol' is accepted if $p = e^{-\Delta/T}$ is greater than a random number generated between 0 and 1, where T is the current temperature. At the beginning of the search, T is high, and it is decreased during the search according to a cooling schedule. SA stops when T is close to zero (the frozen temperature), where no more moves are accepted.
As with many metaheuristic techniques, SA has the drawback of revisiting the same solutions (cycling) and becoming trapped in a local optimum. This lengthens the time needed to find a reasonably good solution and costs extra computation without any improvement. Using memory in SA is therefore an effective way to overcome the problem of cycling, as proposed in (Méndez-Díaz, Zabala, & Miranda-Bront, 2016). Moreover, reheating the temperature can help the search escape from a local optimum when the temperature becomes very cold. This paper also presents an enhanced exponential Monte Carlo algorithm with a non-improvement counter. The basic Monte Carlo (MC) method is defined by Woller (1996) as: "Monte Carlo (MC) methods are stochastic techniques--meaning they are based on the use of random numbers and probability statistics to investigate problems".
Monte Carlo (MC) is the simplest approximate algorithm. Ayob & Kendall (2003) introduced three probabilistic approaches for accepting worse solutions or non-improving moves:
1) Linear Monte Carlo (LMC) uses a linear probability of accepting a worse solution.
2) Exponential Monte Carlo (EMC) behaves like LMC but uses an exponential probability.
3) Exponential Monte Carlo with counter (EMCQ) extends the EMC acceptance criterion so that a worse solution is accepted depending on the solution quality and on a counter of consecutive non-improving iterations, which makes the acceptance adaptive.
Abdullah et al. (2005) presented a modified variable neighborhood search approach with the EMC acceptance criterion for the course timetabling problem (called VNS-EMC). The experimental results showed that VNS-EMC is better than the standard VNS and comparable with other approaches in the literature. Saber et al. (2009) hybridized EMCQ with a Tabu list. The main contribution of this hybrid approach is to record the moves made during the search, keeping each accepted move in a Tabu list for a certain number of iterations in order to avoid cyclic moves. Therefore, this work presents a hybrid solution using SA with EMC-counter and Tabu list memory to address the limitation of SA becoming trapped in a local optimum.
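The SA acceptance rule described in this introduction can be illustrated with a minimal Python sketch. It assumes a minimization problem; the function name `accept` and its parameters are illustrative and are not taken from the original work.

```python
import math
import random


def accept(delta, temperature, rng=random):
    """Metropolis-style acceptance test for a minimisation problem.

    delta is f(Sol') - f(Sol); temperature is the current T (must be > 0).
    """
    if delta <= 0:
        # Improving or equal-cost moves are always accepted.
        return True
    # A worsening move is accepted when p = e^(-delta / T) exceeds
    # a random number drawn uniformly from [0, 1).
    return math.exp(-delta / temperature) > rng.random()
```

At a high temperature almost every worsening move passes this test, while at a temperature close to zero almost none do, which is the behaviour that makes the late stage of the search prone to trapping.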

SA Components
Generally, applying the SA algorithm effectively requires setting a number of common components: (i) initial temperature, (ii) cooling schedule, and (iii) neighborhood structure.

Initial Temperature
A high initial temperature leads SA to walk randomly over the landscape. Thus, a very high or infinite initial temperature would be the best choice (Triki et al., 2005). On the other hand, a very high or infinite initial temperature makes SA take longer to cool down (Triki et al., 2005). Poupaert & Deville (2000) instead estimated the temperature during the search process and used the current acceptance probability as the control parameter.
The initial temperature is a fixed value selected according to the acceptance probability criterion during the search process.
Table 1. Summary of the presented initial temperature from the literature

Author | Mechanism | Disadvantages
Tarawneh et al. (2013a) | Dynamic mechanism to initialize the initial temperature according to some solutions for each instance. Given the feasible initial solution, the SA runs several iterations and calculates the average deviation. The mechanism then decides the moderate initial temperature according to the SA acceptance criterion ratio. | The moderate temperature leads the solution to a local optimum and minimizes the number of worse solutions accepted.
Zhang et al. (2010) | Initialize the initial temperature at the first stage of the algorithm. | The temperature is still high, especially after the first part of the process, which consumes more computational time.
Aycan & Ayav (2009) | Select the initial temperature before the algorithm starts by comparing the bad transition ratio for the first current solutions with a given value. | The temperature is still high after the first part of the search process.
Poupaert & Deville (2000) | Estimate the temperature during the search process. | The temperature is no longer the control parameter; the acceptance probability is.
Kirkpatrick et al. (1983) | Start with a high initial temperature value, and then estimate it by comparing the ratio of accepted transitions over several bad transitions with a given value. | The temperature is no longer a control parameter, and the SA could become trapped in a local optimum very fast.

According to the above literature, many researchers select the initial temperature carefully at the start of the search in order to avoid consuming more computational time and to avoid becoming trapped quickly in local optima. Poupaert & Deville (2000) initialized the temperature based on the initial acceptance ratio χ0, as described in equation 1.
In equation (1), χ0 is the starting acceptance probability (between 60% and 80%); T0 is the starting temperature; δi = f(si) - f(s0), where s0 is the initial solution, si is a new neighbour of s0, and f(si) is the objective function value of si; and m is the size of the neighbour solution space. They repeat the above procedure until the acceptance ratio exceeds χ0 (i.e., χ0 > 0.8).
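One plausible way to turn this description into a procedure is sketched below in Python. It rests on an assumption: the acceptance ratio is estimated as the mean of exp(-δi/T0) over m sampled neighbours, and T0 is raised until that estimate reaches χ0. This estimator, the doubling factor, and the names `acceptance_ratio` and `initial_temperature` are illustrative and are not confirmed as the exact form used by Poupaert & Deville (2000).

```python
import math


def acceptance_ratio(deltas, temperature):
    """Mean acceptance probability of the sampled moves at this temperature.

    deltas holds delta_i = f(s_i) - f(s_0) for m sampled neighbours.
    """
    probs = [1.0 if d <= 0 else math.exp(-d / temperature) for d in deltas]
    return sum(probs) / len(probs)


def initial_temperature(deltas, chi0=0.8, t_start=1.0, factor=2.0, max_steps=100):
    """Raise the temperature until the estimated acceptance ratio reaches chi0.

    The multiplicative update (factor) is an assumption made for this sketch.
    """
    t = t_start
    for _ in range(max_steps):
        if acceptance_ratio(deltas, t) >= chi0:
            break
        t *= factor
    return t
```

For example, `initial_temperature([1.0, 2.0, 5.0, 10.0], chi0=0.8)` keeps doubling the temperature from 1.0 until at least 80% of the sampled worsening moves would be accepted on average.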

Cooling Schedule
The cooling schedule is the process of decreasing the temperature value during the SA search. Romeo and Sangiovanni-Vincentelli (1991) presented a procedure for designing the cooling schedule in the following steps:
1) Start with an initial temperature T0 that leads to a satisfactory approximation of the steady distribution D_{T0}.
2) Reduce T0 by a small amount α(T) such that D_{T0} is a good starting point for approximating D_{T0 - α(T)}.
3) Repeat the above process until there is no more improvement.
Cooling schedules fall into two categories:
1) A static schedule is specified at the beginning (before the algorithm starts). An example is the geometric cooling schedule (GCS) proposed by Kirkpatrick et al. (1983), which is the most popular cooling schedule because of its simplicity. In GCS, the temperature is reduced as in equation (2):
$t_{i+1} = \alpha \, t_i$ (2)
where $t_{i+1}$ is the temperature after iteration i+1 and α (0 < α < 1) is the cooling rate or factor.
2) An adaptive schedule adjusts the temperature decrement during the run with reference to the information obtained (e.g., it uses the objective values at each level to decide the decrement amount). Namely, the amount of temperature decrement in the next iteration is based on the run history. The purpose of this adaptive cooling schedule is to keep the search solutions close to each other.
In equation (3), β is the parameter that helps to determine the value of λ in each step; λ influences the amount of concavity or convexity present in the cooling schedule; and M is the number of decreasing steps that take the temperature from T0 to a value Tmin close to zero. Lewis et al. (2007) used this cooling schedule to obtain the initial feasible solution and set M = 100 and β = -0.99.
As summarized in Table 2, the geometric cooling schedule $T_{i+1} = \alpha \, T_i$, where α is the cooling rate between zero and one and $T_i$ is the current temperature value, is the most popular cooling schedule because it is simple to apply; however, it is a static cooling schedule that must be specified before the search starts and needs to be carefully tuned.
Based on the literature reviewed above, we conclude that the adaptive cooling schedule has more strengths than the static cooling schedule. We therefore apply the adaptive cooling schedule proposed by Lewis et al. (2007).
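The two kinds of schedule can be compared with a short Python sketch. The geometric schedule follows the rule $t_{i+1} = \alpha \, t_i$ stated in the text. The second function is an assumption: it uses the recurrence λ0 = 1 - β, λ_{i+1} = λ_i + 2β/M, T_{i+1} = T_i - λ_{i+1} T0/M, which matches the roles of β, λ, and M described above but should be checked against Lewis et al. (2007) before use. The function names are illustrative.

```python
def geometric_schedule(t0, alpha, steps):
    """Static geometric cooling: t_{i+1} = alpha * t_i, with 0 < alpha < 1."""
    temps = [t0]
    for _ in range(steps):
        temps.append(alpha * temps[-1])
    return temps


def lewis_style_schedule(t0, m=100, beta=-0.99):
    """Assumed form of the schedule attributed to Lewis et al. (2007).

    lambda_0 = 1 - beta
    lambda_{i+1} = lambda_i + 2 * beta / m
    T_{i+1} = T_i - lambda_{i+1} * T_0 / m
    """
    temps = [t0]
    lam = 1.0 - beta
    for _ in range(m):
        lam += 2.0 * beta / m
        temps.append(temps[-1] - lam * t0 / m)
    return temps
```

Under this assumed recurrence with M = 100 and β = -0.99, the temperature drops quickly at first and more slowly later, ending slightly above zero after M steps rather than reaching it exactly.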

Introduction to Timetable
Wren (1995) considered timetabling a special case of scheduling problems and described it as follows: "The allocation, subject to constraints, of given resources to objects being placed in space-time, in such a way as to satisfy as nearly as possible a set of desirable objectives".
Another definition of timetabling was presented by Burke et al. (2004) as follows: "a timetabling problem is a problem with four parameters: T, a finite set of times; R, a finite set of resources; M, a finite set of meetings; and C, a finite set of constraints. The problem is to assign time and resources to the meetings so as to satisfy the constraints as far as possible".
Building the academic timetable is a typical real-world scheduling problem and has become a very challenging task in every academic institution each year. In the educational field, there are three different timetabling problems: school timetabling, university course timetabling, and university examination timetabling. The three categories share the same basic characteristics but still have some differences. For example, class sizes (student enrolment) for school courses are usually very similar and the same group of students is associated with a set of courses, whilst university courses differ in class size.

University Course Timetabling Problem
"The university course timetabling problems (UCT) is the task of allocating courses and lecturers to the rooms and timeslots, in such a way as to satisfy the hard constraints and minimize the soft constraints" (McCollum & Ireland, 2006).
A timetable should follow two common kinds of rules:
1) Hard constraints, which must be satisfied, e.g., two or more courses must not be scheduled in the same room at the same time period.
2) Soft constraints, which should be satisfied if possible, e.g., female lecturers are preferably assigned to morning timeslots.
Generally, timetable quality is measured by the degree to which the soft constraints are satisfied, whilst the hard constraints must be completely fulfilled.
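The distinction between the two constraint types can be made concrete with a small Python sketch. The data layout (a dictionary mapping each course to a (room, period) pair), the single room-clash hard constraint, and the period-preference soft constraint are all simplifying assumptions chosen for illustration; real UCT instances use a richer constraint set.

```python
from collections import Counter


def hard_violations(timetable):
    """Count extra courses placed in an already occupied (room, period) slot."""
    usage = Counter(timetable.values())
    return sum(n - 1 for n in usage.values() if n > 1)


def soft_penalty(timetable, preferred_periods, weight=1):
    """Add `weight` for every course scheduled outside its preferred periods.

    preferred_periods maps a course to a set of acceptable periods;
    courses without an entry have no preference.
    """
    penalty = 0
    for course, (_room, period) in timetable.items():
        allowed = preferred_periods.get(course)
        if allowed is not None and period not in allowed:
            penalty += weight
    return penalty
```

A timetable is feasible when `hard_violations` returns 0, and among feasible timetables a smaller `soft_penalty` indicates better quality.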

Constructive Heuristic for University Course Timetabling Problem
To generate the initial solution, the lectures are placed into periods and rooms without any hard constraint violations. Generally, the heuristic begins with an empty timetable and a list of unscheduled events U. The procedure takes the events one by one from U and places them into feasible places in the timetable. The procedure continues until U = ∅. Burke et al. (2007) presented common conflict-free constructive timetabling heuristics:
1) Largest Degree First (LDF): schedule the courses with the greatest number of conflicting courses first.
2) Largest Colour Degree First (LCDF): schedule first the courses with the largest number of conflicting courses that have already been scheduled; this is used in order to break ties.
3) Least Saturation Degree First (LSDF): schedule first the course with the fewest valid periods currently available.
Arntzen & Løkketangen (2005) designed a mechanism called sequential assignment of events to places, as follows:
1) Initialize a list L that contains all unassigned events.
2) Select the event E with the fewest possible places from L. If no unique event has the fewest possible places, select E randomly among the events with the fewest possible places.
3) Find a place P for E.
4) Insert E into the chosen place P. Then update the information on possible places and fitting events, and remove E from L. Return to step 2 if L is not empty; otherwise, a feasible solution has been generated.
5) Finally, if the above process fails to generate a feasible solution because some event remains unassigned, restart the process with a new random seed.
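The sequential assignment steps above can be sketched in Python as follows. Several details are assumptions made for illustration: each place holds at most one event, the place P is chosen at random among the possible places, and hard-constraint feasibility is delegated to a caller-supplied function `feasible(event, place, assignment)`. None of these names come from Arntzen & Løkketangen (2005).

```python
import random


def sequential_assignment(events, places, feasible, seed=0, max_restarts=50):
    """Place events one at a time, most constrained event first.

    Returns a dict event -> place, or None if every restart fails.
    """
    for attempt in range(max_restarts):
        rng = random.Random(seed + attempt)  # new random seed on each restart
        assignment = {}
        unassigned = list(events)
        while unassigned:
            # Possible places for each unassigned event right now.
            options = {
                e: [p for p in places
                    if p not in assignment.values() and feasible(e, p, assignment)]
                for e in unassigned
            }
            fewest = min(len(opts) for opts in options.values())
            if fewest == 0:
                break  # some event cannot be placed: restart
            # Break ties randomly among events with the fewest possible places.
            tied = [e for e in unassigned if len(options[e]) == fewest]
            event = rng.choice(tied)
            assignment[event] = rng.choice(options[event])
            unassigned.remove(event)
        if not unassigned:
            return assignment
    return None
```

Recomputing the possible places after every insertion corresponds to the update in step 4, and it is what allows the most constrained event to be identified at each step.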
The next section discusses our proposed method for optimizing the University Course Timetabling (UCT) problem using simulated annealing and Tabu search, with a temperature reheating function to divert the solution when the algorithm is trapped in a local optimum.

Problem Statement and Discussion
Based on the literature reviewed above, we found that each algorithm has strengths and weaknesses in terms of search ability and solution quality. Tabu Search (TS) performs effectively when the neighborhood structure is small and the landscape is flat, by escaping from local optima.
For the SA algorithm, many researchers have tried to overcome its weaknesses, such as the long computational time needed to reach good-quality solutions and the tendency to become trapped in a local optimum at the end of the search. For example, using the Variable Neighborhood Search (VNS) technique with SA enhances SA performance by improving search ability and solution quality, while a hybrid of SA with TS avoids revisiting solutions and guides the diversification part of SA to explore more of the search space. The main aim is to capitalize on the strengths of both SA and TS.
Based on the recommendations from the literature, we were motivated to enhance SA performance with the following challenges in mind:
1) How to avoid becoming trapped in a local optimum when the temperature becomes very cold?
2) How to escape from a local optimum once the search is trapped, and divert the solution to another promising region?

Proposed Algorithm
Figures 1 and 2 show our proposed algorithm. The algorithm begins with a given initial solution (Sol) and generates n neighbours from the neighbourhood structure. The best neighbour (Sol1) is selected based on solution quality. The algorithm checks the move against the Tabu list; Sol1 is then accepted if its quality is better than or equal to that of the current solution Sol (i.e., its objective value is less than or equal).
Otherwise (i.e., the quality of Sol1 is worse than that of Sol), the acceptance criterion $p(X) = e^{-\Delta f / T_i}$ or $p(X) = e^{-\Delta f / \tanh(\text{non-improving iterations})}$ is applied. If p(X) is greater than a random number generated between 0 and 1, Sol1 is accepted.
Given an initial solution Sol;
Set the total iteration number Iter;
Set the non-improvement iterations Non_improve;
Set the Tabu list memory µ;
Set the size of the non-improvement iteration ω;
Do while (termination criteria not met)
    Generate k neighbors from N neighborhood structures;
    Update µ; // update the Tabu list
    Assign the best neighbor to Sol1;
    Set the initial temperature T0 using Eq. 2;
    Sol* ← Sol1; // update the best solution so far
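A compact Python sketch of how these pieces might fit together is given below. It is an illustration under stated assumptions rather than the authors' implementation: the Tabu list stores recently accepted solutions, geometric cooling stands in for the adaptive schedule, and the tanh-based criterion replaces the temperature-based one once the non-improvement counter reaches a threshold (`stall_limit`). The source does not specify when each criterion applies, so this switching rule, the function name, and all parameter defaults are hypothetical.

```python
import math
import random
from collections import deque


def sa_tabu_sketch(initial, cost, neighbours, t0=10.0, alpha=0.95,
                   tabu_size=10, k=5, max_iters=2000, stall_limit=50, seed=0):
    """Hybrid SA with a Tabu list and a non-improvement counter (sketch)."""
    rng = random.Random(seed)
    current, best = initial, initial
    temperature = t0
    tabu = deque(maxlen=tabu_size)  # recently accepted solutions
    non_improving = 0
    for _ in range(max_iters):
        # Generate k neighbours and drop those held in the Tabu list.
        candidates = [s for s in neighbours(current, k, rng) if s not in tabu]
        if not candidates:
            continue
        cand = min(candidates, key=cost)  # best neighbour, Sol1
        delta = cost(cand) - cost(current)
        if delta <= 0:
            accepted = True
        else:
            # Assumption: use T while the search progresses, and switch to
            # tanh(counter) after a long stall so worse moves pass again.
            stalled = non_improving >= stall_limit
            scale = math.tanh(non_improving) if stalled else temperature
            accepted = math.exp(-delta / scale) > rng.random()
        if accepted:
            tabu.append(cand)
            current = cand
        if cost(current) < cost(best):
            best = current  # update the best solution so far
            non_improving = 0
        else:
            non_improving += 1
        # Geometric cooling as a stand-in; the floor avoids division by zero.
        temperature = max(alpha * temperature, 1e-6)
    return best
```

Because tanh of a large counter is close to 1, the second criterion behaves like a fixed moderate temperature, which gives worsening moves a realistic chance of acceptance after the ordinary temperature has become very cold.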

Experiments and Results
This section presents our experimental results, which test the performance of our SA-EMCQ in comparison to the other approaches described in the literature review. The initial parameters were set in our work to:

Comparing the SA-EMCQ and SA on ITC-2007 Track3 Dataset
Table 3 compares SA-EMCQ with the standard SA under the ITC-2007 Track 3 timeout condition of 460 seconds on our machine, in order to analyze the search capability of the proposed method against SA. For this comparison, we used the standard SA as described in (Van Laarhoven & Aarts, 1987).

Mm is the average solution quality; b is the best solution.
Table 5 shows that the hybrid SA-EMCQ outperformed some of the approaches tested on ITC-2007 Track 3 under the same timeout condition. In addition, SA-EMCQ outperformed the threshold accepting local search presented by Geiger (2008) and the repair-based local search presented by Clark et al. (2008) on all instances. Moreover, the SA-EMCQ approach gave better results than the TS in (Lau & Zhao, 2008) and the hybrid approach in Müller (2009) on some instances (i.e., comp 1, 2, 5, 9, 12, 15, 16, 20 and 21) and comparable results on the other instances. Furthermore, our proposed algorithm gave the best-known results in comparison to all techniques for instances Comp1 and Comp11. According to the average results in Table 5, SA-EMCQ ranked second among the compared approaches.

Conclusion and Future Work
This paper reviewed several heuristic and meta-heuristic algorithms that have been implemented in the literature to solve combinatorial optimization problems, especially the university course timetabling problem.
We focused mainly on the SA algorithm and on identifying its strengths and weaknesses, such as consuming more computational time and becoming trapped in a local optimum, especially at the end of the search process when the temperature is close to zero. This review motivated us to improve SA features and components through a hybrid approach that combines SA with the EMCQ algorithm.
Test results showed that our proposed algorithm gave the best-known results in comparison to all techniques for some instances, and the average results of SA-EMCQ ranked second among the compared approaches. Furthermore, we conclude that our proposed algorithm is generally able to produce good solutions that are comparable with the best-known results and with other approaches in the literature. As future work, the proposed algorithm can be modified using other metaheuristic algorithms such as the genetic algorithm and practical fish swarm; moreover, this hybrid algorithm will be applied to a real-world case study.

Table 2. Summary of the presented cooling schedules and investigations from the literature
That is, it is measured by penalizing each violation of the soft constraints (penalty cost). Small penalty values indicate a good-quality solution. This evaluation function is called the objective function or penalty cost function.
"The problem of university course
solution Sol1 is accepted
// Adaptive cooling schedule (see Eq. 3)
End while;
Output the final solution;

Table 3 shows that SA-EMCQ outperformed the standard SA across all instances on all measures (best, worst, mean, and standard deviation). Table 4 presents the t-test and p-value of SA-EMCQ against the standard SA. The results were tested using the Wilcoxon signed-rank test in order to support the alternative hypothesis stated above.

Table 4 shows clearly that there are statistically significant differences between the performance of SA-EMCQ and that of the standard SA, with SA-EMCQ improving the solution quality in almost every case. Note that the significance level in this test is 0.05: if the p-value is less than 0.05, the result is statistically significant, which was achieved in our work for the majority of the tested instances. In addition, the significance values in Table 4 support the alternative hypothesis that SA-EMCQ enhances SA performance (the solution quality), which means that the alternative hypothesis is accepted.

Table 5. Average and best results of SA-EMCQ in comparison with the top five competitors and best-known