Optimal Workload Allocation in a Network of Computers with Single Class Job

Queueing models with multiple queues and multiple servers are used to model workload allocation problems in a network of computers. This paper presents the problem of determining the optimal allocation of single class jobs to a set of parallel computers using an optimization technique. The generalized exponential (GE) distribution is used to represent general inter-arrival and service time distributions, since different jobs have different traffic characteristics. A closed-form expression for the optimal job arrival rates is derived from a non-linear optimization problem based on a queueing theoretic objective function. An analysis of the recomputed arrival rates shows an improvement in the performance measures.


Introduction
In a distributed computer system, tasks generated by a user or a group of users can be allocated over a number of available computers. This situation contrasts with a system in which a single computer provides its capacity to all users, or with systems in which each user is provided with a local processor, usually of very limited capacity. An operational aspect of such a distributed system is the availability of a workload balancing policy. Such a policy balances the workload over the available computers, aiming to optimize performance measures for the system. Most traffic allocation problems in the literature have been tackled by assuming that all jobs are identical, or single class (Chow & Kohler, 1979; Ni & Hwang, 1981; Tantawi & Towsley, 1985; Ross & Yao, 1991; Chombe & Boxma, 1995; Tavana & Rappaport, 1997; Wolf & Yu, 2001). This assumption is normally made when the diversity of jobs is not of importance.
In this paper, we focus on quantitative measures of workload allocation to a network of computers. We show that, by using quantitative modeling, arrivals to a network of computers can be reallocated to obtain the optimal performance measure. We address the issue of job allocation in a network of computers where different computers have different job processing times. The optimization criterion studied here is to minimize the expected job response time in the systems to which jobs are allocated. Jobs arrive at a scheduler that allocates them to the computers according to arrival rates computed using the Lagrange multiplier theorem. The paper is organized as follows: the system model is described and the proposed GE optimization model of workload allocation is presented in sections 2.0 and 2.1, followed by the computational results in section 2.2.

Optimal Workload Allocation in a Network of Computers with Generalized Exponential (GE) Arrival and Service Time Distributions
In this section, we consider workload allocation problems under a static allocation protocol for the model of a single GE stream of jobs offered to a fixed number of computers. The allocation protocol is static in the sense that only the total incoming traffic and information about basic characteristics, such as arrival rates and service times, are used. The objective of the network of computer systems studied here is to maximize user-perceived performance, which is a function of the amount of time the user spends waiting for a file to download from a server. In this context, download refers to the actions from the time the user requests a file from a computer to the time the file or an error message is delivered to the user's terminal. The shorter the download time, the higher the user's perceived performance.
Most previous studies on workload allocation used the exponential distribution for inter-arrival and service times, because network traffic has long been assumed to have exponential behavior. However, this assumption does not always hold, since the number of network users is unpredictable. Furthermore, the mean queue length derived from the exponential distribution does not factor in variation in the inter-arrival and service times. Clearly, the more information is available for making a decision, the better the workload allocation can be. Our proposed models therefore include variation parameters for the inter-arrival and service times, which we found lacking in previous models. In this model, requests arrive at the system according to a GE process. They are numbered in the order in which they arrive at the system. Once a request has entered the system, it does not leave until it completes service. The metric of interest is the mean response time: the time a user spends in the system, which corresponds to the download time up to the server's completion of the user's request. Here, the transit time required to send the result of the request back to the user's interface is ignored, and the response is assumed to be delivered instantaneously upon the file's departure from the server or computer.
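To illustrate how a GE process carries a variation parameter that a Poisson process lacks, the following minimal sketch samples GE inter-arrival times. It assumes the form of the GE distribution commonly used in the queueing literature, F(t) = 1 - τ exp(-τνt) with τ = 2/(C² + 1), where ν is the mean rate and C² ≥ 1 is the squared coefficient of variation. This form and the function name `sample_ge` are assumptions of the sketch, not definitions taken from this paper.

```python
import random


def sample_ge(rate, scv, rng):
    """Draw one time from the assumed GE form with mean 1/rate and
    squared coefficient of variation scv (requires scv >= 1)."""
    if rate <= 0 or scv < 1:
        raise ValueError("need rate > 0 and scv >= 1")
    tau = 2.0 / (scv + 1.0)
    # With probability 1 - tau the time is zero (jobs arriving together);
    # otherwise it is exponential with rate tau * rate.
    if rng.random() >= tau:
        return 0.0
    return rng.expovariate(tau * rate)


rng = random.Random(1)
samples = [sample_ge(2.0, 4.0, rng) for _ in range(200_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
# The sample mean and sample SCV should be near 1/rate = 0.5 and scv = 4.0.
print(mean, var / mean ** 2)
```

With scv = 1 the zero branch never fires and the sampler reduces to an ordinary exponential, which is the case assumed by the earlier studies mentioned above.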

Criterion of Optimality
For a given total network traffic φ, find the optimal traffic workload λ_i, i = 1, 2, …, N, so that the expected response time incurred on any system is minimized.

Mathematical Model Description
We first present a mathematical description of the related workload allocation problems. Jobs arrive at a routing point according to a GE process with rate φ. At the instant of arrival, each job has to be assigned to one of N servers in parallel. The service times at server i, which has service rate µ_i, follow a GE distribution as well. All service times are independent. Any job that is not fully processed branches with a certain probability and returns to the scheduler for further processing. Otherwise the job is complete and exits the system. Let π denote an allocation policy and let p_i, i = 1, …, N, be the fraction of the jobs that is routed to computer i under policy π. In our workload allocation problem, the aim is to minimize the total waiting cost, where W_i(π) denotes the mean response time of a job assigned to computer i under allocation policy π and D_i is the cost associated with waiting one time unit at queue i. The objective function can have various interpretations, obtained by varying D_i. By Little's Law, this objective is equivalent to minimizing the mean number of jobs in the queues, so instead of W_i(π) we use L_i(π), the mean queue length of queue i. The objective is then to minimize a weighted sum of the mean queue lengths in the system. To obtain the assignment probabilities that minimize this function, a mathematical programming problem, referred to as problem P1, has to be solved: the weighted sum of the L_i is minimized over the arrival rates λ_i, which must add up to the total traffic φ. Each term L_i is a strictly convex function of λ_i, and it can also be verified that the problem has a feasible solution provided that the total arrival rate does not exceed the total service capacity.
Before analyzing the model, it is important to understand the meaning of the model parameters. Network traffic (λ_i) is the average number of file requests received by computer i each second. Service rate (µ_i) is the average rate at which the computer can serve requests. Obviously, the actual service time will vary widely from one request to another.
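The weighted objective and the use of Little's Law can be sketched in a few lines. The sketch assumes the mean queue length expression commonly quoted for a stable GE/GE/1 queue, L = (ρ/2)(1 + (Ca² + ρ Cs²)/(1 - ρ)) with ρ = λ/µ, which reduces to ρ/(1 - ρ) when Ca² = Cs² = 1. That expression, the function names, and the choice to ignore feedback of unfinished jobs are assumptions of the illustration and may differ from the formulation used in this paper.

```python
def ge_mean_queue_length(lam, mu, ca2, cs2):
    """Assumed GE/GE/1 mean number of jobs at a computer with arrival
    rate lam, service rate mu and squared coefficients of variation
    ca2 (inter-arrival) and cs2 (service)."""
    rho = lam / mu
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue is unstable: need 0 <= lam < mu")
    return 0.5 * rho * (1.0 + (ca2 + rho * cs2) / (1.0 - rho))


def mean_response_time(lam, mu, ca2, cs2):
    """Little's Law, L = lam * W, solved for W (requires lam > 0)."""
    return ge_mean_queue_length(lam, mu, ca2, cs2) / lam


def weighted_queue_length(lams, mus, ca2, cs2, costs):
    """Weighted sum of mean queue lengths, with costs playing the role of D_i."""
    return sum(d * ge_mean_queue_length(l, m, ca2, cs2)
               for l, m, d in zip(lams, mus, costs))


# Hypothetical example: two computers, traffic split in proportion to capacity.
mus = [2.0, 1.0]
phi = 0.9 * sum(mus)
lams = [phi * m / sum(mus) for m in mus]
print(weighted_queue_length(lams, mus, 1.0, 1.0, [1.0, 1.0]))
```

Setting all costs to 1 gives the plain total number of jobs in the system; other cost vectors weight some queues more heavily than others.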
Problem P1 allows an analytical solution. Using the Lagrange multiplier theorem, with δ denoting the Lagrange multiplier, we obtain the first-order Kuhn-Tucker conditions. From (2.6) we find the unique optimal values λ_i^*. The Lagrange multiplier δ is derived by solving the constraint equation. The computation was carried out in MathCAD version 7 Professional.
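The Lagrange multiplier approach can be illustrated numerically. The sketch below is not the closed-form result of this paper. It assumes the commonly quoted GE/GE/1 mean queue length L = (ρ/2)(1 + (Ca² + ρ Cs²)/(1 - ρ)), the same Ca² and Cs² at every computer, and no feedback of unfinished jobs. Under those assumptions the first-order condition D_i dL_i/dλ_i = δ can be inverted for λ_i, and δ is found by bisection so that the rates sum to φ. All function names are hypothetical.

```python
import math


def ge_mean_queue_length(lam, mu, ca2, cs2):
    """Assumed GE/GE/1 mean number of jobs at utilisation lam / mu."""
    rho = lam / mu
    return 0.5 * rho * (1.0 + (ca2 + rho * cs2) / (1.0 - rho))


def rate_for_multiplier(delta, mu, ca2, cs2, cost=1.0):
    """Arrival rate at which cost * dL/dlam equals delta.

    For the assumed L, dL/dlam = ((1 - cs2)/2 + (ca2 + cs2)/(2(1 - rho)^2)) / mu.
    Returns 0 when the marginal cost at zero load already reaches delta.
    """
    denom = 2.0 * delta * mu / cost - 1.0 + cs2
    if denom <= ca2 + cs2:
        return 0.0
    return mu * (1.0 - math.sqrt((ca2 + cs2) / denom))


def optimal_allocation(phi, mus, ca2, cs2, costs=None, iters=200):
    """Return (rates, delta) minimising the weighted sum of queue lengths."""
    if not 0.0 < phi < sum(mus):
        raise ValueError("need 0 < phi < total service capacity")
    if ca2 + cs2 <= 0.0:
        raise ValueError("need ca2 + cs2 > 0")
    costs = costs or [1.0] * len(mus)

    def total(delta):
        return sum(rate_for_multiplier(delta, m, ca2, cs2, d)
                   for m, d in zip(mus, costs))

    lo, hi = 0.0, 1.0
    while total(hi) < phi:          # total(delta) increases with delta
        hi *= 2.0
    for _ in range(iters):          # bisection on the multiplier
        mid = 0.5 * (lo + hi)
        if total(mid) < phi:
            lo = mid
        else:
            hi = mid
    rates = [rate_for_multiplier(hi, m, ca2, cs2, d)
             for m, d in zip(mus, costs)]
    return rates, hi


# Hypothetical example: mu_1 = 2, mu_2 = 1, overall utilisation 0.9.
mus = [2.0, 1.0]
phi = 0.9 * sum(mus)
rates, delta = optimal_allocation(phi, mus, ca2=1.0, cs2=1.0)
proportional = [phi * m / sum(mus) for m in mus]
opt = sum(ge_mean_queue_length(l, m, 1.0, 1.0) for l, m in zip(rates, mus))
prop = sum(ge_mean_queue_length(l, m, 1.0, 1.0) for l, m in zip(proportional, mus))
print(rates, opt, prop)
```

The example uses Ca² = Cs² = 1, for which the assumed formula reduces to the exponential case, so the printed improvement need not match the percentages reported in the computational results, which depend on the parameter values used there.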

Computational result
In this section, numerical results are presented to assess the credibility of the GE distribution used. Two configurations are shown. For the first configuration, the service rates of the tasks are assumed to be . The improvement in the performance measures is presented in Figures 2.1 and 2.3. To verify the results, we use simulation, and the comparative results are presented in Figures 2.2 and 2.4. Sample cases for N = 2, 3, 4, 5 and 6 computers are also computed, and the analysis shows that a larger range of service rates results in greater percentage improvements in the aggregate objectives. For example, a two-computer system with µ_1 = 2, µ_2 = 1 and ρ = 0.9 results in a 1.75 per cent improvement in mean queue length, compared with an 11.6 per cent improvement for a six-computer system with µ_i, i = 1, 2, …, N. The results of the analysis for such queueing systems are summarized in Figure 2.5.

Conclusions
The proposed solution mechanism focuses on the workload allocation of single class jobs through the use of optimal GE arrival rates in the workload allocation scheme. The key idea of optimizing the workload allocation scheme is to send a disproportionate fraction of the workload to the computers with known capacities. GE component processes are expected to be more regular than a Poisson process, in the sense of having variation parameters.
We have described models that consider the workload allocation decision for the single class job case. The problem of workload allocation in an open network of queues is formulated as a non-linear optimization problem. The problems of minimizing the system's mean queue length and mean response time, for a given total arrival rate and specified arrival and service variation, are found to have optimality conditions. These optimality conditions are used to prove that, for queueing networks with an unbalanced configuration of computer capacities, the optimal allocation of workload is unbalanced: a larger per-computer share of the workload goes to computers with larger capacity. The unbalanced allocation result is related to the efficiencies gained from pooling computers. We show that, holding utilization per computer constant, increasing the number of computers in the network reduces the average number in queue.

Table 2.2 provides numerical results after recomputation of the arrival rate and the improvement in the performance measures of mean queue length and waiting time for the stated parameters.
Figure 2.1 shows the improvement of the two-computer system's mean response time using the dataset in Table 2.1.

[Figure legend: Classical, Proposed]

Figure 2.3 shows the improvement of the two-computer system's mean response time using the dataset in Table 2.2.

Figure 2.4 shows the verification of the results with simulation using the stated parameters.

Figure 2.5 shows the results of the analysis for sample cases with N = 2, 3, 4, 5 and 6 computers. The analysis shows that a larger range of service rates results in greater percentage improvements in the aggregate objectives.

Table 2.1 Results for the classical and proposed approaches of a dual GE/GE/1 with Ca 1
Table 2.1 provides numerical results after recomputation of the arrival rate and the improvement in the performance measures of mean queue length and waiting time for the stated parameters.