SIPEW 2008: All About Benchmarking

Scheduling in Server Farms

Open vs. Closed Systems

In the following example, a web server farm is used to create an environment in which incoming tasks are distributed to systems. All the servers operate based on preemptive processor sharing, just as most of the multitasking operating systems work today. The main question now is how to distribute the workload to the servers in an effort to get the shortest response times, and hence the best performance. In this example, the solution deciding on the routing policy (see image below) needs to be aware of the workload to be able to make a decision on the scheduling technique (see below).

A Poisson process, as used in this example, is a “stochastic process in which events occur continuously and independently of one another. A well-known example is radioactive decay of atoms.” (Text source: Wikipedia). HTTP requests are classified as Poisson processes. Will there be performance differences according to various scheduling techniques?

Task Scheduling Policies for Server Farms

Harchol-Balter discusses four possible task scheduling techniques:

  • RandomIncoming tasks are assigned to a random system with no other policy behind.
  • Join-Shortest-Queue In this case, tasks are forwarded to the server that has the fewest jobs to process. This technique requires a prior analysis of the server system and usually also assumes a first-come, first-served policy.

  • Least-Work-LeftThis is similar to join-shortest-queue, but actually considers the task that has to be processed. The systems with the least actual workload will receive the next jobs first. Here, jobs are typically ordered for dispatching by workload.
  • Size Interval SplittingThis approach distributes workload evenly to all available clients, which creates a good workload balance across your servers, but increases response time.

The entire topic of scheduling has been discussed for more than 40 years, and it is obvious that there are various strategies that may or may not be suitable for individual requirements. The most important parameter certainly is the workload itself, and a detailed analysis of closed as opposed to open systems. The selection of proper policies and hardware comes in a second step.

TOPICS