About loadavgs

This document was written by Ben Clifford (benc@hawaga.org.uk) and is available on the web at http://www.hawaga.org.uk/ben/.

About Linux Processor loadavgs

Processor load averages are the numbers reported by the uptime command. Three loadavgs are returned. Each is the result of performing the same computation with a different half life.
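To read the same three values from a program rather than from uptime, something like the following Python sketch should work. It assumes a Unix-like system where the standard library call os.getloadavg() is available; the helper name describe_loadavgs is made up for this illustration.

```python
import os


def describe_loadavgs(loadavgs):
    """Format a (1 min, 5 min, 15 min) tuple of load averages for display."""
    one, five, fifteen = loadavgs
    return f"1 min: {one:.2f}, 5 min: {five:.2f}, 15 min: {fifteen:.2f}"


if __name__ == "__main__":
    # os.getloadavg() exists only on Unix-like systems, so check before calling it.
    if hasattr(os, "getloadavg"):
        print(describe_loadavgs(os.getloadavg()))
```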

On a normal user's Linux box, the load average is usually something pretty low, such as 0.03. This means that on average there are 0.03 processes ready to run at any one time.

The loadavg can be compared to a "percentage of CPU used" metric as found, for example, in the Windows NT Task Manager. However, whilst a CPU percentage measure can only go up to 100% (or 1.00 on the loadavg scale), the loadavg can go arbitrarily high. The reason for this is that the loadavg measures the average number of processes that are ready to run, rather than the average number that are actually running. Obviously, you can only have a maximum of one running process per processor at any given instant.
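The difference between the two metrics can be illustrated with a small Python sketch. The sample counts are invented, both function names are hypothetical, and a plain (unweighted) average is used here to keep the comparison simple; the real loadavg uses the exponential weighting described later in this document.

```python
def mean_ready(samples):
    """Average number of processes ready to run (the quantity a loadavg tracks)."""
    return sum(samples) / len(samples)


def cpu_fraction(samples, processors=1):
    """Average fraction of processor capacity in use; it can never exceed 1.0."""
    busy = [min(n, processors) / processors for n in samples]
    return sum(busy) / len(busy)


# Hypothetical counts of ready-to-run processes at six sampling instants.
samples = [0, 1, 3, 2, 0, 4]

print(mean_ready(samples))    # plain average of the counts
print(cpu_fraction(samples))  # capped at one running process per processor
```

On a single processor the second figure saturates at 1.0 however many processes are queued, while the first keeps growing with the queue.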

On an asynchronous server (one that is not interacting directly with users; for example, a mail server or upstream news server), it might be desirable to have the loadavg at 1.00 (call it perfectly loaded). This means that no processor capacity is wasted (or more specifically, no money has been wasted buying a fast processor that is not being used), but the system is not overloaded. It is possible to use loadavgs to determine if this is the case, whereas a simple "percentage of CPU used" metric cannot distinguish between an overloaded and a perfectly loaded server. (Actually, this may not be strictly true: processes can be ready to run even if they can't immediately use the CPU, I think.)

Calculating loadavgs

Loadavg values use an exponentially-weighted average, with increasingly smaller weights over a (theoretically) infinite period of time extending from the present into the past. More recent measurements have larger weight than previous readings.

The theoretical calculation of the load average is as follows:

We have the following values:

A (possibly infinite) series of readings labelled $x_{n}$ where n starts at 0 for the most recent reading and increases into the past. Readings before the start of the "universe" (in the case of a unix processor loadavg, before the machine was booted) should be set to 0.
A decay factor, d, satisfying $0 < d < 1$

Then, numbering the readings relative to time t (so that $x_{0}$ is the reading taken at time t), we can define the loadavg at time t as follows:

$L_{t} = \sum_{n = 0}^{\infty} d^{n} (1 - d) x_{n}$

Because $0 < d < 1$, the weights $d^{n} (1 - d)$ shrink as readings get older, and they sum to 1.

For practical calculation, note that the present loadavg can be computed iteratively from the present reading $x_{0}$, the decay factor and the loadavg of the previous period as follows:

$L_{t} = (1 - d) x_{0} + d L_{t - 1}$

The initial value of L should be set to 0.

This permits the loadavg to be computed very efficiently on an on-going basis with only a small, fixed amount of data.
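As a check that the iterative calculation gives the same answer as the full weighted sum, here is a minimal Python sketch. It assumes that a reading n periods old carries weight (1 - d) * d**n with 0 < d < 1, so that the weights sum to 1. The function names and the sample readings are invented for illustration.

```python
def loadavg_direct(readings, d):
    """Weighted sum over readings, most recent first (readings[0] is x_0)."""
    return sum((1 - d) * d**n * x for n, x in enumerate(readings))


def loadavg_step(previous, reading, d):
    """One iterative update: blend the new reading with the previous loadavg."""
    return (1 - d) * reading + d * previous


# Hypothetical readings in the order they were taken (oldest first).
history = [2, 0, 1, 3, 1]
d = 0.9

loadavg = 0.0  # initial value, as if every reading before "boot" were 0
for reading in history:
    loadavg = loadavg_step(loadavg, reading, d)

# The direct sum wants the most recent reading first, so reverse the history.
direct = loadavg_direct(list(reversed(history)), d)
print(loadavg, direct)
```

The iterative version only ever stores one number per loadavg, whereas the direct sum needs the whole history of readings.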

Decay factors have a length of time associated with them, called the half life. This is the period of time it takes for the loadavg to halve in value if all future input values are 0. No matter what the particular value of the loadavg is at the start of the decay, the time taken for it to halve, and hence the length of the half life, is constant.

Some approximate values of half-life are given below:

Decay constants and their associated half-lives.
Decay constant d	Half life (periods)
0.5	1
0.25	<1
0.75	2.5
0.1	<<1
0.9	7
0.95	14
0.965	20

Anyone who has studied A-level physics should find the concept of half-lives familiar.
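The half lives in the table can be recomputed with a short Python sketch. It assumes that, with all inputs at 0, each period multiplies the loadavg by d, so the half life is the n that solves d**n = 0.5. The function name half_life is made up for this example.

```python
import math


def half_life(d):
    """Number of periods n for which d**n == 0.5, given 0 < d < 1."""
    return math.log(0.5) / math.log(d)


for d in (0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.965):
    print(f"d = {d}: half life = {half_life(d):.2f} periods")
```

The result does not depend on the starting value of the loadavg, only on d, which is why the half life is a constant for a given decay factor.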

How the linux kernel actually computes the loadavg

As mentioned at the start of this document, Linux provides three load averages, with different decay constants and half lives. The relevant values are listed in the following table:

Standard decay values in sched.h
Decay constant	Decay time (not half life)	Kernel constant
$e^{-1/12} \approx 0.920$	1 min (12 periods)	1884
$e^{-1/60} \approx 0.983$	5 min (60 periods)	2014
$e^{-1/180} \approx 0.994$	15 min (180 periods)	2037
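The kernel constants in the table can be reproduced with the Python sketch below. It assumes that each constant is the decay factor exp(-interval / decay time) scaled by 2**11 and rounded, with the 5 second sampling interval and the 11 fractional bits described later in this section. The names FSHIFT, FIXED_1, SAMPLE_SECONDS and kernel_constant are chosen for this illustration.

```python
import math

FSHIFT = 11            # bits of fractional precision
FIXED_1 = 1 << FSHIFT  # 1.0 in fixed point (2048)
SAMPLE_SECONDS = 5     # interval between readings


def kernel_constant(decay_seconds):
    """Decay factor exp(-interval / decay time), scaled to fixed point."""
    d = math.exp(-SAMPLE_SECONDS / decay_seconds)
    return round(d * FIXED_1)


for minutes in (1, 5, 15):
    print(minutes, kernel_constant(minutes * 60))
```

Under that assumption the decay time is the time for the loadavg to fall to 1/e of its value, which is why the table notes that it is not the half life.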

The code is defined in sched.c and sched.h.

Loadavgs are stored in the three-element array avenrun[] as fixed-point numbers, with 11 bits for the fractional part. That means, to convert an integer i into this representation, write i<<11 and to extract the integer part from a number in this fixed-point form, write i>>11.
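The fixed-point conversions can be tried out with a small Python sketch. The helper names are invented for this example, and the way the fractional part is turned into hundredths is one reasonable choice, not necessarily what the kernel itself does.

```python
FSHIFT = 11            # bits used for the fractional part
FIXED_1 = 1 << FSHIFT  # 1.0 in fixed point (2048)


def to_fixed(i):
    """Convert an integer to the 11-bit fixed-point representation."""
    return i << FSHIFT


def int_part(x):
    """Integer part of a fixed-point value."""
    return x >> FSHIFT


def frac_hundredths(x):
    """Fractional part as a whole number of hundredths (0 to 99), truncated."""
    return ((x & (FIXED_1 - 1)) * 100) >> FSHIFT


def format_fixed(x):
    """Render a fixed-point value with two decimal places."""
    return f"{int_part(x)}.{frac_hundredths(x):02d}"


print(to_fixed(3))         # the integer 3 in fixed point
print(format_fixed(3072))  # 3072 / 2048 = 1.5
```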

Readings are taken every 5 seconds, by calling the count_active_tasks() function. This counts the number of tasks that are running, swapping or uninterruptible.
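Putting the pieces together, the periodic update can be sketched in Python using integer arithmetic only. This is an illustration of the iterative formula applied in 11-bit fixed point, not a copy of the kernel source; calc_load and simulate are hypothetical names, and EXP_1 is the 1 minute constant from the table above.

```python
FSHIFT = 11            # bits used for the fractional part
FIXED_1 = 1 << FSHIFT  # 1.0 in fixed point (2048)
EXP_1 = 1884           # 1 minute decay constant in fixed point


def calc_load(load, exp, active_fixed):
    """One fixed-point update: load*d + active*(1 - d), all scaled by 2**11."""
    load = load * exp + active_fixed * (FIXED_1 - exp)
    return load >> FSHIFT


def simulate(active_tasks, periods, exp=EXP_1):
    """Feed a constant number of active tasks for some 5-second periods."""
    load = 0  # the loadavg starts at 0
    active_fixed = active_tasks << FSHIFT
    for _ in range(periods):
        load = calc_load(load, exp, active_fixed)
    return load


# One active task for 12 periods (one minute), starting from an idle machine.
one_minute = simulate(active_tasks=1, periods=12)
print(one_minute / FIXED_1)
```

After one decay time with a single active task, the 1 minute loadavg has covered roughly 1 - 1/e (about 63%) of the distance from 0 to 1, slightly less in practice because the shift truncates at every step.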