Re: [take2] OOM documentation update [was: Linux killed Kenny,bastard!]

From: Randy Dunlap
Date: Wed Jan 14 2009 - 16:36:34 EST


On Wed, 14 Jan 2009 20:06:06 +0300 Evgeniy Polyakov wrote:

> Updated version fixes some errors and extends description by adding
> swapping and children relation explanation.
>
> Signed.

?? Missing a proper Signed-off-by: line?

> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
> index d105eb4..4aa1918 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -2311,6 +2311,33 @@ increase the likelihood of this process being killed by the oom-killer. Valid
> values are in the range -16 to +15, plus the special value -17, which disables
> oom-killing altogether for this process.
>
> +The process to be killed in an out-of-memory situation is selected among all others
> +based on its badness score. This value equals the original memory size of the process
> +originally and is then updated according to its CPU time (utime + stime) and the

drop "originally" since earlier part of sentence says "original memory size".

> +run time (uptime - start time). The longer it runs the smaller is the score.
> +Badness score is divided by the square root of the cpu time and then by

CPU

> +the double square root of the run time.
> +
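
To check my reading of the paragraph above, here is a rough userspace
sketch of the arithmetic. The names isqrt_floor() and base_badness() are
made up for illustration, the units of the two times are left abstract,
and skipping a division when the root comes out as zero is my assumption;
the text does not say what happens in that case.

```c
/* Floor of the integer square root, computed bit by bit. */
unsigned long isqrt_floor(unsigned long x)
{
	unsigned long r = 0;
	unsigned long bit = 1UL << (sizeof(unsigned long) * 8 - 2);

	while (bit > x)
		bit >>= 2;
	while (bit) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

/*
 * Hypothetical helper: start from the memory size, divide by
 * sqrt(cpu_time), then by sqrt(sqrt(run_time)).
 * Assumption: a zero root means "do not divide".
 */
unsigned long base_badness(unsigned long mem_size,
			   unsigned long cpu_time,
			   unsigned long run_time)
{
	unsigned long points = mem_size;
	unsigned long s;

	s = isqrt_floor(cpu_time);
	if (s)
		points /= s;

	s = isqrt_floor(isqrt_floor(run_time));
	if (s)
		points /= s;

	return points;
}
```

With mem_size = 1000, cpu_time = 16 and run_time = 81 this gives
1000 / 4 / 3 = 83 with integer division, so long-running tasks do end up
with a smaller score.
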
> +Swapped out tasks are killed first. Half of each child's memory size is added to
> +the parent's score if they do not share the same memory. Thus forking servers
> +are the prime candidates to be killed. Having only one 'hungry' child will make
> +parent less preferable than the child.
> +
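
The child accounting above could be sketched like this, if I read it
right. struct child_info and parent_memory_points() are hypothetical
names, and "memory size" is whatever unit the base score uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one child, for illustration only. */
struct child_info {
	unsigned long mem_size;		/* child's memory size */
	bool shares_parent_mm;		/* true if it shares the parent's memory */
};

/*
 * Parent's starting score: its own memory size plus half of the memory
 * size of every child that does not share the parent's memory.
 */
unsigned long parent_memory_points(unsigned long parent_mem,
				   const struct child_info *children,
				   size_t nr_children)
{
	unsigned long points = parent_mem;
	size_t i;

	for (i = 0; i < nr_children; i++) {
		if (!children[i].shares_parent_mm)
			points += children[i].mem_size / 2;
	}
	return points;
}
```

A parent of size 100 with a single child of size 1000 gets 100 + 500 = 600,
which is below the child's own 1000, matching the "one hungry child"
remark. With many such children the parent's total quickly overtakes any
single child.
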
> +/proc/<pid>/oom_score shows process' current badness score.
> +
> +The following heuristics are then applied:
> + * if the task was reniced, its score doubles
> + * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE
> + or CAP_SYS_RAWIO) have their score divided by 4
> + * if oom condition happened in one cpuset and checked task does not belong
> + to it, its score is divided by 8
> + * resulted score is multiplied by the two in the power of oom_adj when it is

confusing. Is this: multiplied by two to the power of oom_adj when it is ... ?

> + positive, and divided otherwise, i.e.
> + points <<= oom_adj when it is positive and
> + points >>= oom_adj otherwise
> +
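
For the list above, a sketch of how I understand the adjustments are
applied, in the order listed. struct task_hints and adjust_badness() are
invented for this example. Note that "points >>= oom_adj" with a negative
oom_adj would be a shift by a negative count, which is undefined in C, so
the sketch shifts by -oom_adj. Returning 0 for the special value -17 is
only my way of modelling "never selected".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task inputs, for illustration only. */
struct task_hints {
	bool reniced;		/* task was reniced */
	bool privileged;	/* superuser or direct hardware access */
	bool outside_cpuset;	/* not in the cpuset that hit the OOM */
	int oom_adj;		/* -16..+15, or -17 to disable */
};

unsigned long adjust_badness(unsigned long points,
			     const struct task_hints *t)
{
	assert(t->oom_adj >= -17 && t->oom_adj <= 15);

	/* Assumption: model "oom-killing disabled" as a zero score. */
	if (t->oom_adj == -17)
		return 0;

	if (t->reniced)
		points *= 2;
	if (t->privileged)
		points /= 4;
	if (t->outside_cpuset)
		points /= 8;

	/* Multiply or divide by two to the power of |oom_adj|. */
	if (t->oom_adj > 0)
		points <<= t->oom_adj;
	else if (t->oom_adj < 0)
		points >>= -t->oom_adj;

	return points;
}
```

E.g. 1000 points, reniced and privileged, with oom_adj = 3 gives
1000 * 2 / 4 << 3 = 4000. It would help if the text spelled out the
negative case the same way.
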
> +The task with the highest badness score is then killed.
> +
> 2.13 /proc/<pid>/oom_score - Display current oom-killer score
> -------------------------------------------------------------


---
~Randy