Re: [PATCH] oom killer (Core)

From: Andrea Arcangeli
Date: Thu Dec 23 2004 - 20:20:30 EST


On Fri, Dec 10, 2004 at 05:52:33PM +0100, Thomas Gleixner wrote:
> + if ((p->flags & PF_MEMDIE) ||
> + ((p->flags & PF_EXITING) && !(p->flags & PF_DEAD)))

this had to be:

if (((p->flags & PF_MEMDIE) || (p->flags & PF_EXITING)) &&
!(p->flags & PF_DEAD))

I didn't take zombies into account. A task may be in memdie state and zombie at
the same time. We must not wait a PF_MEMDIE to go away completely, we must wait
only until PF_DEAD is not set. After PF_DEAD is set we know all ram we
were waiting for is already in the free list.

I noticed some deadlocks during an oom-torture-test before applying the stuff
into the suse kernel. The above change fixed all my deadlocks. Everything else
was working fine already.

Actually before finding the above bug I fixed PF_MEMDIE too and I converted it
to p->memdie, an unsigned char. All archs should support 1 byte granularity
for per-process atomic instructions, it's near used_math that I also converted
to a char to signal it cannot be a bitflag sharing the same char with globals.

Struct layout looks like this.

/*
* All archs should support atomic ops with
* 1 byte granularity.
*/
unsigned char memdie;
/*
* Must be changed atomically so it shouldn't be
* be a shareable bitflag.
*/
unsigned char used_math;
/*
* OOM kill score adjustment (bit shift).
* Cannot live together with used_math since
* used_math and oomkilladj can be changed at the
* same time, so they would race if they're in the
* same atomic block.
*/
short oomkilladj;

As for the |= PF_MEMALLOC in oom_kill_task that was a gratuitous smp breakage,
I didn't need to do anything in PF_MEMALLOC since alloc_pages checks _both_
PF_MEMALLOC and p->memdie already.

I also added a !chosen in the below code to make that logic more self
contained and less dependent on the badness implementation (should never
be necessary though):

if (points > maxpoints || !chosen) {
chosen = p;
maxpoints = points;
}

I can port the final patch (including fixage of PF_MEMDIE races) to 2.6.10-rc
after I finished.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/