Re: FYI: mmap_sem OOM patch

From: Peter Zijlstra
Date: Thu Jul 08 2010 - 06:30:44 EST


On Wed, 2010-07-07 at 16:11 -0700, Michel Lespinasse wrote:
> What happens is we end up with a single thread in the oom loop (T1)
> that ends up killing a sibling thread (T2). That sibling thread will
> need to acquire the read side of the mmap_sem in the exit path. It's
> possible however that yet a different thread (T3) is in the middle of
> a virtual address space operation (mmap, munmap) and is enqueued to
> grab the write side of the mmap_sem behind yet another thread (T4)
> that is stuck in the OOM loop (behind T1) with mmap_sem held for read
> (like allocating a page for pagecache as part of a fault).
>
>     T1              T2              T3              T4
>     .               .               .               .
>   oom:              .               .               .
>   oomkill           .               .               .
>     ^    \          .               .               .
>    /|\    ---->  do_exit:           .               .
>     |            sleep in           .               .
>     |            read(mmap_sem)     .               .
>     |                      \        .               .
>     |                       ---->  mmap             .
>     |                              sleep in         .
>     |                              write(mmap_sem)  .
>     |                                          \    .
>     |                                           ---->  fault
>     |                                                  holding read(mmap_sem)
>     |                                                  oom
>     |                                                   |
>     |                                                   /
>     \---------------------------------------------------/

So what you do is use recursive locking to side-step a deadlock.
Recursive locking is in poor taste and leads to very ill-defined locking
rules.

One way to fix this is to have T4 wake from the oom queue and return an
allocation failure instead of insisting on going oom itself when T1
decides to take down the task.
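
To make that suggestion concrete, here is a rough userspace model of the
decision T4 would make each time it is woken from the oom queue. It is only
a sketch: struct proc_model, enum alloc_status and oom_waiter_wakeup() are
names made up for this illustration, not existing kernel interfaces, and the
real allocator slow path has many more cases.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state shared by threads T1..T4. */
struct proc_model {
	bool oom_victim;	/* set once the oom killer (T1) picks this process */
};

enum alloc_status {
	ALLOC_OK,		/* a page was found */
	ALLOC_FAILED,		/* give up so the caller can unwind and drop mmap_sem */
	ALLOC_WAIT_OOM,		/* keep waiting behind the thread doing the oom kill */
};

/*
 * Decision taken by a queued allocator (T4) on each wakeup.
 * Failing the allocation once the process is an oom victim lets the
 * fault path release read(mmap_sem), which unblocks T3 and then T2.
 */
static enum alloc_status oom_waiter_wakeup(const struct proc_model *p,
					   bool page_available)
{
	assert(p != NULL);

	if (page_available)
		return ALLOC_OK;
	if (p->oom_victim)
		return ALLOC_FAILED;
	return ALLOC_WAIT_OOM;
}
```

The point of the model is that T4 never needs to take or bypass a lock in a
special way; it simply stops waiting, so the ordinary unlock on the fault
error path breaks the cycle in the diagram.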

