Re: can't oom-kill zap the victim's memory?

From: Michal Hocko
Date: Thu Oct 01 2015 - 10:48:27 EST


On Mon 28-09-15 15:24:06, David Rientjes wrote:
> On Fri, 25 Sep 2015, Michal Hocko wrote:
>
> > > > I am still not sure how you want to implement that kernel thread, but I
> > > > am quite skeptical that it would be very useful, because all the current
> > > > allocations which end up in the OOM killer path cannot simply back off
> > > > and drop the locks with the current allocator semantics. So they will
> > > > be sitting on top of an unknown pile of locks whether you do an additional
> > > > reclaim (unmap the anon memory) in the direct OOM context or loop
> > > > in the allocator and wait for a kthread/workqueue to do its work. The
> > > > only argument that I can see is the stack usage, but I haven't seen stack
> > > > overflows in the OOM path AFAIR.
> > > >
> > >
> > > Which locks are you specifically interested in?
> >
> > Any locks they were holding before they entered the page allocator (e.g.
> > i_mutex is the easiest one to trigger from userspace, but mmap_sem
> > might be involved as well because we are doing kmalloc(GFP_KERNEL) with
> > mmap_sem held for write). Those would be locked until the page allocator
> > returns, which with the current semantics might be _never_.
> >
>
> I agree that i_mutex seems to be one of the most common offenders.
> However, I'm not sure I understand why holding it while trying to allocate
> infinitely for an order-0 allocation is problematic wrt the proposed
> kthread.

I didn't say it would be problematic. We are talking past each other
here. All I wanted to say was that a separate kernel oom thread wouldn't
_help_ with the lock dependencies.

> The kthread itself need only take mmap_sem for read. If all
> threads sharing the mm with a victim have been SIGKILL'd, they should get
> TIF_MEMDIE set when reclaim fails and be able to allocate so that they can
> drop mmap_sem.

Which would equally be the case if the direct OOM context used a trylock...
So just to make it clear: I am not objecting to a specialized OOM kernel
thread. It would work as well. I am just not convinced that it is really
needed, because the direct OOM context can take mmap_sem with a trylock
and do the same work directly.
--
Michal Hocko
SUSE Labs