Re: [PATCH v2] mm: terminate shrink_slab loop if signal is pending

From: Tetsuo Handa
Date: Sun Dec 10 2017 - 06:38:14 EST


Michal Hocko wrote:
> > > I agree that making waits/loops killable is generally good. But be sure to be
> > > prepared for the worst case. For example, start __GFP_KILLABLE from "best effort"
> > > basis (i.e. no guarantee that the allocating thread will leave the page allocator
> > > slowpath immediately) and check for fatal_signal_pending() only if
> > > __GFP_KILLABLE is set. That is,
> > >
> > > +	/*
> > > +	 * We are about to die and free our memory.
> > > +	 * Stop shrinking which might delay signal handling.
> > > +	 */
> > > +	if (unlikely((gfp_mask & __GFP_KILLABLE) && fatal_signal_pending(current)))
> > > +		break;
> > >
> > > at shrink_slab() etc. and
> > >
> > > +	if ((gfp_mask & __GFP_KILLABLE) && fatal_signal_pending(current))
> > > +		goto nopage;
> > >
> > > at __alloc_pages_slowpath().
> >
> > I was thinking about something similar and will experiment to see if
> > this solves the problem and if it has any side effects. Anyone sees
> > any obvious problems with this approach?
>
> Tetsuo has been proposing this flag in the past and I've had objections
> why this is not a great idea. I do not have any link handy but the core
> objection was that the semantic would be too fuzzy. All the allocations
> in the same context would have to be killable for this flag to have any
> effect. Spreading it all over the kernel is simply not feasible.
>

Rejecting __GFP_KILLABLE on the grounds that "All the allocations in the same
context would have to be killable" does not make sense. Outside of MM, we
convert code to the _killable variants step by step, on a best-effort basis.
Nobody calls a change like

func1() {
	// As of this point it is easy to bail out.
	if (mutex_lock_killable(&lock1) == 0) {
		func2();
		mutex_unlock(&lock1);
	}
}

func2() {
	mutex_lock(&lock2);
	// Do something which is not possible to bail out for now.
	mutex_unlock(&lock2);
}

pointless.
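
To make the point concrete, here is a minimal userspace sketch of that
pattern. Everything prefixed with toy_ or TOY_ is a hypothetical stand-in
invented for illustration, not kernel API. In particular,
toy_mutex_lock_killable() bails out whenever the flag is set, which is a
simplification of what the real mutex_lock_killable() does. func1() returns
an int here only so that the outcome can be observed.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_EINTR 4	/* hypothetical stand-in for the kernel's EINTR */

struct toy_mutex {
	bool locked;
};

static struct toy_mutex lock1, lock2;
static bool toy_fatal_signal;	/* models fatal_signal_pending(current) */
static int work_done;		/* counts completed func2() calls */

/* Models mutex_lock(): cannot bail out. */
static void toy_mutex_lock(struct toy_mutex *m)
{
	assert(!m->locked);	/* single-threaded toy, so no contention */
	m->locked = true;
}

/* Models mutex_lock_killable(): 0 on success, -TOY_EINTR if "killed". */
static int toy_mutex_lock_killable(struct toy_mutex *m)
{
	if (toy_fatal_signal)
		return -TOY_EINTR;
	toy_mutex_lock(m);
	return 0;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
	assert(m->locked);
	m->locked = false;
}

/* Not converted yet: once entered, it runs to completion. */
static void func2(void)
{
	toy_mutex_lock(&lock2);
	work_done++;
	toy_mutex_unlock(&lock2);
}

/* Already converted: bails out before taking any lock. */
static int func1(void)
{
	int err = toy_mutex_lock_killable(&lock1);

	if (err)
		return err;
	func2();
	toy_mutex_unlock(&lock1);
	return 0;
}
```

With toy_fatal_signal clear, func1() returns 0 and work_done is incremented.
With it set, func1() returns -TOY_EINTR before either lock is taken, even
though func2() itself was never made killable. The partial conversion already
helps.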

If you insist on "All the allocations in the same context would
have to be killable", then we will have to offload all such activities
to some kernel thread.