Re: [PATCH 2/3] mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically

From: Hillf Danton
Date: Wed Jan 25 2017 - 03:42:25 EST


On Wednesday, January 25, 2017 4:00 PM Michal Hocko wrote:
> On Wed 25-01-17 15:00:51, Hillf Danton wrote:
> > On Tuesday, January 24, 2017 8:41 PM Michal Hocko wrote:
> > > On Fri 20-01-17 16:33:36, Hillf Danton wrote:
> > > >
> > > > On Tuesday, December 20, 2016 9:49 PM Michal Hocko wrote:
> > > > >
> > > > > @@ -1013,7 +1013,7 @@ bool out_of_memory(struct oom_control *oc)
> > > > >  	 * make sure exclude 0 mask - all other users should have at least
> > > > >  	 * ___GFP_DIRECT_RECLAIM to get here.
> > > > >  	 */
> > > > > -	if (oc->gfp_mask && !(oc->gfp_mask & (__GFP_FS|__GFP_NOFAIL)))
> > > > > +	if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS))
> > > > >  		return true;
> > > > >
> > > > As to GFP_NOFS|__GFP_NOFAIL request, can we check gfp mask
> > > > one bit after another?
> > > >
> > > > 	if (oc->gfp_mask) {
> > > > 		if (!(oc->gfp_mask & __GFP_FS))
> > > > 			return false;
> > > >
> > > > 		/* No service for request that can handle fail result itself */
> > > > 		if (!(oc->gfp_mask & __GFP_NOFAIL))
> > > > 			return false;
> > > > 	}
> > >
> > > I really do not understand this request.
> >
> > It's a request with both NOFS and NOFAIL set, and I think we can keep
> > it from hitting the OOM killer by reordering the current gfp checks.
> > I hope it makes a bit of sense for your work.
> >
>
> I still do not understand. The whole reason we do the late
> __GFP_FS check is explained in 3da88fb3bacf ("mm, oom: move GFP_NOFS
> check to out_of_memory"). And the reason why I am _removing_
> __GFP_NOFAIL is explained in the changelog of this patch.
>
> > > This patch is removing the __GFP_NOFAIL part...
> >
> > Yes, and I don't insist on handling NOFAIL requests inside the OOM killer.
> >
> > > Besides that why should they return false?
> >
> > It's feedback to the page allocator that no kill was issued and
> > that extra attention is needed.
>
> Be careful, the semantics of out_of_memory are different. Returning false
> means that the oom killer has been disabled and so the allocation should
> fail rather than loop forever.
>
By returning false, I mean that the OOM killer is making no progress,
and I would prefer to give up looping when the OOM killer can't help.
Failing the request is a change to the current semantics, so I have
to test that it does no harm.
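
To make the three variants discussed in this thread easier to compare,
here is a minimal userspace sketch. It is not kernel code: the MODEL_*
flag values, the enum, and the function names are all hypothetical and
exist only for illustration. It models nothing but the early gfp check
at the top of out_of_memory(), and reports whether each variant returns
true early, returns false early, or falls through to the rest of the
function.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration only (assumption); the real
 * flags are defined in the kernel's gfp headers and differ from these. */
#define MODEL_GFP_FS     0x1u
#define MODEL_GFP_NOFAIL 0x2u
#define MODEL_GFP_OTHER  0x4u	/* stands in for any other nonzero bit */

/* Hypothetical outcome labels for the early check only. */
enum model_outcome {
	MODEL_RETURN_TRUE,	/* early "return true", no kill attempted */
	MODEL_RETURN_FALSE,	/* early "return false" */
	MODEL_CONTINUE,		/* fall through to the rest of the function */
};

/* The check on the '-' line of the quoted hunk. */
static enum model_outcome check_before_patch(unsigned int gfp_mask)
{
	if (gfp_mask && !(gfp_mask & (MODEL_GFP_FS | MODEL_GFP_NOFAIL)))
		return MODEL_RETURN_TRUE;
	return MODEL_CONTINUE;
}

/* The check on the '+' line of the quoted hunk. */
static enum model_outcome check_after_patch(unsigned int gfp_mask)
{
	if (gfp_mask && !(gfp_mask & MODEL_GFP_FS))
		return MODEL_RETURN_TRUE;
	return MODEL_CONTINUE;
}

/* The bit-by-bit variant quoted earlier in this thread. */
static enum model_outcome check_bit_by_bit(unsigned int gfp_mask)
{
	if (gfp_mask) {
		if (!(gfp_mask & MODEL_GFP_FS))
			return MODEL_RETURN_FALSE;
		if (!(gfp_mask & MODEL_GFP_NOFAIL))
			return MODEL_RETURN_FALSE;
	}
	return MODEL_CONTINUE;
}

/* Walk a NOFS|NOFAIL-style mask (no FS bit, NOFAIL set) through all three. */
static void model_selfcheck(void)
{
	unsigned int nofs_nofail = MODEL_GFP_OTHER | MODEL_GFP_NOFAIL;

	assert(check_before_patch(nofs_nofail) == MODEL_CONTINUE);
	assert(check_after_patch(nofs_nofail) == MODEL_RETURN_TRUE);
	assert(check_bit_by_bit(nofs_nofail) == MODEL_RETURN_FALSE);
}
```

In this model a zero mask falls through in all three variants, and the
variants diverge only once the FS bit is missing; the bit-by-bit variant
additionally returns false for a mask that has FS set but not NOFAIL.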

thanks
Hillf