Re: [PATCH] mm, page_alloc: do not break __GFP_THISNODE by zonelist reset

From: Vlastimil Babka
Date: Mon May 28 2018 - 11:43:34 EST


On 05/28/2018 09:21 AM, Michal Hocko wrote:
> On Fri 25-05-18 12:43:00, Andrew Morton wrote:
>> On Fri, 25 May 2018 15:08:53 +0200 Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>
>>> we might consider this for 4.17 although I don't know if there's anything
>>> currently broken. Stable backports should be more important, but will have to
>>> be reviewed carefully, as the code went through many changes.
>>> BTW I think that also the ac->preferred_zoneref reset is currently useless if
>>> we don't also reset ac->nodemask from a mempolicy to NULL first (which we
>>> probably should for the OOM victims etc?), but I would leave that for a
>>> separate patch.
>>
>> Confused. If nothing is currently broken then why is a backport
>> needed? Presumably because we expect breakage in the future? Can you
>> expand on this?
>
> __GFP_THISNODE is documented to _use_ the given node. Allocating from a
> different one is a bug. Maybe the current code can cope with that or at
> least doesn't blow up in an obvious way but the bug is still there.
>
> I am still not sure what to do about the zonelist reset. It still seems
> like an echo from the past

Hmm actually it seems that even at the time of commit 183f6371aac2
introduced the reset, the per-policy zonelists for MPOL_BIND policies
were gone for years. Mempolicy only affects which node's zonelist is
used, but that always contains all the nodes (unless __GFP_THISNODE) so
there's no reason to get another node's zonelist to escape mempolicy
restrictions.

Mempolicy restrictions are given as nodemask, so if we want to ignore
them for OOM victims etc, we have to reset nodemask instead. But again
we have to be careful in case the nodemask doesn't come from mempolicy,
but from somebody who might be broken if we ignore it.

> but using numa_node_id for __GFP_THISNODE is
> a clear bug because our task could have been migrated to a cpu on a
> different than requested node.
>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>