Re: [PATCH] mm/page_alloc: detect allocation forbidden by cpuset and bail out early

From: Michal Hocko
Date: Wed Sep 08 2021 - 03:06:49 EST


On Wed 08-09-21 09:50:14, Feng Tang wrote:
> On Tue, Sep 07, 2021 at 10:44:32AM +0200, Michal Hocko wrote:
[...]
> > While this is a good fix from the functionality POV I believe you can go
> > a step further. Please add a detection to the cpuset code and complain
> > to the kernel log if somebody tries to configure movable only cpuset.
> > Once you have that in place you can easily create a static branch for
> > cpuset_insane_setup() and have zero overhead for all reasonable
> > configuration. There shouldn't be any reason to pay a single cpu cycle
> > to check for something that almost nobody does.
> >
> > What do you think?
>
> I thought about the implementation, IIUC, the static_branch_enable() is
> easy, it could be done when cpuset.mems is set with movable only nodes,
> but disable() is much complexer,

Do we care about disable at all? The point is to not have 99,999999%
users pay overhead of the check which is irrelevant to them. Once
somebody wants to use this "creative" setup then paying an extra check
sounds perfectly sensible to me. If somebody cares enough then the
disable logic could be implemented. But for now I believe we should be
OK with only enable case.

> as we may need a global reference
> counter to track the set/unset, and the unset could be the time when
> freeing the cpuset data structure, also one cpuset.mems could be changed
> runtime, and system could have multiple cpuset dirs (user space usage
> could be creative or crazy :)).
>
> While checking cpuset code, I thought more about configuring cpuset with
> movable only nodes, that we may still have normal usage: mallocing a big
> trunk of memory and do some scientific calculation, or AI training. It
> works with current code.

It might work but it would be inherently subtle because a single
non-movable allocation will throw the whole thing off the cliff.
I do not think we want to even pretend we support such a setup.
--
Michal Hocko
SUSE Labs