Re: [RFC PATCH] cgroup: introduce dynamic protection for memcg

From: Michal Hocko
Date: Mon Apr 04 2022 - 05:32:34 EST


On Mon 04-04-22 17:23:43, Zhaoyang Huang wrote:
> On Mon, Apr 4, 2022 at 5:07 PM Zhaoyang Huang <huangzhaoyang@xxxxxxxxx> wrote:
> >
> > On Mon, Apr 4, 2022 at 4:51 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Mon 04-04-22 10:33:58, Zhaoyang Huang wrote:
> > > [...]
> > > > > One thing that I don't understand in this approach is: why memory.low
> > > > > should depend on the system's memory pressure. It seems you want to
> > > > > allow a process to allocate more when memory pressure is high. That is
> > > > > very counter-intuitive to me. Could you please explain the underlying
> > > > > logic of why this is the right thing to do, without going into
> > > > > technical details?
> > > > > What I want to achieve is to make memory.low positively correlated
> > > > > with timing and negatively correlated with memory pressure, which
> > > > > means the protected memcg should lower its protection (via a lower
> > > > > memory.low) to relieve the system's memory pressure when it is high.
> > >
> > > I have to say this is still very confusing to me. The low limit is a
> > > protection against external (e.g. global) memory pressure. Decreasing
> > > the protection based on the external pressure sounds like it goes right
> > > against the purpose of the knob. I can see reasons to update protection
> > > based on refaults or other metrics from the userspace but I still do not
> > > see how this is a good auto-magic tuning done by the kernel.
> > >
> > > > > The concept behind it is that a memcg's fault-back of dropped memory
> > > > > is less important than the system's latency under high memory
> > > > > pressure.
> > >
> > > Can you give some specific examples?
> > For both of the above comments, please refer to the latest test
> > results in patch v2, which I have sent. I prefer to describe my change
> > as a focus transfer under pressure: the protected memcg is the focus
> > when the system's memory pressure is low, and reclaim then happens from
> > root; this is not against the current design. However, when global
> > memory pressure is high, the focus has to shift to the whole system,
> > because it
> > doesn't make sense to let the protected memcg out of everybody, it can't
> > do anything when the system is trapped in the kernel with reclaiming work.
> Does it make more sense if I describe the change as follows: the memcg
> will be protected as long as system pressure is under the threshold
> (partially coherent with the current design) and will be sacrificed if
> pressure is over the threshold (the added change)?

No, not really. For one, it is still really unclear why there should be any
difference in semantics between global and external memory pressure
in general. The low limit is always a protection from the external
pressure. And what should be the actual threshold? Amount of the reclaim
performed, effectiveness of the reclaim, or what?
--
Michal Hocko
SUSE Labs