Re: [PATCH linux-next] mm/madvise: allow KSM hints for process_madvise

From: CGEL
Date: Mon Jul 04 2022 - 03:29:47 EST


On Mon, Jul 04, 2022 at 08:48:06AM +0200, Michal Hocko wrote:
> On Fri 01-07-22 21:12:56, David Hildenbrand wrote:
> > On 01.07.22 15:19, Michal Hocko wrote:
> > > On Fri 01-07-22 14:39:24, David Hildenbrand wrote:
> > >>> I am not sure about exact details of the KSM implementation but if that
> > >>> is not a desirable behavior then it should be handled on the KSM level.
> > >>> The very same thing can easily happen in a multithreaded (or in general
> > >>> multi-process with shared mm) environment as well.
> > >>
> > >> I don't quite get what you mean.
> > >
> > > I meant to say that if KSM needs to be aware of a special CoW semantic
> > > then it should be handled on the KSM layer regardless of whether the KSM
> > > has been set by the process itself or any other process that has access
> > > to the MM. process_madvise is just another way to access a remote MM
> > > other than sharing the full MM.
> >
> > Okay.
> >
> > KSM has been a corner case feature that was restricted to well-defined
> > and well-tested environments. Until recently, R/O pins of any KSM pages
> > were essentially completely unreliable. And applications don't expect
> > such surprises. The shared zeropage is most probably the last
> > problematic piece.
> >
> > Yes, we're getting there that it's a real feature that can see more
> > (forced) wide-spread use. However, until the known issues in KSM have
> > been fixed (e.g., below -- there is a whole list of papers regarding
> > attacks on memory deduplication), it should be limited to well defined
> > environments and applications only -- IMHO.
>
> Very much agreed on all this! To be completely honest I am not really
> sure that all those consequences are widely understood and optimizing
> solely on memory savings is a very short sighted strategy IMO. But, it
> seems that there is a demand for this feature and previous attempts for
> APIs were much worse both from the semantic and maintainability POV. I
> am not sure we can get anything more sane than madvise.
>
> I also very much agree that current shortcomings have to be addressed
> first before we open this can of worms to 3rd party actors. I was not
> aware of those so thanks for bringing them up. Maybe I was overly
> optimistic here.
>
> So I guess we have the following questions to answer:
> 1) Do we really want to support KSM triggered by 3rd party? Does it
> impose new challenges other than existing ones in multi "threaded"
> environments?
> 2) If yes, is the process_madvise the most appropriate existing API? Or
> do we need a new one?

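Maybe new semantics are needed, similar to MADV_NOHUGEPAGE, which ensures
that there will *not* be huge pages. A KSM counterpart would let a process
state that its memory must *not* be merged, whoever asks for it later.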
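
To illustrate the kind of opt-out I have in mind, here is a minimal userspace
sketch of how a process uses MADV_NOHUGEPAGE on its own mapping today. The
helper name map_without_thp() is made up for this mail and is not part of the
patch. As far as I understand, MADV_UNMERGEABLE is not equivalent: it only
undoes an earlier MADV_MERGEABLE and does not block a later one, so a KSM
analogue of the call below would be a new flag.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): map an anonymous region and
 * ask the kernel never to back it with transparent huge pages.
 *
 * Returns the mapping, or NULL if mmap() fails. On success, *advice_rc
 * holds the madvise() result: 0, or -1 with errno set (for example
 * EINVAL on a kernel built without transparent huge page support).
 */
void *map_without_thp(size_t len, int *advice_rc)
{
	void *p;

	assert(advice_rc != NULL);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* The opt-out is recorded on the VMA and stays until cleared. */
	*advice_rc = madvise(p, len, MADV_NOHUGEPAGE);
	return p;
}
```

A caller would check both the returned pointer and *advice_rc, and treat a
failed hint as non-fatal, since the mapping itself is still usable.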

> 3) Should this be a highly privileged operation or do we want to allow
> userspace to shoot its feet because consequences are subtle and not very
> well understood?
>
> > So what I want to express here is that if we're adding an interface that
> > can be used to just enable KSM on the whole system easily, it might be a
> > bit too soon for that. No matter what you document, people will ignore it.
>
> Agreed.
>

Agreed too.
Thanks.