Re: [RFC] Add mempressure cgroup

From: Anton Vorontsov
Date: Sat Dec 01 2012 - 03:05:02 EST


Hi Luiz,

Thanks for your email!

On Fri, Nov 30, 2012 at 03:47:25PM -0200, Luiz Capitulino wrote:
[...]
> > But there is one, rather major issue: we're crossing kernel-userspace
> > boundary. And with the scheme we'll have to cross the boundary four times:
> > query / reply-available / control / reply-shrunk / (and repeat if
> > necessary, every SHRINK_BATCH pages). Plus, it has to be done somewhat
> > synchronously (all the four stages), and/or we have to make a "userspace
> > shrinker" thread working in parallel with the normal shrinker, and here,
> > I'm afraid, we'll see more strange interactions. :)
>
> Wouldn't this be just like kswapd?

Sure, this is similar, but only for indirect reclaim (obviously).

Honestly, I have no idea how we'd do this for direct reclaim. With
Andrew's idea it all has to be synchronous, so playing ping-pong with
userland during direct reclaim will be hard.

So the best thing to do for direct reclaim, IMHO, is to just send a
notification.
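
As a rough illustration of the "just send a notification" side as seen
from userland, here is a minimal sketch. It assumes the notification
arrives as a readable event on a file descriptor, in the style of
eventfd(2); that delivery mechanism and the name wait_pressure_event()
are assumptions made for this example, not something specified above.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for a pressure notification
 * on an eventfd-style descriptor.
 *
 * Returns the number of notifications coalesced into this wakeup (> 0),
 * 0 on timeout, or -1 on error.
 */
long wait_pressure_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	uint64_t count;
	int ret;

	assert(fd >= 0);

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;	/* timed out, nothing pending */

	/* Reading an eventfd returns its counter and resets it to zero. */
	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;

	return (long)count;
}
```

The point of the sketch is that nothing has to flow back to the kernel:
userland wakes up, decides on its own what to free, and there is no
synchronous reply for the reclaim path to wait on.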

> > But there is good news: for this kind of fine-grained control we have a
> > better interface, where we don't have to communicate [very often] w/ the
> > kernel. These are "volatile ranges", where userland itself marks chunks of
> > data as "I might need it, but I won't cry if you recycle it; but when I
> > access it next time, let me know if you actually recycled it". Yes,
> > userland is no longer able to decide which exact page it permits to
> > recycle, but we don't have use-cases where we actually care that much. And
> > if we do, we'd rather introduce volatile LRUs with different priorities,
> > or something alike.
>
> I'm new to this stuff so please take this with a grain of salt, but I'm
> not sure volatile ranges would be a good fit for our use case: we want to
> make (kvm) guests reduce their memory when the host is getting memory
> pressure.

Yes, for this kind of thing you want a simple notification.

I wasn't saying that volatile ranges must be a substitute for
notifications. Quite the opposite: I was saying that you can do volatile
ranges in userland by using a "userland shrinker".

It can even be wrapped into a library, with the same mmap() libc
interface. But it will be inefficient.
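
To make the "volatile ranges in userland" idea concrete, here is a
minimal sketch of such a library. All the uvr_* names and the fixed-size
table are hypothetical, invented for this example. It also assumes that
something (for instance a notification handler) calls uvr_shrink() when
memory pressure is reported. Only mmap() and madvise() are real
interfaces here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#define UVR_MAX 16	/* hypothetical: small fixed-size registry */

struct uvr_range {
	void  *addr;
	size_t len;
	bool   is_volatile;	/* userland said: "you may recycle this" */
	bool   purged;		/* contents were dropped while volatile */
};

static struct uvr_range uvr_table[UVR_MAX];
static size_t uvr_count;

/* Map anonymous memory and register it; returns NULL on failure. */
struct uvr_range *uvr_alloc(size_t len)
{
	struct uvr_range *r;
	void *p;

	if (uvr_count == UVR_MAX)
		return NULL;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	r = &uvr_table[uvr_count++];
	r->addr = p;
	r->len = len;
	r->is_volatile = false;
	r->purged = false;
	return r;
}

/* "I might need it, but I won't cry if you recycle it." */
void uvr_mark_volatile(struct uvr_range *r)
{
	assert(r != NULL);
	r->is_volatile = true;
}

/* Take the range back; returns true if it was recycled in the meantime. */
bool uvr_unmark(struct uvr_range *r)
{
	bool was_purged;

	assert(r != NULL);
	was_purged = r->purged;
	r->is_volatile = false;
	r->purged = false;
	return was_purged;
}

/* The "userland shrinker": drop every volatile range, return bytes freed. */
size_t uvr_shrink(void)
{
	size_t i, freed = 0;

	for (i = 0; i < uvr_count; i++) {
		struct uvr_range *r = &uvr_table[i];

		if (!r->is_volatile || r->purged)
			continue;
		if (madvise(r->addr, r->len, MADV_DONTNEED) == 0) {
			r->purged = true;
			freed += r->len;
		}
	}
	return freed;
}
```

A caller would uvr_alloc() a cache, uvr_mark_volatile() it while idle, and
check the return value of uvr_unmark() before trusting the contents again.
Note that in this sketch every purge costs a wakeup plus one madvise()
call per range, and the shrinker drops all volatile ranges at once rather
than only as much as is needed.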

Thanks,
Anton.