Re: cgroup: status-quo and userland efforts

From: Lennart Poettering
Date: Sun Jun 30 2013 - 15:40:21 EST


Heya,

On 29.06.2013 05:05, Tim Hockin wrote:
Come on, now, Lennart. You put a lot of words in my mouth.

I for sure am not going to make the PID 1 a client of another daemon. That's
just wrong. If you have a daemon that is both conceptually the manager of
another service and the client of that other service, then that's bad design
and you will easily run into deadlocks and such. Just think about it: if you
have some external daemon for managing cgroups, and you need cgroups for
running external daemons, how are you going to start the external daemon for
managing cgroups? Sure, you can hack around this, make that daemon special,
and magic, and stuff -- or you can just not do such nonsense. There's no
reason to repeat the fuckup that cgroup became in kernelspace a second time,
but this time in userspace, with multiple manager daemons all with different
and slightly incompatible definitions of what a unit to manage actually is...

I forgot about the tautology of systemd. systemd is monolithic.

systemd is certainly not monolithic by almost any definition of that term. I am not sure where you are taking that from, and I am not sure I want to discuss this at that level. This just sounds like FUD you picked up somewhere and are repeating carelessly...

But that's not my point. It seems pretty easy to make this cgroup
management (in "native mode") a library that can have either a thin
veneer of a main() function, while also being usable by systemd. The
point is to solve all of the problems ONCE. I'm trying to make the
case that systemd itself should be focusing on features and policies
and awesome APIs.

You know, getting this all right isn't easy. If you want to do things properly, then you need to propagate attribute changes between the units you manage. You also need something like a scheduler, since a number of controllers can only be configured under certain external conditions (for example: the blkio and devices controllers use major/minor parameters for configuring per-device limits. Since major/minor assignments are pretty much unpredictable these days -- and users probably want to configure things with friendly and stable /dev/disk/by-id/* symlinks anyway -- this requires us to wait for devices to show up before we can configure the parameters). So... you need a graph of units, where you can propagate things, and schedule things based on some execution/event queue. And the propagation and scheduling are closely intermingled.
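
To make the major/minor point concrete, here is a minimal sketch (not systemd code) of why a per-device limit cannot be written before the device exists. The helper names blkio_device_key() and throttle_line() are hypothetical; the "MAJOR:MINOR value" line format is the one the cgroup v1 blkio throttle files expect.

```python
import os
import stat


def blkio_device_key(path):
    """Return 'MAJOR:MINOR' for a block device node, or None if it is not usable yet."""
    try:
        st = os.stat(path)  # follows symlinks such as /dev/disk/by-id/*
    except FileNotFoundError:
        return None  # device has not shown up yet: the caller has to wait
    if not stat.S_ISBLK(st.st_mode):
        return None  # path exists but is not a block device
    return f"{os.major(st.st_rdev)}:{os.minor(st.st_rdev)}"


def throttle_line(path, bytes_per_sec):
    """Build a 'MAJOR:MINOR rate' line, or None while the device is missing."""
    key = blkio_device_key(path)
    if key is None:
        return None
    return f"{key} {bytes_per_sec}"
```

As long as throttle_line() returns None there is nothing that can be written to the controller, so whoever manages the cgroup has to remember the request and retry when the device appears.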

Now, that's pretty much exactly what systemd actually *is*. It implements a graph of units with a scheduler. And if you rip that part out of systemd to make this an "easy cgroup management library", then you simply turn what systemd is into a library without leaving anything. Which is just bogus.
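
As a rough illustration of what "a graph of units with a scheduler" means, here is a toy sketch. It is not how systemd is implemented; the Unit and Manager classes, the attribute names and the device name are all made up for the example. It shows the two pieces interacting: attribute changes propagate down a unit tree, and changes that depend on a device are parked until that device shows up.

```python
from collections import deque


class Unit:
    """A node in a hypothetical unit tree; attributes are inherited from parents."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.attrs = {}
        if parent is not None:
            parent.children.append(self)

    def effective(self, key):
        # Walk up the tree until some unit defines the attribute.
        unit = self
        while unit is not None:
            if key in unit.attrs:
                return unit.attrs[key]
            unit = unit.parent
        return None


class Manager:
    def __init__(self):
        self.queue = deque()   # jobs that can run now
        self.waiting = {}      # device name -> jobs blocked on that device
        self.present = set()   # devices seen so far
        self.refreshed = []    # names of units touched by propagation

    def set_attr(self, unit, key, value, needs_device=None):
        job = (unit, key, value)
        if needs_device is not None and needs_device not in self.present:
            self.waiting.setdefault(needs_device, []).append(job)
        else:
            self.queue.append(job)

    def device_appeared(self, device):
        # A device event releases every job that was waiting for it.
        self.present.add(device)
        self.queue.extend(self.waiting.pop(device, []))

    def run(self):
        while self.queue:
            unit, key, value = self.queue.popleft()
            unit.attrs[key] = value
            self._propagate(unit, key)

    def _propagate(self, unit, key):
        self.refreshed.append(unit.name)
        for child in unit.children:
            if key not in child.attrs:  # child inherits, so it changes too
                self._propagate(child, key)
```

Even in this toy form the queue and the tree cannot be pulled apart cleanly: a released job immediately triggers propagation, and propagation decides which units need further work.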

So no, if you say "seems pretty easy to make this cgroup management a library" then well, I have to disagree with you.

We want to run fewer, simpler things on our systems, we want to reuse as

Fewer and simpler are not compatible, unless you are losing
functionality. Systemd is fewer, but NOT simpler.

Oh, certainly it is. If we'd split up the cgroup fs access into a separate daemon of some kind, then we'd need some kind of IPC for that, and so you have more daemons and you have some complex IPC between the processes. So yeah, the systemd approach is certainly both simpler and uses fewer daemons than your hypothetical one.

much of the code as we can. You don't achieve that by running yet another
daemon that does worse what systemd can anyway do simpler, easier and
better.

Considering this is all hypothetical, I find this to be a funny
debate. My hypothetical idea is better than your hypothetical idea.

Well, systemd is pretty real, and the code to do the unified cgroup management within systemd is pretty complete. systemd is certainly not hypothetical.

The least you could grant us is to have a look at the final APIs we will
have to offer before you already imply that systemd cannot be a valid
implementation of any API people could ever agree on.

Whoah, don't get defensive. I said nothing of the sort. The fact of
the matter is that we do not run systemd, at least in part because of
the monolithic nature. That's unlikely to change in this timescale.

Oh, my. I am not sure what makes you think it is monolithic.

What I said was that it would be a shame if we had to invent our own
low-level cgroup daemon just because the "upstream" daemon was too
tightly coupled with systemd.

I have no interest in reimplementing systemd as a library, just to make you happy... I am quite happy with what we already have...

This is supposed to be collaborative, not combative.

It certainly sounds *very* different in what you are writing.

Lennart