Re: [PATCH 0/4] CPU hotplug, cpusets: Fix CPU online handling related to cpusets

From: Peter Zijlstra
Date: Tue Feb 07 2012 - 22:23:33 EST


On Wed, 2012-02-08 at 00:25 +0530, Srivatsa S. Bhat wrote:
> There is a very long standing issue related to how cpusets handle CPU
> hotplug events. The problem is that when a CPU goes offline, it is removed
> from all cpusets. However, when that CPU comes back online, it is added
> *only* to the root cpuset. Which means, any task attached to a cpuset lower
> in the hierarchy will have one CPU less in its cpuset, though it had this
> CPU in its cpuset before the CPU went offline.

Yeah so? That's known behaviour..

> The issue gets enormously aggravated in the case of suspend/resume.

Why does suspend/resume do this anyway? hotunplug is terribly
expensive, surely not doing it would make suspend ever so much faster?

> During
> suspend, all non-boot CPUs are taken offline. Which means, all those CPUs
> get removed from all the cpusets. When the system resumes, all CPUs are
> brought back online; however, the newly onlined CPUs get added only to the
> root cpuset - and all other cpusets have cpuset.cpus = 0 (boot cpu alone)!
> This means, (as is obvious), all those tasks attached to non-root cpusets
> will be constrained to run only on one single cpu!
>
> So, imagine the amount of performance degradation after suspend/resume!!
>
> In particular, libvirt is one of the active users of cpusets. And apparently,
> people hit this problem long ago:
> https://bugzilla.redhat.com/show_bug.cgi?id=714271
>
> But unfortunately this never got resolved since people probably thought that
> the bug was in libvirt... and all this time the kernel was the culprit!

/me boggles, why do you use cpusets on a system small enough to suspend,
and I'm so not going to ask about libvirt because I know I'll just get
sad.
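
For anyone following along, the behaviour described in the quoted text can
be sketched as a toy model. This is only an illustration under stated
assumptions: cpusets are plain Python sets, the names cpu_offline,
cpu_online, suspend_resume and the "libvirt" cpuset are made up for the
example, and none of it reflects the actual kernel code paths. It also
ignores whatever the kernel does when a cpuset ends up with no CPUs at all.

```python
# Toy model only: cpusets are plain Python sets, not kernel structures.
# All function and cpuset names here are hypothetical.

def cpu_offline(cpusets, cpu):
    """Remove the CPU from every cpuset, as described for hot-unplug."""
    for cpus in cpusets.values():
        cpus.discard(cpu)


def cpu_online(cpusets, cpu, root="root"):
    """Add the CPU back to the root cpuset only, not to its children."""
    cpusets[root].add(cpu)


def suspend_resume(cpusets, boot_cpu=0, root="root"):
    """Offline every non-boot CPU, then bring them all back online."""
    non_boot = sorted(cpusets[root] - {boot_cpu})
    for cpu in non_boot:
        cpu_offline(cpusets, cpu)
    for cpu in non_boot:
        cpu_online(cpusets, cpu, root)


# A root cpuset and one child cpuset, both spanning four CPUs.
cpusets = {"root": {0, 1, 2, 3}, "libvirt": {0, 1, 2, 3}}
suspend_resume(cpusets)
print(cpusets)
```

In this model the root cpuset gets all four CPUs back after the
suspend/resume cycle, while the child cpuset is left with the boot CPU
alone, which is the complaint in the cover letter.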


