Re: Inquiry: Should we remove "isolcpus=" kernel boot option? (may have realtime uses)

From: Mark Hounschell
Date: Thu Jun 05 2008 - 07:47:56 EST


Peter Zijlstra wrote:
On Wed, 2008-06-04 at 12:26 -0700, Max Krasnyansky wrote:
Mark Hounschell wrote:
IMHO,

What is an abomination is that cpusets are required for this type of
isolation to begin with, even on a 2 processor machine.

I would like the option to stay and be extended like Max originally
proposed. If cpusets/hotplug are configured isolation would be obtained
using them. If not then isolcpus could be used to get the same isolation.

From a user land point of view, I just want an easy way to fully isolate
a particular cpu. Even a new syscall or extension to sched_setaffinity
would make me happy. Cpusets and hotplug don't.

Again this is just MHO.
Mark, I used to be the same way and I'm a convert now. It does seem like
overkill for a 2-cpu machine to have cpusets and cpu hotplug. But both options
cost around 50KB worth of text and maybe another 10KB of data. That's on the
x86-64 box. Let's say it's 100KB. Not a terribly huge overhead.

Now if you think about it, in order to be able to dynamically isolate a cpu we
have to do the exact same thing that CPU hotplug does, which is to clear all
timers, kernel threads, etc. from that CPU. It does not make sense to
implement separate logic for that. You could argue that you do not need
dynamic isolation, but static isolation is too inflexible in general. Even on
2-way machines it's a waste not to be able to use the second cpu for general
load when the RT app is not running. Given that CPU hotplug is necessary for
many things, including suspend on multi-cpu machines, it's practically
guaranteed to be very stable and well supported. In other words we have a
perfect synergy here :).

Now, about the cpusets. You do not really have to do anything fancy with them.
If all you want to do is to disable systemwide load balancing:
mount -t cgroup -o cpuset cpuset /dev/cpuset
echo 0 > /dev/cpuset/cpuset.sched_load_balance

That's it. You get _exactly_ the same effect as with isolcpus=. And you can
change that dynamically, and when you switch to quad- and eight-core machines
you'll be able to do that with groups of cpus, not just system wide.

Just to complete the example above, let's say you want to isolate cpu2
(assuming that cpusets are already mounted).

# Bring cpu2 offline
echo 0 > /sys/devices/system/cpu/cpu2/online

# Disable system wide load balancing
echo 0 > /dev/cpuset/cpuset.sched_load_balance

# Bring cpu2 online
echo 1 > /sys/devices/system/cpu/cpu2/online

Now if you want to un-isolate cpu2 you do

# Enable system wide load balancing
echo 1 > /dev/cpuset/cpuset.sched_load_balance
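
The isolate and un-isolate steps above can be wrapped in two small shell
functions. This is only a sketch under assumptions: the names isolate_cpu,
unisolate_cpu, CPUSET_ROOT and SYSFS_CPU_ROOT are made up for illustration
(they are not part of any kernel tool), the cpuset hierarchy is assumed to be
mounted at /dev/cpuset as shown earlier, and writing to the real files needs
root.

```shell
# Both roots can be overridden, e.g. to dry-run against a scratch directory.
CPUSET_ROOT="${CPUSET_ROOT:-/dev/cpuset}"
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# isolate_cpu N: same three steps as above, stopping at the first failure.
isolate_cpu() {
    cpu="$1"
    # Take the cpu offline so hotplug migrates timers and threads away.
    echo 0 > "$SYSFS_CPU_ROOT/cpu$cpu/online" || return 1
    # Turn off system wide load balancing in the top-level cpuset.
    echo 0 > "$CPUSET_ROOT/cpuset.sched_load_balance" || return 1
    # Bring the cpu back online; the balancer will no longer touch it.
    echo 1 > "$SYSFS_CPU_ROOT/cpu$cpu/online" || return 1
}

# unisolate_cpu: re-enable system wide load balancing.
unisolate_cpu() {
    echo 1 > "$CPUSET_ROOT/cpuset.sched_load_balance"
}
```

Usage would be "isolate_cpu 2" before starting the RT app and "unisolate_cpu"
once it has exited.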

Of course this is not complete isolation. There are also irqs (see my
"default irq affinity" patch), workqueues and the stop machine. I'm working on
those too and will release a .25-based cpuisol tree when I'm done.
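
For the irq part, one manual workaround is to write an affinity mask that
excludes the isolated cpu to each /proc/irq/<N>/smp_affinity file. The sketch
below is not the "default irq affinity" patch itself. The helper names
mask_without_cpu and steer_irqs_away and the IRQ_ROOT variable are
hypothetical, and it assumes at most 32 cpus so the mask fits in a single hex
word.

```shell
# mask_without_cpu NCPUS CPU: print a hex mask of cpus 0..NCPUS-1 minus CPU.
# Assumes NCPUS <= 32 (larger masks need comma-separated words).
mask_without_cpu() {
    ncpus="$1"
    cpu="$2"
    all=$(( (1 << ncpus) - 1 ))
    printf '%x\n' $(( all & ~(1 << cpu) ))
}

# IRQ_ROOT can be overridden to dry-run against a scratch directory.
IRQ_ROOT="${IRQ_ROOT:-/proc/irq}"

# steer_irqs_away NCPUS CPU: apply that mask to every irq under IRQ_ROOT.
# Some irqs reject affinity changes, so write failures are ignored.
steer_irqs_away() {
    mask=$(mask_without_cpu "$1" "$2")
    for f in "$IRQ_ROOT"/*/smp_affinity; do
        [ -w "$f" ] && echo "$mask" > "$f" 2>/dev/null
    done
    return 0
}
```

On a 4-cpu box, "steer_irqs_away 4 2" would write the mask b (cpus 0, 1 and 3)
to every irq that accepts it.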


Thanks for the detailed tutorial Max. I'm personally still very skeptical. I
really don't believe you'll ever be able to run multiple _demanding_ RT
environments on the same machine, no matter how many processors you've got.
But even though I might be wrong there, that's actually OK with me. I, and I'm
sure most, don't have a problem with dedicating a machine to a single RT env.

You've got to hold your tongue just right, look at the right spot on the wall,
and be running the RT patched kernel, all at the same time, to run just one
successfully. I just want to stop using my tongue and staring at the wall. I
personally feel that a single easy method of completely isolating a single
processor from the rest of the machine _might_ benefit the RT community more
than all this fancy stuff coming down the pipe. Something like your originally
proposed isolcpus or even a simple SCHED_ISOLATE arg to the sched_setscheduler
call.

Furthermore, cpusets allow for isolated but load-balanced RT domains. We
now have a reasonably strong RT balancer, and I'm looking at
implementing a full partitioned EDF scheduler somewhere in the future.

This could never be done using isolcpus.


I'm sure my thoughts reflect a gross underestimate of what really has to
happen. I will hope for the best and wait.

Regards
Mark

