Re: [v7 0/8] Reduce cross CPU IPI interference

From: Chris Metcalf
Date: Thu Mar 01 2012 - 13:27:36 EST


(Sorry, away on vacation for a while, and just getting back to this thread.)

On 2/20/2012 8:34 PM, Frederic Weisbecker wrote:
> On Wed, Feb 15, 2012 at 04:50:39PM -0500, Chris Metcalf wrote:
>> The Tilera dataplane code is available on the "dataplane" branch (off of
>> 3.3-rc3 at the moment):
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git
>>
>> I'm still looking at Frederic's git tree, but I'm assuming the following
>> are all true of tasks that are running on a nohz cpuset core:
>>
>> - The core will not run the global scheduler to do work stealing, since
>> otherwise you can't guarantee that only tasks that care about userspace
>> nohz get to run there. (I suppose you could loosen this such that the core
>> would do work stealing as long as no task was pinned to that core by
>> affinity, at which point the pinned task would become the only runnable task.)
> A nohz cpuset doesn't really control that. It actually reacts to the scheduler
> actions. Like try to stop the tick if there is only one task on the runqueue,
> restart it when we have more.
>
> Ensuring the CPU doesn't get distracted is rather the role of the user by
> setting the right cpusets to get the desired affinity. And if we still have
> noise with workqueues or something, this is something we need to look at
> and fix on a case by case basis.

So won't it still be the case that the nohz cpus will try to run the global
scheduler to do load balancing? Or are you relying on something like the
idle load balancer functionality to do the load balancing externally? The
reason for isolcpus in the Tilera code is just to avoid having the
dataplane cpus ever end up having to schedule a tick just so they can do
load balancing work.
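
For reference, the policy Frederic describes in the quoted text (stop the
tick when exactly one task is on the runqueue, restart it when there are
more) can be sketched as a toy model. This is not code from either git tree:
toy_rq, toy_update_tick and the enqueue/dequeue helpers are made-up names,
and idle handling and load balancing are deliberately left out.

```c
#include <stdbool.h>

/* Toy per-CPU state. Hypothetical names, not real kernel structures. */
struct toy_rq {
    unsigned int nr_running;  /* runnable tasks on this CPU */
    bool tick_stopped;        /* true while the periodic tick is off */
};

/*
 * Re-evaluate the tick after any runqueue change. Exactly one runnable
 * task means the tick may be stopped; more than one means it must run.
 * The idle case (zero tasks) is out of scope here, so the model simply
 * leaves the tick running.
 */
void toy_update_tick(struct toy_rq *rq)
{
    rq->tick_stopped = (rq->nr_running == 1);
}

/* A task becomes runnable on this CPU. */
void toy_enqueue(struct toy_rq *rq)
{
    rq->nr_running++;
    toy_update_tick(rq);
}

/* A task leaves this CPU's runqueue. */
void toy_dequeue(struct toy_rq *rq)
{
    if (rq->nr_running > 0)
        rq->nr_running--;
    toy_update_tick(rq);
}
```

The model only reacts to runqueue changes. It says nothing about which cpu
triggers load balancing, which is the open question above.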

Frederic, do you have a design document, or anything else I can read other
than the code in your git tree? I still haven't found time to read the code,
though I'd definitely like to start figuring out how I can integrate your
stuff and the Tilera stuff into a single thing that both meets our
customers' needs, AND is actually integrated into the kernel.org master :-)

>> - Kernel "background" tasks are disabled on that core, at least while
>> userspace nohz tasks are running: softlockup watchdog, slab reap timer,
>> vmstat thread, etc.
> Yeah, those are examples of "noisy" things. Those are in fact separate issues
> that nohz cpusets don't touch. nohz cpusets are really only about trying to
> shut down the periodic tick, or defer it as far as possible into the future.
>
> Now the nohz cpuset uses some user/kernel entry/exit hooks that we can extend
> to cover some of these cases. We may want to make some timers "user-deferrable",
> ie: deactivate, reactivate them on kernel entry and exit.
>
> That needs some thinking though, this may not always be a win for every workload.
> But those that are userspace-mostly can profit.

Yes. The workload we are focused on (along with Gilad and some others) is
just the very simple one where we want to be able to have something go into
userspace, and get 100.000% of the cpu until the task itself takes some
action that requires kernel support.
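
On the user side, getting such a task onto a dedicated core is typically
done with cpusets or an explicit affinity call. A minimal sketch using the
glibc sched_getaffinity()/sched_setaffinity() wrappers follows. The helper
name pin_to_first_allowed_cpu is hypothetical, and picking the first allowed
cpu is only a placeholder for whatever core the application actually
reserves. Pinning by itself does not stop the tick or keep kernel background
work off the core.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Pin the calling thread to the first cpu in its current affinity mask.
 * Returns the cpu number on success, or -1 on failure.
 */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, one;

    /* pid 0 means "the calling thread". */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;

        /* Build a mask containing only this cpu and apply it. */
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        if (sched_setaffinity(0, sizeof(one), &one) != 0)
            return -1;
        return cpu;
    }
    return -1;  /* empty mask; should not happen in practice */
}
```

A dataplane-style task would call this once at startup and then stay in its
userspace loop, avoiding system calls on the fast path.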

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com
