Re: [PATCH 09/10] sched/fair: disable stealing if too many NUMA nodes

From: Peter Zijlstra
Date: Mon Oct 22 2018 - 18:22:58 EST


On Mon, Oct 22, 2018 at 03:21:20PM -0400, Steven Sistare wrote:
> On 10/22/2018 2:47 PM, Steven Sistare wrote:
> > On 10/22/2018 1:06 PM, Peter Zijlstra wrote:
> >> On Mon, Oct 22, 2018 at 07:59:40AM -0700, Steve Sistare wrote:
> >>> The STEAL feature causes regressions on hackbench on larger NUMA systems,
> >>> so disable it on systems with more than sched_steal_node_limit nodes
> >>> (default 2).
> >>
> >> How come? From a quick read the stealing is per LLC, where do we steal
> >> across nodes?
> >
> > See the complete explanation in this patch. It is deeper than can be gleaned
> > from a quick read.
>
> I should have said a bit more. Your quick take on stealing is correct, we do
> not steal across nodes. However, stealing reduces average run queue length which
> influences wake_affine migrations. Now see the complete explanation.

Right; read a bit more just now.

hackbench is a fairly poor benchmark for NUMA performance. A better one
that comes to mind is the multi-warehouse SPECjbb stuff (assuming you
have NUMA balancing enabled, of course).