Re: [PATCH v3] sched/topology: Improve load balancing on AMD EPYC

From: Peter Zijlstra
Date: Tue Jul 23 2019 - 10:09:25 EST


On Tue, Jul 23, 2019 at 02:03:21PM +0100, Mel Gorman wrote:
> On Tue, Jul 23, 2019 at 02:00:30PM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 23, 2019 at 12:42:48PM +0100, Mel Gorman wrote:
> > > On Tue, Jul 23, 2019 at 11:48:30AM +0100, Matt Fleming wrote:
> > > > Signed-off-by: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>
> > > > Cc: "Suthikulpanit, Suravee" <Suravee.Suthikulpanit@xxxxxxx>
> > > > Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > > > Cc: "Lendacky, Thomas" <Thomas.Lendacky@xxxxxxx>
> > > > Cc: Borislav Petkov <bp@xxxxxxxxx>
> > >
> > > Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > >
> > > The only caveat I can think of is that a future generation of Zen might
> > > take a different magic number than 32 as their remote distance. If or
> > > when this happens, it'll need additional smarts but lacking a crystal
> > > ball, we can cross that bridge when we come to it.
> >
> > I just suggested to Matt on IRC we could do something along these lines,
> > but we can do that later.
> >
>
> That would seem fair but I do think it's something that could be done
> later (maybe 1 release away?) to avoid a false bisection to this patch by
> accident.

Quite agreed; changing reclaim_distance like that will affect a lot of
hardware, while the current patch limits the impact to just AMD-Zen
based bits.

> I don't *think* there are any machines out there with a 1-hop
> distance of 14 but if there are, your patch would make a difference to
> MM behaviour. In the same context, it might make sense to rename the
> value to something reflective of the fact that "reclaim distance" affects
> scheduler placement. No good name springs to mind at the moment.

Yeah, naming sucks. Let us paint this bicycle shed blue!