Re: [PATCH] sched/numa: use runnable_avg to classify node

From: Mel Gorman
Date: Thu Aug 27 2020 - 14:22:28 EST


On Thu, Aug 27, 2020 at 05:43:11PM +0200, Vincent Guittot wrote:
> > The testing was a mixed bag of wins and losses but wins more than it
> > loses. Biggest loss was a 9.04% regression on nas-SP using openmp for
> > parallelisation on Zen1. Biggest win was around 8% gain running
> > specjbb2005 on Zen2 (with some major gains of up to 55% for some thread
> > counts). Most workloads were stable across multiple Intel and AMD
> > machines.
> >
> > There were some oddities in changes in NUMA scanning rate but that is
> > likely a side-effect because the locality over time for the same loads
> > did not look obviously worse. There was no negative result I could point
> > at that was not offset by a positive result elsewhere. Given it's not
> > a universal win or loss, matching numa and lb balancing as closely as
> > possible is best so
> >
> > Reviewed-by: Mel Gorman <mgorman@xxxxxxx>
>
> Thanks.
>
> I will try to reproduce the nas-SP test on my setup to see what is going on
>

You can try but you might be chasing ghosts. Please note that this nas-SP
observation was only on Zen1 and only for C-class with OMP. The other
machines tested for the same class with OMP were fine (including Zen2). Even
D-class on the same machine with OMP was fine, as was MPI in both cases. The
bad result indicated that NUMA scanning and faulting were higher but that
is more likely to be a problem with NUMA balancing than your patch.

In the five iterations, two showed a large spike in scan rate towards
the end of the iteration but the other three did not. The scan rate
was also not consistently high, so there is a degree of luck involved with
SP specifically and there is no consistent penalty as a result of
your patch.
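
If you want to watch for the same kind of spike while reproducing, a rough
sketch along these lines may help. It is not the tooling used for the results
above; it only assumes a kernel built with NUMA balancing so that the
numa_* counters appear in /proc/vmstat. The helper names (parse_vmstat,
rates, sample) are made up for illustration.

```python
# Sketch only: derive per-second NUMA balancing activity from two
# snapshots of /proc/vmstat. Helper names are hypothetical.

NUMA_KEYS = (
    "numa_pte_updates",        # PTEs marked for NUMA hinting faults (scanning)
    "numa_hint_faults",        # NUMA hinting faults taken
    "numa_hint_faults_local",  # hinting faults that were node-local
    "numa_pages_migrated",     # pages migrated by NUMA balancing
)


def parse_vmstat(text):
    """Return the NUMA balancing counters found in vmstat-style text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in NUMA_KEYS:
            counters[parts[0]] = int(parts[1])
    return counters


def rates(before, after, seconds):
    """Per-second change for every counter present in both snapshots."""
    return {
        key: (after[key] - before[key]) / seconds
        for key in before
        if key in after
    }


def sample(path="/proc/vmstat"):
    """Read and parse the counters from a live system."""
    with open(path) as f:
        return parse_vmstat(f.read())
```

Calling sample() at a fixed interval during each iteration and passing
consecutive snapshots to rates() should make a late spike in
numa_pte_updates visible if it occurs.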

The only thing to be aware of is that, once it's merged, this patch might
show up in bisections for both performance gains and losses.

--
Mel Gorman
SUSE Labs