Re: regression/bisected commit 773688a6cb24b0b3c2ba40354d883348a2befa38 make my system completely unusable under high load

From: Marco Elver
Date: Fri Feb 02 2024 - 04:00:51 EST


On Thu, 1 Feb 2024 at 23:08, Mikhail Gavrilov
<mikhail.v.gavrilov@xxxxxxxxx> wrote:
>
> On Tue, Jan 30, 2024 at 4:14 AM Andrey Konovalov <andreyknvl@xxxxxxxxx> wrote:
> > Hi Mikhail,
> >
> > Please try to apply these two patches on top:
> > https://lore.kernel.org/linux-mm/20240129100708.39460-1-elver@xxxxxxxxxx/ [1]
> >
> > They effectively revert the change you mentioned.
> >
>
> I tried applying these patches on top of 6.8-rc2 and
> 6.8-git6764c317b6bb but performance unfortunately has not changed and
> is still on regression level.
> Maybe we can try something else?

That's strange - the patches at [1] definitely revert the change you
bisected to. It's possible there is some other strange side-effect. (I
assume that you are still running all this with a KASAN kernel.)

Just so I understand it right:
You say that before commit cc478e0b6bdffd20561e1a07941a65f6c8962cab the
game's FPS was good. But that is strange, because at that point we're
already doing stackdepot refcounting, i.e. after commit
773688a6cb24b0b3c2ba40354d883348a2befa38 which you reported as the
initial performance regression. The patches at [2] fixed that problem.

So now it's unclear to me how the simple change in
cc478e0b6bdffd20561e1a07941a65f6c8962cab causes the performance
problem, when in fact this is already with KASAN stackdepot
refcounting enabled but without the performance fixes from [1] and
[2].

[2] https://lore.kernel.org/all/20240118110216.2539519-2-elver@xxxxxxxxxx/

My questions now would be:
- What was the game's FPS in the last stable kernel (v6.7)?
- Can you collect another set of performance profiles for good and
  bad? They might show where the time in the kernel is spent.
- Could it be an inconclusive bisection?

Thanks,
-- Marco