Re: [GIT pull] locking/urgent for v5.10-rc6

From: Peter Zijlstra
Date: Tue Dec 01 2020 - 14:15:30 EST


On Tue, Dec 01, 2020 at 06:57:37PM +0000, Mark Rutland wrote:
> On Tue, Dec 01, 2020 at 07:15:06PM +0100, Peter Zijlstra wrote:
> > On Tue, Dec 01, 2020 at 03:55:19PM +0100, Peter Zijlstra wrote:
> > > On Tue, Dec 01, 2020 at 06:46:44AM -0800, Paul E. McKenney wrote:
> > >
> > > > > So after having talked to Sven a bit, the thing that is happening, is
> > > > > that this is the one place where we take interrupts with RCU being
> > > > > disabled. Normally RCU is watching and all is well, except during idle.
> > > >
> > > > Isn't interrupt entry supposed to invoke rcu_irq_enter() at some point?
> > > > Or did this fall victim to recent optimizations?
> > >
> > > It does, but the problem is that s390 is still using
> >
> > I might've been too quick there, I can't actually seem to find where
> > s390 does rcu_irq_enter()/exit().
> >
> > Also, I'm thinking the below might just about solve the current problem.
> > The next problem would then be it calling TRACE_IRQS_ON after it did
> > rcu_irq_exit()... :/
>
> I gave this patch a go under QEMU TCG atop v5.10-rc6 s390 defconfig with
> PROVE_LOCKING and DEBUG_ATOMIC_SLEEP. It significantly reduces the
> number of lockdep splats, but IIUC we need to handle the io_int_handler
> path in addition to the ext_int_handler path, and there's a remaining
> lockdep splat (below).

I'm amazed it didn't actually make things worse, given how I failed to
spot do_IRQ() was arch code etc.

> If this ends up looking like we'll need more point-fixes, I wonder if we
> should conditionalise the new behaviour of the core idle code under a
> new CONFIG symbol for now, and opt-in x86 and arm64, then transition the
> rest once they've had a chance to test. They'll still be broken in the
> meantime, but no more so than they previously were.

We can do that I suppose... :/