Re: [patch V6 12/37] x86/entry: Provide idtentry_entry/exit_cond_rcu()

From: Paul E. McKenney
Date: Wed May 20 2020 - 14:05:50 EST


On Wed, May 20, 2020 at 09:51:17AM -0700, Andy Lutomirski wrote:
> On Wed, May 20, 2020 at 8:36 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > On Tue, May 19, 2020 at 7:23 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > On Tue, May 19, 2020 at 05:26:58PM -0700, Andy Lutomirski wrote:
> > > > On Tue, May 19, 2020 at 2:20 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> First, the patch as you submitted it is Acked-by: Andy Lutomirski
> <luto@xxxxxxxxxx>. I think there are cleanups that should happen, but
> I think the patch is correct.
>
> About cleanups, concretely: I think that everything that calls
> __idtenter_entry() is called in one of a small number of relatively
> sane states:
>
> 1. User mode. This is easy.
>
> 2. Kernel, RCU is watching, everything is sane. We don't actually
> need to do any RCU entry/exit pairs -- we should be okay with just a
> hypothetical RCU tickle (and IRQ tracing, etc). This variant can
> sleep after the entry part finishes if regs->flags & IF and no one
> turned off preemption.
>
> 3. Kernel, RCU is not watching, system was idle. This can only be an
> actual interrupt.
>
> So maybe the code can change to:
>
> if (user_mode(regs)) {
> 	enter_from_user_mode();
> } else {
> 	if (!__rcu_is_watching()) {
> 		/*
> 		 * If RCU is not watching then the same careful
> 		 * sequence vs. lockdep and tracing is required.
> 		 *
> 		 * This only happens for IRQs that hit the idle loop, and
> 		 * even that only happens if we aren't using the sane
> 		 * MWAIT-while-IF=0 mode.
> 		 */
> 		lockdep_hardirqs_off(CALLER_ADDR0);
> 		rcu_irq_enter();
> 		instrumentation_begin();
> 		trace_hardirqs_off_prepare();
> 		instrumentation_end();
> 		return true;
> 	} else {
> 		/*
> 		 * If RCU is watching then the combo function
> 		 * can be used.
> 		 */
> 		instrumentation_begin();
> 		trace_hardirqs_off();
> 		rcu_tickle();
> 		instrumentation_end();
> 	}
> }
> return false;
>
> This is exactly what you have except that the cond_rcu part is gone
> and I added rcu_tickle().
>
> Paul, the major change here is that if an IRQ hits normal kernel code
> (i.e. code where RCU is watching and we're not in an EQS), the IRQ
> won't call rcu_irq_enter() and rcu_irq_exit(). Instead it will call
> rcu_tickle() on entry and nothing on exit. Does that cover all the
> bases?

From an RCU viewpoint, yes, give or take my concerns about someone
putting rcu_tickle() on entry and rcu_irq_exit() on exit. Perhaps
I can bring some lockdep trickery to bear.
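
To make the pairing concern concrete, here is a toy userspace sketch. It is
not kernel code and not the lockdep check itself. Every name in it is
hypothetical: model_rcu_tickle() merely stands in for the rcu_tickle()
proposed above, and the counter is an assumed simplification of RCU's
interrupt nesting state.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model, not kernel code. Every name below is hypothetical;
 * model_rcu_tickle() stands in for the rcu_tickle() proposed in the thread.
 */
struct rcu_model {
	int irq_nesting;	/* model_rcu_irq_enter() calls not yet exited */
	int unmatched_exits;	/* exits seen with no enter to pair with */
};

/* Entry from a state where RCU is not watching: takes a nesting reference. */
static void model_rcu_irq_enter(struct rcu_model *m)
{
	m->irq_nesting++;
}

/* Entry from a state where RCU is already watching: takes no reference. */
static void model_rcu_tickle(struct rcu_model *m)
{
	(void)m;
}

/* Exit: drops a reference, or records the mismatch a debug check would flag. */
static void model_rcu_irq_exit(struct rcu_model *m)
{
	if (m->irq_nesting == 0) {
		m->unmatched_exits++;
		return;
	}
	m->irq_nesting--;
}

/* Balanced pairing: enter on entry, exit on exit. Returns true if clean. */
static bool model_paired_irq(void)
{
	struct rcu_model m = { 0, 0 };

	model_rcu_irq_enter(&m);
	model_rcu_irq_exit(&m);
	return m.irq_nesting == 0 && m.unmatched_exits == 0;
}

/* The worrying pairing: tickle on entry, exit on exit. Returns true if clean. */
static bool model_mismatched_irq(void)
{
	struct rcu_model m = { 0, 0 };

	model_rcu_tickle(&m);
	model_rcu_irq_exit(&m);
	return m.irq_nesting == 0 && m.unmatched_exits == 0;
}

/* Small self-check tying the two cases together. */
void model_selfcheck(void)
{
	assert(model_paired_irq());
	assert(!model_mismatched_irq());
}
```

In this model the mismatched case is caught only because the exit side
notices that it has nothing to undo. A real check would presumably need to
complain at the offending call site, which the sketch does not attempt.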

But I must defer to Thomas and Peter on the non-RCU/non-NO_HZ_FULL
portions of this.

Thanx, Paul