Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

From: Paul E. McKenney
Date: Mon Nov 24 2014 - 15:54:52 EST


On Mon, Nov 24, 2014 at 12:22:13PM -0800, Andy Lutomirski wrote:
> On Sat, Nov 22, 2014 at 3:41 PM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, Nov 21, 2014 at 09:53:29PM -0800, Andy Lutomirski wrote:
> >> On Fri, Nov 21, 2014 at 8:20 PM, Paul E. McKenney
> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >> > On Fri, Nov 21, 2014 at 06:00:14PM -0800, Andy Lutomirski wrote:
> >> >> On Fri, Nov 21, 2014 at 3:38 PM, Paul E. McKenney
> >> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Returning state sounds like a bad idea, if we can reasonably avoid it.
>
> I agree, except that we already do it for exception_enter(), etc. But
> yes, changing fewer things is nice.
>
> >
> > And I think I finally see what you are pointing out about my code: If
> > another NMI comes in between the time I increment ->dynticks_nmi_nesting
> > and the time I atomically increment ->dynticks, the nested NMI handler
> > will incorrectly believe that RCU is already paying attention to this CPU.
> > Which would indeed not be at all good, so good catch!!!
> >
> >> Otherwise, I think that there may need to be enough state somewhere so
> >> that the outermost nested rcu_nmi_enter knows whether to increment
> >> dynticks. For example, dynticks_nmi_nesting could store the nesting
> >> count * 2 - (1 if the outermost nested user needs to increment
> >> dynticks). Something like:
> >>
> >> void rcu_nmi_enter(void)
> >> {
> >>         /* Be very careful -- this function may be called reentrantly on the
> >>          * same CPU. */
> >>         atomically: increment dynticks if it's even.
> >>
> >>         /* If an rcu_nmi_enter/rcu_nmi_exit pair happens here, then it will not
> >>          * change the state. */
> >>
> >>         local_inc(&dynticks_nmi_nesting, (we incremented dynticks ? 1 : 2));
> >>
> >>         WARN_ON(we incremented dynticks and dynticks_nmi_nesting was nonzero);
> >> }
> >>
> >> void rcu_nmi_exit(void)
> >> {
> >>         WARN_ON(!(dynticks & 1));
> >>         locally atomically: dynticks_nmi_nesting -= 2, unless
> >>         dynticks_nmi_nesting == 1, in which case set it to zero
> >>
> >>         if (dynticks_nmi_nesting was 1)
> >>                 atomic_inc(&dynticks);
> >> }
> >>
> >> The invariant here is that, for a single unnested enter/exit, if
> >> dynticks_nmi_nesting != 0, then dynticks is odd. As a result, an
> >> rcu_nmi_enter/rcu_nmi_exit pair at any time when dynticks_nmi_nesting
> >> != 0 *or* dynticks is odd will have no net effect, so the invariant,
> >> in fact, holds for all invocations, nested or otherwise.
> >>
> >> At least one of those conditions is true at all times during the
> >> execution of outermost pair, starting with the first atomic operation
> >> and ending with the final atomic_inc. So they nest properly no matter
> >> what else happens (unless, of course, someone else pokes dynticks in
> >> the middle).
> >>
> >> Thoughts?
> >
> > Let's see... The evenness of ->dynticks should be preserved by nested NMI
> > handlers, so the check and increment need not be atomic. We don't have
> > any way (other than atomic operations) to do local atomic modifications
> > on all architectures, because we cannot mask NMIs. (Yes, it can work
> > on x86, but this is common code that needs to work everywhere.) On the
> > other hand, presumably NMIs are rare, so atomic modification of the NMI
> > nesting counter should be OK, at least if it proves absolutely necessary.
> > And I am thinking that a mechanical proof will be needed here. :-/
> >
> > But first, let me try generating the code and informally evaluating it:
> >
> >  1 struct rcu_dynticks {
> >  2         long long dynticks_nesting;
> >  3         int dynticks_nmi_nesting;
> >  4         atomic_t dynticks;
> >  5 };
> >  6
> >  7 void rcu_nmi_enter(void)
> >  8 {
> >  9         struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > 10         int incby = 2;
> > 11
> > 12         if (!(atomic_read(&rdtp->dynticks) & 0x1)) {
> > 13                 smp_mb__before_atomic();
> > 14                 atomic_inc(&rdtp->dynticks);
> > 15                 smp_mb__after_atomic();
> > 16                 WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> > 17                 incby = 1;
>
> WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 1) here, perhaps?

That would make sense.

> > 18         }
> > 19         rdtp->dynticks_nmi_nesting += incby;
>
> Oh, I see why you don't need local_add -- it's because an nmi in the
> middle of this increment won't have any effect on the interrupted
> code, so even a software RMW will be okay.

Yep! ;-)
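
To make that concrete, here is a tiny single-threaded user-space sketch
(hypothetical, not kernel code) of why the plain "+=" tolerates a nested
NMI: the nested enter/exit pair nets to zero, so the interrupted
read-modify-write still stores the right value:

#include <assert.h>

static int dynticks_nmi_nesting;

int main(void)
{
	int incby = 2;
	int tmp = dynticks_nmi_nesting;		/* "read" half of the += */

	/* Nested NMI fires here: its enter/exit pair nets to zero. */
	dynticks_nmi_nesting += 2;		/* nested rcu_nmi_enter() */
	dynticks_nmi_nesting -= 2;		/* nested rcu_nmi_exit() */

	dynticks_nmi_nesting = tmp + incby;	/* "write" half of the += */
	assert(dynticks_nmi_nesting == 2);	/* as if no nested NMI */
	return 0;
}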

> > 20         barrier();
> > 21 }
> > 22
> > 23 void rcu_nmi_exit(void)
> > 24 {
> > 25         struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > 26
> > 27         WARN_ON_ONCE(!rdtp->dynticks_nmi_nesting);
> > 28         WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> > 29         if (rdtp->dynticks_nmi_nesting != 1) {
>
> WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 2), perhaps?

This is already implied by the WARN_ON_ONCE() on line 27 and the check
on line 29.

> > 30                 rdtp->dynticks_nmi_nesting -= 2;
> > 31                 return;
> > 32         }
> > 33         rdtp->dynticks_nmi_nesting = 0;
> > 34         smp_mb__before_atomic();
>
> This implies barrier(), right?

Yep!
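
(For reference, and going from memory so worth double-checking against the
headers: the x86 definition is literally just a compiler barrier, since x86
atomics already provide full ordering, while the generic fallback is
smp_mb(), which also implies barrier(). Either way, the compiler cannot
reorder across it:

/* x86 -- the atomic ops are already full barriers: */
#define smp_mb__before_atomic()	barrier()

/* generic fallback: */
#define smp_mb__before_atomic()	smp_mb()
)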

> > 35         atomic_inc(&rdtp->dynticks);
> > 36         smp_mb__after_atomic();
> > 37         WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
> > 38 }
> >
> > Line 9 picks up a pointer to this CPU's rcu_dynticks structure and line 10
> > assumes that we don't need to increment ->dynticks.
> >
> > Line 12 checks to see if ->dynticks is even. Note that this check is
> > stable: If there are nested NMIs, they will increment ->dynticks twice
> > or not at all, and either way preserves the evenness (to be proven, of
> > course, but that is the plan). If ->dynticks is even, lines 13-15
> > atomically increment it, line 16 complains if still even, and line 17
> > says we will increment ->dynticks_nmi_nesting by only 1.
> >
> > Either way, line 19 increments ->dynticks_nmi_nesting as needed and
> > line 20 keeps the compiler from getting too cute.
> >
> > For rcu_nmi_exit(), line 25 again picks up this CPU's rcu_dynticks
> > structure. Lines 27 and 28 complain bitterly if invariants are violated.
> > If line 29 finds that the value of ->dynticks_nmi_nesting is not 1,
> > then line 30 subtracts 2 from ->dynticks_nmi_nesting and line 31 returns.
> >
> > Otherwise, line 33 sets ->dynticks_nmi_nesting to zero, lines 34-36
> > atomically increment ->dynticks with full ordering, and line 37
> > complains bitterly if ->dynticks is not even.
> >
> > So, if an NMI occurs before rcu_nmi_enter's atomic increment, then the
> > nested NMI's rcu_nmi_enter() and rcu_nmi_exit() will think that they are
> > not nested, which is the correct thing for them to think in that case.
> > They will increment ->dynticks twice and restore ->dynticks_nmi_nesting
> > to zero (adding and then subtracting 1). If the NMI happens after the
> > atomic increment, then the nested rcu_nmi_enter() and rcu_nmi_exit()
> > will leave ->dynticks alone, and will restore ->dynticks_nmi_nesting
> > to zero (this time adding and subtracting two). If the NMI happens after
> > the increment of ->dynticks_nmi_nesting, the nested NMI's rcu_nmi_enter()
> > and rcu_nmi_exit() will again restore ->dynticks_nmi_nesting, but this
> > time to one (again adding and subtracting two).
> >
> > In rcu_nmi_exit(), ->dynticks_nmi_nesting of zero had better not happen,
> > one means we need to atomically increment ->dynticks, and other values
> > mean that we are partially or fully nested. Reasoning proceeds as for
> > rcu_nmi_enter(), but in the opposite direction.
> >
> > Whew! That might even work.
>
> I think I like this, with the warnings above.

OK with dropping the one that I called out as redundant?

> > But how about taking a different approach. Assuming that there can
> > never be more than (say) 14 nesting NMI-like things, use the lower
> > four bits of ->dynticks to represent the NMI nesting and the upper
> > 28 bits as the counter. This of course requires modifying lots of
> > places in RCU that check the counter, but it is probably time to
> > abstract the check anyway.
> >
> > This would allow my earlier attempted logic to work and (maybe) simplify
> > the reasoning a bit (and yes, the "magic" constants need macros):
> >
> > void rcu_nmi_enter(void)
> > {
> >         struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> >         int nesting = atomic_read(&rdtp->dynticks) & 0xf;
> >         int incby = 0x01;
> >
> >         WARN_ON_ONCE(nesting == 0xf);
> >         if (nesting == 0) {
> >                 if (atomic_read(&rdtp->dynticks) & 0x10)
> >                         return;
> >                 incby = 0x11;
> >         }
> >         smp_mb__before_atomic();
> >         atomic_add(incby, &rdtp->dynticks);
> >         smp_mb__after_atomic();
> >         WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
> > }
> >
> > void rcu_nmi_exit(void)
> > {
> >         struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> >         int nesting = atomic_read(&rdtp->dynticks) & 0xf;
> >         int incby = 0x0f;
> >
> >         if (nesting == 0)
> >                 return;
> >         if (nesting > 1)
> >                 incby = -1;
> >         smp_mb__before_atomic();
> >         atomic_add(incby, &rdtp->dynticks);
> >         smp_mb__after_atomic();
> >         WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
> > }
> >
> > Over to you! ;-)
>
> This latter one is all you :)

Well, let's see how I feel about it after trying a Promela model of
the first code sequence. ;-)
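
In the meantime, for anyone who wants to play along at home, here is a
quick single-threaded user-space approximation of that first sequence
(hypothetical helper names, no real atomics or NMIs, just checking that
nested enter/exit pairs preserve the evenness of ->dynticks and restore
->dynticks_nmi_nesting):

#include <assert.h>
#include <stdio.h>

static int dynticks;			/* stands in for atomic_t ->dynticks */
static int dynticks_nmi_nesting;

static void sim_nmi_enter(void)
{
	int incby = 2;

	if (!(dynticks & 0x1)) {	/* RCU was not watching this CPU */
		dynticks++;
		assert(dynticks & 0x1);
		incby = 1;
	}
	dynticks_nmi_nesting += incby;
}

static void sim_nmi_exit(void)
{
	assert(dynticks_nmi_nesting);
	assert(dynticks & 0x1);
	if (dynticks_nmi_nesting != 1) {
		dynticks_nmi_nesting -= 2;
		return;
	}
	dynticks_nmi_nesting = 0;
	dynticks++;
	assert(!(dynticks & 0x1));
}

int main(void)
{
	/* Outermost NMI from idle, with a nested NMI inside it. */
	sim_nmi_enter();		/* outermost: makes ->dynticks odd */
	sim_nmi_enter();		/* nested: leaves ->dynticks alone */
	sim_nmi_exit();
	sim_nmi_exit();
	assert(!(dynticks & 0x1) && dynticks_nmi_nesting == 0);

	/* NMI arriving while the CPU is already non-idle. */
	dynticks++;			/* pretend rcu_idle_exit() happened */
	sim_nmi_enter();
	sim_nmi_exit();
	assert((dynticks & 0x1) && dynticks_nmi_nesting == 0);

	printf("invariants hold\n");
	return 0;
}

Of course this says nothing about the interesting cases, namely NMIs
arriving in the middle of rcu_nmi_enter()/rcu_nmi_exit() themselves --
that is what the Promela model is for.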

Thanx, Paul

> --Andy
>
> >
> > Thanx, Paul
> >
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/