Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state

From: Nicholas Piggin
Date: Tue Jul 28 2020 - 07:22:58 EST


Excerpts from peterz@xxxxxxxxxxxxx's message of July 26, 2020 10:11 pm:
> On Sun, Jul 26, 2020 at 02:14:34PM +1000, Nicholas Piggin wrote:
>> Excerpts from Peter Zijlstra's message of July 26, 2020 6:26 am:
>
>> > Which is 'funny' when it interleaves like:
>> >
>> > local_irq_disable();
>> > ...
>> > local_irq_enable()
>> > trace_hardirqs_on();
>> > <NMI/>
>> > raw_local_irq_enable();
>> >
>> > Because then it will undo the trace_hardirqs_on() we just did. With the
>> > result that both tracing and lockdep will see a hardirqs-disable without
>> > a matching enable, while the hardware state is enabled.
>>
>> Seems like an arch problem -- why not disable if it was enabled only?
>> I guess the local_irq tracing calls are a mess so maybe they copied
>> those.
>
> Because, as I wrote earlier, then we can miss updating software state.
> So your proposal has:
>
> raw_local_irq_disable()
> <NMI>
> if (!arch_irqs_disabled(regs->flags) // false
> trace_hardirqs_off();
>
> // tracing/lockdep still think IRQs are enabled
> // hardware IRQ state is disabled.

... and then lockdep_nmi_enter can disable IRQs if they were enabled?

The only reason it's done this way, as opposed to a much simpler counter
increment/decrement, AFAICS is to avoid some overhead of calling
trace_hardirqs_on/off (which seems a bit dubious, but let's go with it).
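
For illustration only, here is a minimal user-space sketch of what a
counter-based scheme could look like. It is not kernel code: the struct
and the depth_* helpers are hypothetical names made up for this example,
and it only models the software state, not the hardware flag.

```c
#include <stdbool.h>

/*
 * Hypothetical model: the software view of the IRQ state is a nesting
 * depth rather than a boolean. Depth 0 means "software thinks IRQs are on".
 */
struct irq_depth_model {
	unsigned int disable_depth;
};

/* Every "off" event increments, every "on" event decrements. */
static inline void depth_irqs_off(struct irq_depth_model *m)
{
	m->disable_depth++;
}

static inline void depth_irqs_on(struct irq_depth_model *m)
{
	m->disable_depth--;
}

static inline bool depth_irqs_enabled(const struct irq_depth_model *m)
{
	return m->disable_depth == 0;
}

/*
 * NMI entry/exit need no condition at all: they nest one level deeper
 * and unwind again, whatever the interrupted context was doing.
 */
static inline void depth_nmi_enter(struct irq_depth_model *m)
{
	depth_irqs_off(m);
}

static inline void depth_nmi_exit(struct irq_depth_model *m)
{
	depth_irqs_on(m);
}
```

In this model the NMI always leaves the depth exactly as it found it, so
it cannot undo or skip an update made by the interrupted context. The
price is that the counter is touched on every transition.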

In that case the lockdep_nmi_enter code is the right spot to clean up
that gap vs NMIs. I guess there's an argument that arch_nmi_enter could
do it. I don't see the problem with fixing it up here though: this is a
slow path, so it doesn't matter if we have some more logic for it.
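
As a rough sketch of that idea (one possible reading, not the actual
lockdep code), the NMI entry path could look at the software state
rather than regs->flags, force it to "off" if it was "on", and put it
back on exit. Everything below is a user-space toy model; struct
cpu_model and the model_* helpers are hypothetical names.

```c
#include <stdbool.h>

/* Toy model of one CPU: the hardware flag and the software copy of it. */
struct cpu_model {
	bool hw_irqs_enabled;	/* what the hardware flag says */
	bool sw_irqs_enabled;	/* what tracing/lockdep believe */
};

/* State saved across the NMI so that exit can undo what entry did. */
struct nmi_fixup {
	bool sw_was_enabled;
};

static inline void model_trace_hardirqs_off(struct cpu_model *c)
{
	c->sw_irqs_enabled = false;
}

static inline void model_trace_hardirqs_on(struct cpu_model *c)
{
	c->sw_irqs_enabled = true;
}

/*
 * NMI entry: inside the NMI, IRQs are off. If software still thinks they
 * are on, record that and mark them off. The test is on the software
 * state, so it does not matter which half of a local_irq_*() pair the
 * interrupted context had already executed.
 */
static inline struct nmi_fixup model_nmi_enter(struct cpu_model *c)
{
	struct nmi_fixup f = { .sw_was_enabled = c->sw_irqs_enabled };

	if (f.sw_was_enabled)
		model_trace_hardirqs_off(c);
	return f;
}

/* NMI exit: restore exactly the software state that entry found. */
static inline void model_nmi_exit(struct cpu_model *c, struct nmi_fixup f)
{
	if (f.sw_was_enabled)
		model_trace_hardirqs_on(c);
}
```

With both interleavings from earlier in the thread (NMI between
trace_hardirqs_on() and raw_local_irq_enable(), or between
raw_local_irq_disable() and trace_hardirqs_off()), the model sees
"off" inside the NMI and hands the interrupted context back the same
software state it had before.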

Thanks,
Nick