Re: [PATCH] nohz: Make tick_nohz_irq_exit() irq safe

From: Frederic Weisbecker
Date: Wed Feb 20 2013 - 17:01:25 EST


2013/2/20 Thomas Gleixner <tglx@xxxxxxxxxxxxx>:
> On Wed, 20 Feb 2013, Frederic Weisbecker wrote:
>
>> As it stands, irq_exit() may or may not be called with
>> irqs disabled, depending on __ARCH_IRQ_EXIT_IRQS_DISABLED
>> that the arch can define.
>>
>> It makes tick_nohz_irq_exit() unsafe. For example two
>> interrupts can race in tick_nohz_stop_sched_tick(): the inner
>> most one computes the expiring time on top of the timer list,
>> then it's interrupted right before reprogramming the
>> clock. The new interrupt enqueues a new timer list timer,
>> it reprogram the clock to take it into account and it exits.
>> The CPUs resumes the inner most interrupt and performs the clock
>> reprogramming without considering the new timer list timer.
>>
>> This regression has been introduced by:
>> 280f06774afedf849f0b34248ed6aff57d0f6908
>> ("nohz: Separate out irq exit and idle loop dyntick logic")
>>
>> Let's fix it right now with the appropriate protections.
>
> That's not a fix. That's an hack.

I know it looks that way. That's because it's a pure regression fix,
minimal for backportability.

I'm distinguishing two different things here: the fact that some archs
can call irq_exit() with interrupts enabled which is a global design
problem, and the fact that tick_nohz_irq_exit() was safe against that
until 3.2 when I broke it with a commit of mine.

My goal was basically to restore that protection in a minimal commit
such that we can backport the regression fix, then deal with
__ARCH_IRQ_EXIT_IRQS_DISABLED afterward, since it requires some more
invasive changes.

>> A saner long term solution will be to remove
>> __ARCH_IRQ_EXIT_IRQS_DISABLED.
>
> We really want to enforce that interrupt disabled condition for
> calling irq_exit(). So why make this exclusive to tick_nohz_irq_exit()?

I need a fix that I can backport. Is the below fine with a stable tag?
It looks a bit too invasive for the single regression involved.

Thanks.

>
> Thanks,
>
> tglx
>
> Index: linux-2.6/kernel/softirq.c
> ===================================================================
> --- linux-2.6.orig/kernel/softirq.c
> +++ linux-2.6/kernel/softirq.c
> @@ -322,18 +322,10 @@ void irq_enter(void)
>
> static inline void invoke_softirq(void)
> {
> - if (!force_irqthreads) {
> -#ifdef __ARCH_IRQ_EXIT_IRQS_DISABLED
> + if (!force_irqthreads)
> __do_softirq();
> -#else
> - do_softirq();
> -#endif
> - } else {
> - __local_bh_disable((unsigned long)__builtin_return_address(0),
> - SOFTIRQ_OFFSET);
> + else
> wakeup_softirqd();
> - __local_bh_enable(SOFTIRQ_OFFSET);
> - }
> }
>
> /*
> @@ -341,6 +333,14 @@ static inline void invoke_softirq(void)
> */
> void irq_exit(void)
> {
> +#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
> + unsigned long flags;
> +
> + local_irq_save(flags);
> +#else
> + BUG_ON(!irqs_disabled();
> +#endif
> +
> account_irq_exit_time(current);
> trace_hardirq_exit();
> sub_preempt_count(IRQ_EXIT_OFFSET);
> @@ -354,6 +354,9 @@ void irq_exit(void)
> #endif
> rcu_irq_exit();
> sched_preempt_enable_no_resched();
> +#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
> + local_irq_restore(flags);
> +#endif
> }
>
> /*
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/