Re: [PATCH v2 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

From: Steven Rostedt
Date: Tue Apr 07 2015 - 08:42:04 EST


On Tue, 7 Apr 2015 14:04:03 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Tue, Apr 07, 2015 at 01:47:16PM +0200, Mike Galbraith wrote:
> > On Tue, 2015-04-07 at 13:23 +0200, Thomas Gleixner wrote:
> > > On Mon, 6 Apr 2015, Thavatchai Makphaibulchoke wrote:
> > >
> > > > This patch fixes the problem that the ownership of a mutex acquired
> > > > by an interrupt handler(IH) gets incorrectly attributed to the
> > > > interrupted thread.
> > >
> > > A hard interrupt handler is not allowed to take a mutex. End of
> > > story, nothing to fix here.
> >
> > Well, the patch that started this thread..
> >
> > timers-do-not-raise-softirq-unconditionally.patch
>
> Aah, that is the problem..
>

Yep, all this nonsense came from that patch and trying to get
NO_HZ_FULL working with -rt. It's a bit ironic that the push to get
NO_HZ_FULL into mainline came from our RT mini summit, but its
implementation is broken on -rt :-p

Ideally, we don't want to take mutexes in hard interrupt context.


> @@ -1454,8 +1452,32 @@ static void run_timer_softirq(struct softirq_action *h)
>   */
>  void run_local_timers(void)
>  {
> +	struct tvec_base *base = __this_cpu_read(tvec_bases);
> +
>  	hrtimer_run_queues();
> -	raise_softirq(TIMER_SOFTIRQ);
> +	/*
> +	 * We can access this lockless as we are in the timer
> +	 * interrupt. If there are no timers queued, nothing to do in
> +	 * the timer softirq.
> +	 */
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +	if (!spin_do_trylock(&base->lock)) {
> +		raise_softirq(TIMER_SOFTIRQ);
> +		return;
> +	}
> +#endif
> +	if (!base->active_timers)
> +		goto out;
> +
> +	/* Check whether the next pending timer has expired */
> +	if (time_before_eq(base->next_timer, jiffies))
> +		raise_softirq(TIMER_SOFTIRQ);
> +out:
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +	rt_spin_unlock_after_trylock_in_irq(&base->lock);
> +#endif
> +	/* The ; ensures that gcc won't complain in the !RT case */
> +	;
>  }
>
> That smells like something we should be able to do without a lock.
>
> If we use {READ,WRITE}_ONCE() on those two fields (->active_timers and
> ->next_timer) we should be able to do this without the spinlock.
>
> Races here aren't really a problem I think, if you manage to install a
> timer at the current jiffy and have already missed the tick you're in
> the same boat. You get to wait for the next tick.

I'll take a deeper look at this code too. If we can get rid of this
hack, then we don't need the mutex-in-irq hack either.
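
For reference, the lockless check Peter describes might look roughly
like the following. This is only a userspace model, not a patch: the
READ_ONCE()/WRITE_ONCE() macros are approximated with volatile accesses
(using the GNU __typeof__ extension), and tvec_base_model,
model_add_timer() and model_should_raise_softirq() are hypothetical
names made up for illustration. It also assumes that every update to
the two fields happens on the writer side, which in the kernel would
still be under base->lock.

```c
/*
 * Userspace model only, not kernel code. All *_model names are
 * hypothetical stand-ins for the fields and helpers in the quoted diff.
 */
#include <stdbool.h>

/* Approximations of the kernel macros: force a single, untorn access. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Wrap-safe "a <= b" for jiffies-style counters. */
bool time_before_eq_model(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

struct tvec_base_model {
	unsigned long active_timers;	/* number of queued timers */
	unsigned long next_timer;	/* earliest expiry, in jiffies */
};

/* Writer side: assumed to run with the base lock held in the kernel. */
void model_add_timer(struct tvec_base_model *base, unsigned long expires)
{
	unsigned long active = base->active_timers;

	if (!active || time_before_eq_model(expires, base->next_timer))
		WRITE_ONCE(base->next_timer, expires);
	WRITE_ONCE(base->active_timers, active + 1);
}

/* Tick side: decide without any lock whether the softirq is needed. */
bool model_should_raise_softirq(struct tvec_base_model *base,
				unsigned long now)
{
	if (!READ_ONCE(base->active_timers))
		return false;	/* nothing queued, nothing to do */

	return time_before_eq_model(READ_ONCE(base->next_timer), now);
}
```

With this shape the tick path never touches base->lock, so there is no
trylock from hard interrupt context. A reader that sees a stale value
just skips raising the softirq for this tick, which is the "wait for
the next tick" case described above.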

-- Steve
