Re: [PATCH] sched_rt: Use root_domain of rt_rq not current processor

From: Mike Galbraith
Date: Mon Jan 14 2013 - 22:55:23 EST


On Mon, 2013-01-14 at 11:55 -0600, Shawn Bohrer wrote:
> When the system has multiple domains, do_sched_rt_period_timer() can run
> on any CPU and may iterate over all rt_rq in cpu_online_mask. This
> means that when balance_runtime() is run for a given rt_rq, that rt_rq
> may be in a different rd than the current processor. Thus, if we use
> smp_processor_id() to get rd in do_balance_runtime(), we may borrow
> runtime from a rt_rq that is not part of our rd.
>
> This changes do_balance_runtime() to get the rd from the passed-in rt_rq,
> ensuring that we borrow runtime only from the correct rd for the given
> rt_rq.
>
> This fixes a BUG at kernel/sched/rt.c:687! in __disable_runtime() when we
> try to reclaim runtime lent to other rt_rq but runtime has been lent to
> a rt_rq in another rd.

Ah, so there was only one cpu in span when the bug fired in your case.
Good fix. Damn throttle bugs are at least as deadly as the bugs the
thing tries to protect you from. (Yet) another one bites the dust ;-)
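
For anyone reading this in the archives, here is a toy userspace model of
the difference between the two lookups. It is only a sketch, not the patch
and not kernel code: the struct layouts, NR_CPUS value, and the helpers
rd_of_current_cpu(), rd_of_rt_rq(), borrow_runtime() and
setup_two_domains() are all made up for illustration, and the borrowing
rule is simplified to "take a fixed amount from every other rt_rq in the
span". It assumes two root domains, CPUs {0,1} and {2,3}.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct root_domain {
	unsigned int span;		/* bitmask of CPUs in this domain */
};

struct rq {
	int cpu;
	struct root_domain *rd;
};

struct rt_rq {
	struct rq *rq;			/* owning runqueue */
	long rt_runtime;
};

struct root_domain rd_a, rd_b;
struct rq runqueues[NR_CPUS];
struct rt_rq rt_rqs[NR_CPUS];

/* Hypothetical setup: CPUs 0-1 in rd_a, CPUs 2-3 in rd_b. */
void setup_two_domains(void)
{
	rd_a.span = 0x3u;
	rd_b.span = 0xCu;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		runqueues[cpu].cpu = cpu;
		runqueues[cpu].rd = (cpu < 2) ? &rd_a : &rd_b;
		rt_rqs[cpu].rq = &runqueues[cpu];
		rt_rqs[cpu].rt_runtime = 100;
	}
}

/* Old behaviour: rd of whichever CPU the period timer happens to run on. */
struct root_domain *rd_of_current_cpu(int current_cpu)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	return runqueues[current_cpu].rd;
}

/* Fixed behaviour: rd of the rt_rq that is actually being balanced. */
struct root_domain *rd_of_rt_rq(const struct rt_rq *rt_rq)
{
	return rt_rq->rq->rd;
}

/*
 * Simplified borrowing: take `amount` from every other rt_rq in rd's span.
 * Returns the total runtime borrowed.
 */
long borrow_runtime(struct rt_rq *rt_rq, const struct root_domain *rd,
		    long amount)
{
	long got = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct rt_rq *iter = &rt_rqs[cpu];

		if (!(rd->span & (1u << cpu)) || iter == rt_rq)
			continue;
		iter->rt_runtime -= amount;
		rt_rq->rt_runtime += amount;
		got += amount;
	}
	return got;
}
```

With the timer running on CPU 2 and the rt_rq of CPU 0 being balanced,
rd_of_current_cpu(2) hands back rd_b, so the model borrows from CPUs 2 and
3, neither of which shares a domain with CPU 0. rd_of_rt_rq() hands back
rd_a and only CPU 1 lends. In the model, the first case leaves runtime
owed across domains, which is the situation the changelog says trips the
BUG when it is later reclaimed.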

> Signed-off-by: Shawn Bohrer <sbohrer@xxxxxxxxxxxxxxx>

That wants a Cc: stable@xxxxxxxxxxxxxxx too.

-Mike
