[PATCH 0/2] sched/rt: Fix race with sched IPIs and root domain changes

From: Steven Rostedt
Date: Tue Jan 23 2018 - 20:56:53 EST


Hi Ingo and Peter,

Pavan reported that he was able to trigger a crash when doing CPU
hotplug and scheduling of real-time tasks. It came down to the
IPI logic. Here's the issue broken down:

- The RT overload mask and variables are associated with a root domain.

- The IPI logic uses the RT overload mask and variables.

- The scheduler (under rq->lock) kicks off the IPI logic.

- The IPI logic accesses the RT overload mask and variables without
the rq->lock held.

- The root domain can be modified while the rq->lock is not held.

What Pavan saw first came from taking the rq->rd->rto_lock and then
releasing it. There were cases where the unlock was done on a lock that
was not held. What happened was that during CPU hotplug, the rq->rd
changed while the IPIs were going around, so spin_lock(rq->rd->rto_lock)
took a different lock than the one spin_unlock(rq->rd->rto_lock) released.
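
To make the failure mode concrete, here is a minimal userspace sketch of
that pattern. The struct layouts, the bool standing in for the spinlock,
and the helper names are all hypothetical simplifications, not the
kernel code; the hotplug event is simulated by a plain pointer swap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a bool plays the role of rd->rto_lock. */
struct root_domain {
	bool rto_locked;
};

struct rq {
	struct root_domain *rd;
};

static void rto_lock(struct root_domain *rd)
{
	rd->rto_locked = true;
}

/* Returns false when asked to unlock a lock that was not held. */
static bool rto_unlock(struct root_domain *rd)
{
	bool was_held = rd->rto_locked;

	rd->rto_locked = false;
	return was_held;
}

/*
 * The problematic pattern: rq->rd is dereferenced once for the lock and
 * again for the unlock, and the pointer changes in between.
 */
static bool unlock_after_rd_swap(void)
{
	struct root_domain old_rd = { false }, new_rd = { false };
	struct rq rq = { &old_rd };
	bool unlock_ok;

	rto_lock(rq.rd);                /* takes old_rd's lock */
	rq.rd = &new_rd;                /* simulated hotplug changes rq->rd */
	unlock_ok = rto_unlock(rq.rd);  /* releases new_rd's lock instead */

	assert(old_rd.rto_locked);      /* old_rd's lock is never released */
	return unlock_ok;
}
```

In this sketch unlock_after_rd_swap() reports a failed unlock, and the
lock that was actually taken stays held.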

The first patch fixes that. Instead of using rq = this_rq() and then
accessing the root domain with rq->rd, we can simply grab the rd with
container_of(work), because the irq work is also associated with the
root domain.
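
As a rough illustration of the idea (not the patch itself), the sketch
below uses a userspace re-implementation of container_of() and
simplified stand-in structs. The field names are modelled on the kernel
but the layouts, rd_from_work() and work_keeps_its_rd() are assumptions
made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct irq_work {
	int pending;
};

struct root_domain {
	int rto_lock;                   /* placeholder for the real lock */
	struct irq_work rto_push_work;  /* embedded in its root domain */
};

struct rq {
	struct root_domain *rd;
};

/* Recover the owning root domain from the embedded work item. */
static struct root_domain *rd_from_work(struct irq_work *work)
{
	return container_of(work, struct root_domain, rto_push_work);
}

/*
 * Returns 1 if the work item still maps back to the root domain it was
 * queued from, even after rq->rd has been switched to another one.
 */
static int work_keeps_its_rd(void)
{
	struct root_domain old_rd = {0}, new_rd = {0};
	struct rq rq = { &old_rd };
	struct irq_work *work = &rq.rd->rto_push_work;

	rq.rd = &new_rd;                /* simulated root domain change */
	assert(rq.rd != &old_rd);       /* rq->rd no longer names old_rd */

	return rd_from_work(work) == &old_rd;
}
```

Because the work item lives inside the root domain, the lookup does not
depend on whatever rq->rd points to at the time the handler runs.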

Then another issue came up while discussing this. That is, not only
can rq->rd change, but the rd itself can be freed. Thus, to keep
the rd around, the second patch adds the sched_get_rd() and
sched_put_rd() interfaces, where the rt scheduler can up the ref count
of the root domain when it kicks off the IPIs, and then dec it when
there are no more IPIs to go around. sched_put_rd() will also free the
root domain if the ref count goes to zero, just like it does when being
released.
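
The get/put pairing can be sketched in userspace with C11 atomics. This
is only an illustration under stated assumptions: rd_alloc(), rd_get()
and rd_put() are hypothetical names, the struct carries nothing but the
ref count, and the object is freed immediately with free() rather than
through whatever deferred mechanism the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified root domain: only the ref count is modelled. */
struct root_domain {
	atomic_int refcount;
};

static struct root_domain *rd_alloc(void)
{
	struct root_domain *rd = malloc(sizeof(*rd));

	if (rd)
		atomic_init(&rd->refcount, 1);  /* the owner's reference */
	return rd;
}

/* Take an extra reference before handing rd to the IPI chain. */
static void rd_get(struct root_domain *rd)
{
	atomic_fetch_add(&rd->refcount, 1);
}

/* Drop a reference; returns true if this was the last one and rd was freed. */
static bool rd_put(struct root_domain *rd)
{
	if (atomic_fetch_sub(&rd->refcount, 1) != 1)
		return false;
	free(rd);
	return true;
}

/* Walk-through: the IPI chain keeps rd alive after the owner lets go. */
static bool ipi_chain_outlives_owner(void)
{
	struct root_domain *rd = rd_alloc();
	bool freed_by_owner, freed_by_ipi;

	if (!rd)
		return false;

	rd_get(rd);                               /* IPIs kicked off */
	assert(atomic_load(&rd->refcount) == 2);

	freed_by_owner = rd_put(rd);              /* owner releases rd */
	freed_by_ipi = rd_put(rd);                /* no more IPIs to go around */

	return !freed_by_owner && freed_by_ipi;
}
```

Whichever side drops the last reference does the freeing, so the order
in which the owner and the IPI chain finish does not matter.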

Please take these patches, unless you have any concerns with them.
If you do, let me know what those concerns are.

Thanks!

-- Steve



git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
tip/sched/core

Head SHA1: 4397f04575c44e1440ec2e49b6302785c95fd2f8
fa972f829b2ce71bb0ff58bb97e80dcd001ab66c


Steven Rostedt (VMware) (2):
sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()
sched/rt: Up the root domain ref count when passing it around via IPIs

----
kernel/sched/rt.c | 24 +++++++++++++++---------
kernel/sched/sched.h | 2 ++
kernel/sched/topology.c | 13 +++++++++++++
3 files changed, 30 insertions(+), 9 deletions(-)