Re: [RFC][PATCH] sched: Have do_idle() call __schedule() without enabling preemption

From: Steven Rostedt
Date: Wed Apr 12 2017 - 10:41:36 EST


[ tl;dr; version ]

Peter, in order for synchronize_rcu_tasks() to work, the idle task must
never be preempted. There's a very small window in
schedule_preempt_disabled() that enables preemption, and when a
preemption lands in that window, it breaks synchronize_rcu_tasks() (see
above email for details).
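
To make the window concrete, here is a toy user-space sketch. It is not
kernel code. It assumes that schedule_preempt_disabled() drops the
preempt count, calls schedule(), and then disables preemption again, and
that schedule() disables preemption only around __schedule(). All of the
model_* names, irq_window() and count_windows() are hypothetical helpers
made up for this illustration, and the need_resched() loop is left out.

```c
#include <assert.h>

/* Toy model only: a counter standing in for the kernel's preempt count. */
static int preempt_count = 1;   /* the idle loop runs with preemption disabled */
static int preemptible_points;  /* spots where an interrupt could preempt us */

/* Hypothetical marker for "an interrupt could fire here". */
static void irq_window(void)
{
	if (preempt_count == 0)
		preemptible_points++;
}

static void model_preempt_disable(void) { preempt_count++; }
static void model_preempt_enable_no_resched(void) { preempt_count--; }

/* Stands in for __schedule(false): must run with preemption disabled. */
static void model_core_schedule(void)
{
	assert(preempt_count > 0);
	irq_window();
}

/* Stands in for schedule(): disables preemption only around the core. */
static void model_schedule(void)
{
	irq_window();                      /* still preemptible on entry */
	model_preempt_disable();
	model_core_schedule();
	model_preempt_enable_no_resched();
	irq_window();                      /* preemptible again on exit */
}

/* Stands in for schedule_preempt_disabled(): the path idle uses today. */
static void model_schedule_preempt_disabled(void)
{
	model_preempt_enable_no_resched();
	model_schedule();
	model_preempt_disable();
}

/* Stands in for the proposed schedule_idle(): never drops the count. */
static void model_schedule_idle(void)
{
	model_core_schedule();
}

/* Run one path from the idle state and count its preemptible points. */
static int count_windows(void (*path)(void))
{
	preempt_count = 1;
	preemptible_points = 0;
	path();
	return preemptible_points;
}
```

In this model, count_windows(model_schedule_preempt_disabled) reports
two preemptible points, one on each side of the core scheduler call,
while count_windows(model_schedule_idle) reports none.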

Is there any reason to enable preemption, or can we simply have idle
call into schedule without ever allowing it to be preempted, as in my
patch?

Note, it is almost good enough to just change
schedule_preempt_disabled() itself to do the same thing, but there's
one caller in kernel/locking/mutex.c that calls it in a non-running
state.

Signed-off-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3b31fc0..cb9ceda 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3502,6 +3502,13 @@ asmlinkage __visible void __sched schedule(void)
 }
 EXPORT_SYMBOL(schedule);

+void __sched schedule_idle(void)
+{
+	do {
+		__schedule(false);
+	} while (need_resched());
+}
+
 #ifdef CONFIG_CONTEXT_TRACKING
 asmlinkage __visible void __sched schedule_user(void)
 {
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index ac6d517..229c17e 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -264,7 +264,7 @@ static void do_idle(void)
 	smp_mb__after_atomic();

 	sched_ttwu_pending();
-	schedule_preempt_disabled();
+	schedule_idle();
 }

 bool cpu_in_idle(unsigned long pc)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 5cbf922..c5ee02b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1465,6 +1465,8 @@ static inline struct cpuidle_state *idle_get_state(struct rq *rq)
 }
 #endif

+extern void schedule_idle(void);
+
 extern void sysrq_sched_debug_show(void);
 extern void sched_init_granularity(void);
 extern void update_max_interval(void);