Re: [RFC][PATCH 02/11] sched: Create preempt_count invariant

From: Frederic Weisbecker
Date: Tue Sep 29 2015 - 08:55:42 EST


On Tue, Sep 29, 2015 at 11:28:27AM +0200, Peter Zijlstra wrote:
> Ensure that upon scheduling preempt_count == 2; although currently an
> additional PREEMPT_ACTIVE is still possible.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> arch/x86/include/asm/preempt.h | 3 ++-
> include/asm-generic/preempt.h | 2 +-
> kernel/sched/core.c | 14 ++++++++++----
> 3 files changed, 13 insertions(+), 6 deletions(-)
>
> --- a/arch/x86/include/asm/preempt.h
> +++ b/arch/x86/include/asm/preempt.h
> @@ -31,7 +31,8 @@ static __always_inline void preempt_coun
> * must be macros to avoid header recursion hell
> */
> #define init_task_preempt_count(p) do { \
> - task_thread_info(p)->saved_preempt_count = PREEMPT_DISABLED; \
> + task_thread_info(p)->saved_preempt_count = \
> + 2*PREEMPT_DISABLE_OFFSET + PREEMPT_NEED_RESCHED; \
> } while (0)
>
> #define init_idle_preempt_count(p, cpu) do { \
> --- a/include/asm-generic/preempt.h
> +++ b/include/asm-generic/preempt.h
> @@ -24,7 +24,7 @@ static __always_inline void preempt_coun
> * must be macros to avoid header recursion hell
> */
> #define init_task_preempt_count(p) do { \
> - task_thread_info(p)->preempt_count = PREEMPT_DISABLED; \
> + task_thread_info(p)->preempt_count = 2*PREEMPT_DISABLED; \

Since it's not obvious why we use this magic value without looking at the
details of schedule_tail(), maybe add a short comment? Just
"/* see schedule_tail() */" would do.

> } while (0)
>
> #define init_idle_preempt_count(p, cpu) do { \
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2588,11 +2588,17 @@ asmlinkage __visible void schedule_tail(
> {
> struct rq *rq;
>
> - /* finish_task_switch() drops rq->lock and enables preemtion */
> - preempt_disable();
> - rq = finish_task_switch(prev);
> + /*
> + * Still have preempt_count() == 2, from:
> + *
> + * schedule()
> + * preempt_disable(); // 1
> + * __schedule()
> + * raw_spin_lock_irq(&rq->lock) // 2
> + */

I found that a bit confusing at first, because that's a preempt_count()
we actually emulate for a new task. Maybe something like:

+ /*
+ * A new task is initialized with preempt_count() == 2 because the
+ * prev task left us after:
+ *
+ * schedule()
+ * preempt_disable(); // 1
+ * __schedule()
+ * raw_spin_lock_irq(&rq->lock) // 2
+ */

> + rq = finish_task_switch(prev); /* drops rq->lock, preempt_count() == 1 */
> balance_callback(rq);
> - preempt_enable();
> + preempt_enable(); /* preempt_count() == 0 */
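
To illustrate the accounting described in the comments above, here is a small
user-space model. It is only a sketch, not kernel code: the sketch_* names and
the offset value of 1 are assumptions made for illustration, and the model
ignores PREEMPT_ACTIVE and the x86 PREEMPT_NEED_RESCHED bit entirely.

```c
#include <assert.h>

/* Assumed for illustration: one preempt_disable() level adds 1. */
#define SKETCH_DISABLE_OFFSET 1

/* Hypothetical stand-in for a task; the real counter lives elsewhere. */
struct sketch_task {
	int preempt_count;
};

/*
 * Models init_task_preempt_count(): a new task never ran schedule(),
 * so we fake the two levels the prev task held when it switched out.
 */
static void sketch_init_task(struct sketch_task *t)
{
	t->preempt_count = 2 * SKETCH_DISABLE_OFFSET;
}

/* Models finish_task_switch(): dropping rq->lock releases one level. */
static void sketch_finish_task_switch(struct sketch_task *t)
{
	assert(t->preempt_count == 2 * SKETCH_DISABLE_OFFSET);
	t->preempt_count -= SKETCH_DISABLE_OFFSET;
}

/* Models preempt_enable(): releases the level taken in schedule(). */
static void sketch_preempt_enable(struct sketch_task *t)
{
	assert(t->preempt_count >= SKETCH_DISABLE_OFFSET);
	t->preempt_count -= SKETCH_DISABLE_OFFSET;
}

/* Models schedule_tail() for a new task; returns the final count. */
static int sketch_schedule_tail(struct sketch_task *t)
{
	sketch_finish_task_switch(t);	/* count == 1 */
	sketch_preempt_enable(t);	/* count == 0 */
	return t->preempt_count;
}
```

Calling sketch_init_task() followed by sketch_schedule_tail() on the same task
leaves the count at 0, which is the state a new task should have when it first
runs with preemption enabled.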

Thanks!