Re: [PATCH 1/3] sched: select_task_rq() should check cpu_active() like select_fallback_rq()

From: Oleg Nesterov
Date: Sun Oct 11 2015 - 14:07:38 EST


On 10/10, Oleg Nesterov wrote:
>
> I do not understand the cpu_active() check in select_fallback_rq().
> x86 doesn't need it, and the recent commit dd9d3843755d "sched: Fix
> cpu_active_mask/cpu_online_mask race" documents the fact that on any
> architecture we can ignore !active starting from CPU_ONLINE stage.
>
> But any possible reason why do we need this check in "fallback" must
> equally apply to select_task_rq().

And I still think this is true, select_task_rq() and select_fallback_rq()
should use the same check in any case...

> +static inline bool cpu_allowed(int cpu)
> +{
> + return cpu_online(cpu) && cpu_active(cpu);
> +}
...
> @@ -1390,7 +1391,7 @@ int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags)
> * not worry about this generic constraint ]
> */
> if (unlikely(!cpumask_test_cpu(cpu, tsk_cpus_allowed(p)) ||
> - !cpu_online(cpu)))
> + !cpu_allowed(cpu)))
> cpu = select_fallback_rq(task_cpu(p), p);

But as Fengguang reports (thanks a lot!) this change is wrong. It leads
to another BUG_ON(td->cpu != smp_processor_id()) before ht->park(td->cpu)
in smpboot_thread_fn().

I should have realized this. smpboot_park_threads() is called after
CPU_DOWN_PREPARE. And this can break other PF_NO_SETAFFINITY threads.

Perhaps I am totally confused, but to me this looks like another
indication that select_fallback_rq() should not check cpu_active(),
or at least this needs some changes...

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/