Re: [PATCH v2 1/2] sched: fix init NOHZ_IDLE flag

From: Vincent Guittot
Date: Tue Feb 19 2013 - 05:57:00 EST


On 19 February 2013 11:29, Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
> On 18 February 2013 16:40, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>> 2013/2/18 Vincent Guittot <vincent.guittot@xxxxxxxxxx>:
>>> On 18 February 2013 15:38, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>>>> I pasted the original at: http://pastebin.com/DMm5U8J8
>>>
>>> We can clear the idle flag only in nohz_kick_needed(), which will not
>>> be called if the sched_domain is NULL, so the sequence will be:
>>>
>>> = CPU 0 =                               = CPU 1 =
>>>
>>> detach_and_destroy_domain {
>>>   rcu_assign_pointer(cpu1_dom, NULL);
>>> }
>>>
>>> dom = new_domain(...) {
>>>   nr_cpus_busy = 0;
>>>   set_idle(CPU 1);
>>> }
>>>                                         dom = rcu_dereference(cpu1_dom)
>>>                                         //dom == NULL, return
>>>
>>> rcu_assign_pointer(cpu1_dom, dom);
>>>
>>>                                         dom = rcu_dereference(cpu1_dom)
>>>                                         //dom != NULL,
>>>                                         nohz_kick_needed {
>>>                                           set_idle(CPU 1)
>>>                                           dom = rcu_dereference(cpu1_dom)
>>>                                           //dec nr_cpus_busy,
>>>                                         }
>>>
>>> Vincent
>>
>> Ok but CPU 0 can assign NULL to the domain of cpu1 while CPU 1 is
>> already in the middle of nohz_kick_needed().
>
> Yes, nothing prevents the sequence below from occurring:
>
> = CPU 0 =                               = CPU 1 =
>                                         dom = rcu_dereference(cpu1_dom)
>                                         //dom != NULL
> detach_and_destroy_domain {
>   rcu_assign_pointer(cpu1_dom, NULL);
> }
>
> dom = new_domain(...) {
>   nr_cpus_busy = 0;
>   //nr_cpus_busy in the new_dom
>   set_idle(CPU 1);
> }
>                                         nohz_kick_needed {
>                                           clear_idle(CPU 1)
>                                           dom = rcu_dereference(cpu1_dom)
>                                           //cpu1_dom == old_dom
>                                           inc nr_cpus_busy,
>                                           //nr_cpus_busy in the old_dom
>                                         }
>
> rcu_assign_pointer(cpu1_dom, dom);
> //cpu1_dom == new_dom

The sequence above is not correct, and it also became unreadable
after going through gmail.

The correct and readable version is at:
https://pastebin.linaro.org/1750/

Vincent

>
> I'm not sure that this can happen in practice, because CPU1 is in
> an interrupt handler, but we don't have any mechanism to prevent the
> sequence.
>
> The NULL sched_domain can be used to detect this situation, and the
> set_cpu_sd_state_busy function can be modified as below:
>
> inline void set_cpu_sd_state_busy(void)
> {
>         struct sched_domain *sd;
>         int cpu = smp_processor_id();
> +       int clear = 0;
>
>         if (!test_bit(NOHZ_IDLE, nohz_flags(cpu)))
>                 return;
> -       clear_bit(NOHZ_IDLE, nohz_flags(cpu));
>
>         rcu_read_lock();
>         for_each_domain(cpu, sd) {
>                 atomic_inc(&sd->groups->sgp->nr_busy_cpus);
> +               clear = 1;
>         }
>         rcu_read_unlock();
> +
> +       if (likely(clear))
> +               clear_bit(NOHZ_IDLE, nohz_flags(cpu));
> }
>
> The NOHZ_IDLE flag will not be cleared if we have a NULL sched_domain
> attached to the CPU.
> With this implementation, we still don't need to get the sched_domain
> for testing the NOHZ_IDLE flag, which occurs each time the CPU becomes
> idle.
>
> Patch 2 becomes useless.
>
> Vincent
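
The behaviour proposed in the diff above can be tried out in user space.
What follows is only a rough sketch under stated assumptions: C11 atomics
stand in for the kernel's bitops and RCU, only the busy counter of each
domain level is modelled, and every toy_* name is hypothetical rather than
kernel API.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one sched_domain level (busy counter only). */
struct toy_domain {
    atomic_int nr_busy_cpus;
    struct toy_domain *parent;      /* next level up, NULL at the top */
};

/* Hypothetical per-CPU state. */
struct toy_cpu {
    atomic_bool nohz_idle;          /* models the NOHZ_IDLE bit */
    struct toy_domain *_Atomic sd;  /* models the RCU-protected pointer */
};

static inline void toy_domain_init(struct toy_domain *sd)
{
    atomic_init(&sd->nr_busy_cpus, 0);
    sd->parent = NULL;
}

/* Start in the state discussed above: flagged idle, no domain attached. */
static inline void toy_cpu_init(struct toy_cpu *cpu)
{
    atomic_init(&cpu->nohz_idle, true);
    atomic_init(&cpu->sd, (struct toy_domain *)NULL);
}

/* Models rcu_assign_pointer(cpu1_dom, dom). */
static inline void toy_attach(struct toy_cpu *cpu, struct toy_domain *sd)
{
    atomic_store(&cpu->sd, sd);
}

/*
 * Models the modified set_cpu_sd_state_busy(): the idle flag is cleared
 * only if at least one domain level was actually accounted.
 */
static inline void toy_set_busy(struct toy_cpu *cpu)
{
    bool clear = false;
    struct toy_domain *sd;

    if (!atomic_load(&cpu->nohz_idle))
        return;                     /* already accounted as busy */

    for (sd = atomic_load(&cpu->sd); sd != NULL; sd = sd->parent) {
        atomic_fetch_add(&sd->nr_busy_cpus, 1);
        clear = true;
    }

    if (clear)
        atomic_store(&cpu->nohz_idle, false);
}
```

Calling toy_set_busy() while no domain is attached leaves the flag set, so
a later call made after toy_attach() still increments the counter of the
newly attached domain. The sketch does not model RCU grace periods or the
race with a concurrent domain rebuild; it only shows the flag handling.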