Re: [PATCH v1] sched: fix nohz idle load balancer issues

From: Srivatsa Vaddagiri
Date: Wed Sep 28 2011 - 00:15:14 EST


* Suresh Siddha <suresh.b.siddha@xxxxxxxxx> [2011-09-27 16:49:36]:

> One of the reasons why we saw ilb_cpu not idle is probably because that
> info was stale.
>
> Consider this scenario.
>
> a. got a tick when the cpu was busy, so idle_at_tick was not set
> b. cpu went idle
> c. same cpu got the kick IPI from other busy cpu
> d. and as it has idle_at_tick not set, it couldn't proceed with the nohz
> idle balance.

Good point .. we could use idle_cpu() there instead ..

> I think we are most likely seeing the above mentioned scenario.
>
> Also Vatsa, there is a deadlock associated with using
> __smp_call_function_single() in nohz_balancer_kick(). So I am
> planning to remove the IPI that is used to kick the nohz balancer and
> instead use the resched_cpu logic to kick the nohz balancer.
>
> I will post this patch, most likely tomorrow. That patch will not use the
> idle_at_tick check in the nohz_idle_balance(). So that should address
> your issue in some cases if not most.

Ok .. would be glad to test your change .. I am however doubtful that it
will eliminate the rest of the issues I pointed out with the nohz load
balancer.

- vatsa