Re: broken suspend (sched related) [Was: 2.6.24-rc4-mm1]

From: Jiri Slaby
Date: Mon Dec 10 2007 - 03:55:50 EST


On 12/10/2007 09:19 AM, Gautham R Shenoy wrote:
> commit 15bfb662b35c609490185fba2fd4713d230b9374
> Author: Gautham R Shenoy <ego@xxxxxxxxxx>
> Date: Mon Dec 10 13:41:45 2007 +0530
>
> softlockup: remove get_online_cpus() which doesn't help here.
>
> The get_online_cpus() protection seems to be bogus
> in kernel/softlockup.c as cpu cached in check_cpu can go offline
> once we do a put_online_cpus().
>
> This can also cause deadlock during a cpu offline as follows:
>
> WATCHDOG_THREAD:                  OFFLINE_CPU:
>                                   mutex_down(&cpu_hotplug.lock);
>                                   /* All subsequent get_online_cpus
>                                    * will be blocked till we're
>                                    * done with this cpu-hotplug
>                                    * operation.
>                                    */
>
> get_online_cpus();
> /* watchdog is blocked
>    Thus we cannot
>    go further until
>    the cpu-hotplug
>    operation completes
>  */
>                                   CPU_DEAD:
>                                   kthread_stop(watchdog_thread);
>
>                                   /* we're trying to stop a
>                                    * thread which is blocked
>                                    * waiting for us to finish.
>                                    *
>                                    * Since we cannot finish until
>                                    * the thread stops, we deadlock here!
>                                    */
>
> Signed-off-by: Gautham R Shenoy <ego@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxxx>
> Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
> Cc: Jiri Slaby <jirislaby@xxxxxxxxx>

Tested-by: Jiri Slaby <jirislaby@xxxxxxxxx>

> diff --git a/kernel/softlockup.c b/kernel/softlockup.c
> index e50b44a..576eb9c 100644
> --- a/kernel/softlockup.c
> +++ b/kernel/softlockup.c
> @@ -219,9 +219,7 @@ static int watchdog(void *__bind_cpu)
>  		/*
>  		 * Only do the hung-tasks check on one CPU:
>  		 */
> -		get_online_cpus();
>  		check_cpu = any_online_cpu(cpu_online_map);
> -		put_online_cpus();
>  
>  		if (this_cpu != check_cpu)
>  			continue;
>  
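For anyone following along, here is a minimal userspace sketch of the ordering described in the changelog. It is only an analogy built on pthreads, not kernel code: hotplug_lock, stop_requested, offline_started, would_block, watchdog_thread() and offline_cpu() are all hypothetical names made up for illustration. The "old" watchdog uses pthread_mutex_trylock() so that the sketch records the point where the real code would sleep instead of actually hanging.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for cpu_hotplug.lock. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool stop_requested;   /* plays the role of kthread_should_stop() */
static atomic_bool offline_started;  /* set once the offline path holds the lock */
static atomic_int would_block;       /* counts spots where the watchdog would sleep */

/* arg points to a bool: true = old behaviour (take the lock), false = patched. */
static void *watchdog_thread(void *arg)
{
	bool take_lock = *(bool *)arg;

	/* Wait until the offline path already owns hotplug_lock. */
	while (!atomic_load(&offline_started))
		sched_yield();

	if (take_lock) {
		/*
		 * Old code: get_online_cpus() would sleep here until the
		 * hotplug operation finishes. trylock lets us record that
		 * instead of really deadlocking.
		 */
		if (pthread_mutex_trylock(&hotplug_lock) != 0)
			atomic_fetch_add(&would_block, 1);
		else
			pthread_mutex_unlock(&hotplug_lock);
	}

	/* Patched code gets straight here and can honour the stop request. */
	while (!atomic_load(&stop_requested))
		sched_yield();
	return NULL;
}

/*
 * Userspace analogue of the offline path. Returns how many times the
 * watchdog would have blocked on the hotplug lock, or -1 on error.
 */
static int offline_cpu(bool take_lock)
{
	pthread_t tid;
	int blocked;

	atomic_store(&stop_requested, false);
	atomic_store(&offline_started, false);
	atomic_store(&would_block, 0);

	if (pthread_create(&tid, NULL, watchdog_thread, &take_lock) != 0)
		return -1;

	pthread_mutex_lock(&hotplug_lock);    /* hotplug operation begins */
	atomic_store(&offline_started, true);

	/* CPU_DEAD: ask the watchdog to stop and wait for it. */
	atomic_store(&stop_requested, true);
	pthread_join(tid, NULL);

	blocked = atomic_load(&would_block);
	pthread_mutex_unlock(&hotplug_lock);  /* hotplug operation done */
	return blocked;
}
```

Calling offline_cpu(false) models the patched watchdog and reports no blocking point. Calling offline_cpu(true) models the old code and reports one: that is the spot where, in the kernel, the watchdog sleeps on the hotplug lock while kthread_stop() waits for it to exit.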