Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock

From: Oleg Nesterov
Date: Wed Dec 10 2014 - 12:51:50 EST


On 12/10, David Hildenbrand wrote:
>
> @@ -127,20 +119,16 @@ void put_online_cpus(void)
> {
> if (cpu_hotplug.active_writer == current)
> return;
> - if (!mutex_trylock(&cpu_hotplug.lock)) {
> - atomic_inc(&cpu_hotplug.puts_pending);
> - cpuhp_lock_release();
> - return;
> - }
> -
> - if (WARN_ON(!cpu_hotplug.refcount))
> - cpu_hotplug.refcount++; /* try to fix things up */
>
> - if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> - wake_up_process(cpu_hotplug.active_writer);
> - mutex_unlock(&cpu_hotplug.lock);
> - cpuhp_lock_release();
> + if (atomic_dec_and_test(&cpu_hotplug.refcount) &&
> + waitqueue_active(&cpu_hotplug.wq))
> + wake_up(&cpu_hotplug.wq);

OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier.

> void cpu_hotplug_begin(void)
> {
> + DEFINE_WAIT(wait);
> +
> cpu_hotplug.active_writer = current;
>
> - cpuhp_lock_acquire();
> for (;;) {
> + cpuhp_lock_acquire();

not sure I understand why did you move cpuhp_lock_acquire() into
the loop, but this is minor.

> mutex_lock(&cpu_hotplug.lock);
> - apply_puts_pending(1);
> - if (likely(!cpu_hotplug.refcount))
> + prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE);
> + if (likely(!atomic_read(&cpu_hotplug.refcount)))
> break;
> - __set_current_state(TASK_UNINTERRUPTIBLE);
> mutex_unlock(&cpu_hotplug.lock);
> + cpuhp_lock_release();
> schedule();
> }
> +
> + finish_wait(&cpu_hotplug.wq, &wait);
> }

This is subjective, but how about

static bool xxx(void)
{
mutex_lock(&cpu_hotplug.lock);
if (atomic_read(&cpu_hotplug.refcount) == 0)
return true;
mutex_unlock(&cpu_hotplug.lock);
return false;
}

void cpu_hotplug_begin(void)
{
cpu_hotplug.active_writer = current;

cpuhp_lock_acquire();
wait_event(&cpu_hotplug.wq, xxx());
}

instead?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/