Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.

From: Andrew Morton
Date: Wed Jan 28 2009 - 14:48:05 EST


On Wed, 28 Jan 2009 23:32:28 +1030 Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:

> +static int do_work_on_cpu(void *unused)
> +{
> +	for (;;) {
> +		struct completion *done;
> +
> +		wait_event(woc_wq, current_work != NULL);
> +
> +		set_cpus_allowed_ptr(current, cpumask_of(current_work->cpu));
> +		WARN_ON(smp_processor_id() != current_work->cpu);
> +
> +		current_work->ret = current_work->fn(current_work->arg);
> +		/* Make sure ret is set before we complete(). Paranoia. */
> +		wmb();
> +
> +		/* Reset current_work so we don't spin. */
> +		done = &current_work->done;
> +		current_work = NULL;
> +
> +		/* Reset current_work for next work_on_cpu(). */
> +		complete(done);
> +	}
> +}
> +
> +/**
> + * work_on_cpu - run a function in user context on a particular cpu
> + * @cpu: the cpu to run on
> + * @fn: the function to run
> + * @arg: the function arg
> + *
> + * This will return the value @fn returns.
> + * It is up to the caller to ensure that the cpu doesn't go offline.
> + */
> +long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
> +{
> +	struct work_for_cpu work;
> +
> +	work.cpu = cpu;
> +	work.fn = fn;
> +	work.arg = arg;
> +	init_completion(&work.done);
> +
> +	mutex_lock(&woc_mutex);
> +	/* Make sure all is in place before it sees fn set. */
> +	wmb();
> +	current_work = &work;
> +	wake_up(&woc_wq);
> +
> +	wait_for_completion(&work.done);
> +	BUG_ON(current_work);
> +	mutex_unlock(&woc_mutex);
> +
> +	return work.ret;
> +}

We still have a queue - it's implicit now, rather than explicit.

It's vulnerable to the same deadlock, I think? Suppose we have:

- A lock, L

- A callback function which takes that lock, called function_which_takes_L()

- A task A which does work_on_cpu(function_which_takes_L)

- A task B which does

	lock(L);
	work_on_cpu(something_else);


Now,

- A calls work_on_cpu() and takes woc_mutex.

- Before function_which_takes_L() has started to execute, task B takes L
  then calls work_on_cpu() and task B blocks on woc_mutex.

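- Now function_which_takes_L() runs, and blocks on L, which B holds.
  B can't drop L until it gets woc_mutex, and A won't drop woc_mutex
  until function_which_takes_L() has completed.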

Nothing else happens...
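
To spell the cycle out, here is a small userspace sketch of the state
after those three steps, written as a wait-for graph. This is not kernel
code and not part of the patch: the actor names, struct wait_graph and
the helper functions are all made up for illustration. It only models
who is blocked on whom and checks whether the chain loops back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical actors in the scenario; the names are illustrative only. */
enum actor { TASK_A, TASK_B, WORKER, NR_ACTORS };
#define NOBODY (-1)

/*
 * waits_for[x] is the actor that x is blocked on, or NOBODY if x can run.
 * Each actor blocks on at most one thing at a time, so one edge is enough.
 */
struct wait_graph {
	int waits_for[NR_ACTORS];
};

/* Returns true if following the wait edges from 'start' leads back to it. */
bool in_wait_cycle(const struct wait_graph *g, int start)
{
	int cur = g->waits_for[start];
	int steps;

	for (steps = 0; steps < NR_ACTORS && cur != NOBODY; steps++) {
		assert(cur >= 0 && cur < NR_ACTORS);
		if (cur == start)
			return true;
		cur = g->waits_for[cur];
	}
	return false;
}

/* The state described above, after the third step. */
struct wait_graph deadlocked_state(void)
{
	struct wait_graph g;

	g.waits_for[TASK_A] = WORKER;	/* holds woc_mutex, waits on work.done */
	g.waits_for[TASK_B] = TASK_A;	/* holds L, waits on woc_mutex */
	g.waits_for[WORKER] = TASK_B;	/* runs function_which_takes_L(), waits on L */
	return g;
}

/* Assumed variant: B does not hold L when it calls work_on_cpu(). */
struct wait_graph b_unlocks_first(void)
{
	struct wait_graph g;

	g.waits_for[TASK_A] = WORKER;	/* still waits on work.done */
	g.waits_for[TASK_B] = TASK_A;	/* still queued behind woc_mutex */
	g.waits_for[WORKER] = NOBODY;	/* L is free, so the callback can run */
	return g;
}
```

in_wait_cycle() reports a cycle for every actor in deadlocked_state()
and for none in b_unlocks_first(), which is the point: woc_mutex turns
"B waits its turn" into an edge of the lock graph.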