Re: [PATCH v4 1/3] task_work_add: generic process-context callbacks

From: Linus Torvalds
Date: Fri Apr 13 2012 - 22:40:47 EST


This is seriously buggy:

On Fri, Apr 13, 2012 at 7:12 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> +void task_work_run(struct task_struct *task)
> +{
> +       struct hlist_head task_works;
> +       struct hlist_node *pos;
> +
> +       raw_spin_lock_irq(&task->pi_lock);
> +       hlist_move_list(&task->task_works, &task_works);
> +       raw_spin_unlock_irq(&task->pi_lock);
> +
> +       if (unlikely(hlist_empty(&task_works)))
> +               return;
> +       /*
> +        * We use hlist to save the space in task_struct, but we want fifo.
> +        * Find the last entry, the list should be short, then process them
> +        * in reverse order.
> +        */
> +       for (pos = task_works.first; pos->next; pos = pos->next)
> +               ;
> +
> +       for (;;) {
> +               struct hlist_node **pprev = pos->pprev;
> +               struct task_work *twork = container_of(pos, struct task_work,
> +                                                       hlist);
> +               twork->func(twork);
> +
> +               if (pprev == &task_works.first)
> +                       break;
> +               pos = container_of(pprev, struct hlist_node, next);
> +       }
> +}

No can do. You've removed the task-work entries from the process list,
and you no longer hold the spinlock that protects that list. That means
that you *cannot* access the task-work data structure any more, because
it may be long gone by the time you touch it.

Look at the users of this interface that you wrote yourself. They
allocate the task-work on the stack, and do a "task_work_cancel()"
before returning. That data structure is *gone*. You can't dereference
it any more.
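
To make that concrete, the pattern your own users follow is basically
this (untested sketch, and the helper names and the init/add/cancel
signatures are just how I remember your series, so don't take them
literally):

	static void my_callback(struct task_work *twork)
	{
		/* ... */
	}

	static int wait_for_event(struct task_struct *task)
	{
		struct task_work twork;	/* lives on this stack frame */

		init_task_work(&twork, my_callback, NULL);
		task_work_add(task, &twork, true);

		/* ... wait for something, maybe get interrupted ... */

		task_work_cancel(task, my_callback);
		return 0;
		/*
		 * 'twork' stops existing right here. If task_work_run()
		 * has already moved it onto its private list and is still
		 * walking that list, it is now chasing pointers through a
		 * dead stack frame.
		 */
	}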

So quite frankly, the only safe approach is to copy twork->func while
you are still holding the lock. And passing "twork" to the function
isn't safe either, as far as I can see, since it may be gone too.

Basically, *any* access of 'twork' after it is removed from the list
and you have released the task spinlock is unsafe, as far as I can
tell.
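
If you restructure the loop so that each entry is unhashed and its
->func copied while you still hold pi_lock, the run side at least never
touches freed memory itself. Totally untested sketch, and it ignores
the fifo ordering your comment talks about, just to show where the copy
has to happen:

	void task_work_run(struct task_struct *task)
	{
		for (;;) {
			struct hlist_node *pos;
			struct task_work *twork;
			void (*func)(struct task_work *);

			raw_spin_lock_irq(&task->pi_lock);
			pos = task->task_works.first;
			if (!pos) {
				raw_spin_unlock_irq(&task->pi_lock);
				break;
			}
			hlist_del(pos);
			twork = container_of(pos, struct task_work, hlist);
			func = twork->func;	/* copied under the lock */
			raw_spin_unlock_irq(&task->pi_lock);

			/*
			 * Handing 'twork' to func() is still only ok if
			 * you adopt an ownership rule like the one below;
			 * nothing here may dereference it any more.
			 */
			func(twork);
		}
	}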

Alternatively, you must make the rule be that the data can only be
freed by the caller *if* it was returned from "task_work_cancel()".
But then you can't allocate it on the stack any more, and have to
allocate it separately.
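
With that rule the users end up heap-allocating and looking something
like this instead (again totally untested, with the add/cancel
signatures from memory of your series, and assuming task_work_add()
returns an error when the target task is already past the point of
running its work):

	static void my_work_func(struct task_work *twork)
	{
		/* ... do the actual work ... */
		kfree(twork);		/* the callback owns it now */
	}

	static int queue_my_work(struct task_struct *task)
	{
		struct task_work *twork;
		int err;

		twork = kmalloc(sizeof(*twork), GFP_KERNEL);
		if (!twork)
			return -ENOMEM;
		init_task_work(twork, my_work_func, NULL);

		err = task_work_add(task, twork, true);
		if (err)
			kfree(twork);	/* never queued, still ours */
		return err;
	}

	static void cancel_my_work(struct task_struct *task)
	{
		struct task_work *twork = task_work_cancel(task, my_work_func);

		/* Free it only if the cancel actually handed it back to us */
		if (twork)
			kfree(twork);
	}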

Or you need to implement some kind of "task_work_cancel_sync()"
function that guarantees that it waits for the actual work function to
finish. And I don't know how you'd do that.

But as it is, this series looks seriously buggy.

Linus