Re: + kthread-add-a-missing-memory-barrier-to-kthread_stop.patchadded to -mm tree

From: Linus Torvalds
Date: Sat Feb 23 2008 - 15:03:42 EST




On Sat, 23 Feb 2008, Dmitry Adamushko wrote:
>
> No, wmb is not enough. I've provided an explanation in the original thread.

Here, let me answer your explanation from this thread (lots snipped to
keep it concise):

First off:

> Actually, there seems to be _no_ problem at all, provided a task to be
> woken up is _not_ running on another CPU at the exact moment of
> wakeup.

Agreed. The runqueue spinlock will protect the case of the target actually
sleeping. So that's not the case that anybody has worried about.

So to look at the "concurrently running on another CPU case":

> to sum it up, for the following scheme to work:
>
> set_current_state(TASK_INTERRUPTIBLE); <--- here we have a smb_mb()
> if (condition)
> schedule();
>
> effectively => (1) MODIFY(current->state) ; (2) LOAD(condition)
>
> and a wake_up path must ensure access (LOAD or MODIFY) to the same
> data happens in the _reverse_ order:

Yes.

> condition = new;
> smb_mb();
> try_to_wake_up();
>
> => (1) MODIFY(condition); (2) LOAD(current->state)
>
> try_to_wake_up() does not need to be a full mb per se, the only
> requirement (and only for situation like above) is that there is a
> full mb between possible write ops. that have taken place before
> try_to_wake_up() _and_ a load of p->state inside try_to_wake_up().
>
> does it make sense #2 ? :-)

.. and this is why I think a smp_wmb() is sufficient.

The spinlock already guarantees that the load cannot move up past the
spinlock (that would make a spinlock pointless), and the smp_wmb()
guarantees that the store cannot move down past the spinlock.

Now, I realize that a smp_wmb() only protects a write against another
write, but if it's at the top of try_to_wake_up(), the "other write" in
question is the spinlock itself. So while the smp_wmb() doesn't enforce
the order between STORE(condition) and LOAD(curent->state) *directly*, it
does end up doing so through the interaction with the spinlock.

At least I think so. Odd CPU memory ordering can actually break the notion
of "causality" (see, for example, the fact that we actually need to have
"smp_read_barrier_depends()" on some architectures), which is one reason
why it's hard to really think about them. I think I'm better than most on
memory ordering, but I won't guarantee that there cannot be some
architecture that could in theory break that

store -> wmb -> spinlock -> read

chain some really odd way.

Note that the position of the wmb does matter: if it is *after* the
spinlock, then it has zero impact on the read. My argument literally
depends on the wmb serializing with the write of the spinlock itself.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/