Re: [PATCH] x86 bitops.h commentary on instruction reordering

From: Vladislav Bolkhovitin
Date: Tue Aug 10 2004 - 07:35:34 EST


Alan Cox wrote:

On Llu, 2004-08-09 at 21:12, Vladislav Bolkhovitin wrote:

Well, Marcelo, sorry if I'm getting too annoying, but we had a race with cache coherency during SCST (SCSI target mid-level) development. We discovered that on P4 Xeon after atomic_set() there is very small window, when atomic_read() on another CPUs returns the old value. We had to rewrite the code without using atomic_set(). Isn't it cache coherency issue?

atomic_set/atomic_read are _atomic_ operations. Nothing is said about
ordering. You get old or new but not half and half. Two atomic_inc's
will both occur and so on.

If you want ordering you need locks otherwise there is nothing defining
the time order of both processors.

How can you even measure such a window without locking to know what the
state of the processors is ?

We didn't measure it, we just had lockup. In our code in the commands serialization we have two paths: the regular (fast) one, where the atomic value is equal to current value and there are no locks used, because only one command can be on the fast path, and the slow one, where the atomic and the current values are different, so lock used to perform all necessary job. After the fast path the atomic value gets incremented together with some checks on it. We used to use atomic_set() here and sometimes had lockup.

And, BTW, returning to the original topic, would it be better to make set_bit() and friends guarantee not to be reordered on all architectures, instead of just add the comment. Otherwise, what is the


x86 and some other platforms have certain ordering guarantees. set_bit
doesn't guarantee them but it happens to unavoidably work for most
(ab)uses.

So, seems there is no difference between operations with and without "__" prefix (like set_bit() and __set_bit()) now? Maybe, it's worth to leave only one version of them?

right thing? In some places in SCST we heavy rely on non-ordering guarantees.


Then you will get burned on most hardware.

Designing that we also be guided by the comments, which turned out to be misleading :).

Actually, I feel it would be perfect if someone wrote a document, where he described various issues with using atomic variables and bit operations on various architectures, like some FAQ. For example, this thread has at least several good questions and answers. Also, I can add one more question: for an user it could be non-obvious why smp_mb__after_atomic_*() and friends needed. This document wouldn't be big, so it didn't take too much time to write it, but it would help a *lot* for people who wish to write portable code.

Thanks,
Vlad

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/