Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Paul E. McKenney
Date: Wed Aug 15 2007 - 15:07:58 EST


On Wed, Aug 15, 2007 at 11:25:05PM +0530, Satyam Sharma wrote:
> Hi Paul,
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> > > [...]
> > > No, I'd actually prefer something like what Christoph Lameter suggested,
> > > i.e. users (such as above) who want "volatile"-like behaviour from atomic
> > > ops can use alternative functions. How about something like:
> > >
> > > #define atomic_read_volatile(v) \
> > > ({ \
> > > forget(&(v)->counter); \
> > > ((v)->counter); \
> > > })
> >
> > Wouldn't the above "forget" the value, throw it away, then forget
> > that it forgot it, giving non-volatile semantics?
>
> Nope, I don't think so. I wrote the following trivial testcases:
> [ See especially tp4.c and tp4.s (last example). ]

Right. I should have said "wouldn't the compiler be within its rights
to forget the value, throw it away, then forget that it forgot it".
The value coming out of the #define above is an unadorned ((v)->counter),
which has no volatile semantics.

> ==============================================================================
> $ cat tp1.c # Using volatile access casts
>
> #define atomic_read(a) (*(volatile int *)&a)
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp1.c; cat tp1.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a
> .L2:
> movl a, %eax
> testl %eax, %eax
> jne .L2
> popl %ebp
> ret
> ==============================================================================
> $ cat tp2.c # Using nothing; gcc will optimize the whole loop away
>
> #define forget(x)
>
> #define atomic_read(a) \
> ({ \
> forget(&(a)); \
> (a); \
> })
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp2.c; cat tp2.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> popl %ebp
> movl $0, a
> ret
> ==============================================================================
> $ cat tp3.c # Using a full memory clobber barrier
>
> #define forget(x) asm volatile ("":::"memory")
>
> #define atomic_read(a) \
> ({ \
> forget(&(a)); \
> (a); \
> })
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp3.c; cat tp3.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a
> .L2:
> cmpl $0, a
> jne .L2
> popl %ebp
> ret
> ==============================================================================
> $ cat tp4.c # Using a forget(var) macro
>
> #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>
> #define atomic_read(a) \
> ({ \
> forget(a); \
> (a); \
> })
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp4.c; cat tp4.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a
> .L2:
> cmpl $0, a
> jne .L2
> popl %ebp
> ret
> ==============================================================================
>
> Possibly these were too trivial to expose any potential problems that you
> may have been referring to, so would be helpful if you could write a more
> concrete example / sample code.

The trick is to have a sufficiently complicated expression to force
the compiler to run out of registers. If the value is non-volatile,
it will refetch it (and expect it not to have changed, possibly being
disappointed by an interrupt handler running on that same CPU).

> > > Or possibly, implement these "volatile" atomic ops variants in inline asm
> > > like the patch that Sebastian Siewior has submitted on another thread just
> > > a while back.
> >
> > Given that you are advocating a change (please keep in mind that
> > atomic_read() and atomic_set() had volatile semantics on almost all
> > platforms), care to give some example where these historical volatile
> > semantics are causing a problem?
> > [...]
> > Can you give even one example
> > where the pre-existing volatile semantics are causing enough of a problem
> > to justify adding yet more atomic_*() APIs?
>
> Will take this to the other sub-thread ...

OK.

> > > Of course, if we find there are more callers in the kernel who want the
> > > volatility behaviour than those who don't care, we can re-define the
> > > existing ops to such variants, and re-name the existing definitions to
> > > somethine else, say "atomic_read_nonvolatile" for all I care.
> >
> > Do we really need another set of APIs?
>
> Well, if there's one set of users who do care about volatile behaviour,
> and another set that doesn't, it only sounds correct to provide both
> those APIs, instead of forcing one behaviour on all users.

Well, if the second set doesn't care, they should be OK with the volatile
behavior in this case.

Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/