Re: [RFC PATCH] LKMM: Add ctrl_dep() macro for control dependency
From: Alan Stern
Date: Wed Oct 13 2021 - 22:14:38 EST
On Wed, Oct 13, 2021 at 05:01:04PM -0700, Paul E. McKenney wrote:
> On Sun, Oct 10, 2021 at 04:02:02PM +0200, Florian Weimer wrote:
> > * Linus Torvalds:
> >
> > > On Fri, Oct 1, 2021 at 9:26 AM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
> > >>
> > >> Will any conditional branch do, or is it necessary that it depends in
> > >> some way on the data read?
> > >
> > > The condition needs to be dependent on the read.
> > >
> > > (Easy way to see it: if the read isn't related to the conditional or
> > > write data/address, the read could just be delayed to after the
> > > condition and the store had been done).
> >
> > That entirely depends on how the hardware is specified to work. And
> > the hardware could recognize certain patterns as always producing the
> > same condition codes, e.g., AND with zero. Do such tests still count?
> > It depends on what the specification says.
> >
> > What I really dislike about this: Operators like & and < now have side
> > effects, and is no longer possible to reason about arithmetic
> > expressions in isolation.
>
> Is there a reasonable syntax that might help with these issues?
>
> Yes, I know, we for sure have conflicting constraints on "reasonable"
> on copy on this email. What else is new? ;-)
>
> I could imagine a tag of some sort on the load and store, linking the
> operations that needed to be ordered. You would also want that same
> tag on any conditional operators along the way? Or would the presence
> of the tags on the load and store suffice?

Here's an easy cop-out.  Imagine a version of READ_ONCE that is
equivalent to:

	a normal READ_ONCE on TSO architectures,

	a load-acquire on more weakly ordered architectures.

Call it READ_ONCE_FOR_COND, for the sake of argument. Then as long as
people are careful to use READ_ONCE_FOR_COND when loading the values
that a conditional expression depends on, and WRITE_ONCE for the
important stores in the branches of the "if" statement, all
architectures will have the desired ordering. (In fact, if there are
multiple loads involved in the condition then only the last one has to
be READ_ONCE_FOR_COND; the others can just be READ_ONCE.)
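
To make the idea concrete, here is a rough userspace sketch using C11
atomics.  It is not kernel code: the macros below are stand-ins I made up
for illustration, the architecture test lists only x86 as a TSO example,
and the variables ready, limit, and data and the function
publish_if_ready() are hypothetical.  In the kernel the weakly ordered
case would presumably map to smp_load_acquire() instead.

```c
#include <stdatomic.h>

/* Userspace stand-ins for the kernel macros (assumption: relaxed atomics). */
#define READ_ONCE(x)      atomic_load_explicit(&(x), memory_order_relaxed)
#define WRITE_ONCE(x, v)  atomic_store_explicit(&(x), (v), memory_order_relaxed)

/*
 * Hypothetical READ_ONCE_FOR_COND: a plain READ_ONCE on TSO, a
 * load-acquire elsewhere.  Only x86 is listed here as a TSO example;
 * a real version would need a per-architecture definition.
 */
#if defined(__x86_64__) || defined(__i386__)
#define READ_ONCE_FOR_COND(x)  READ_ONCE(x)
#else
#define READ_ONCE_FOR_COND(x)  atomic_load_explicit(&(x), memory_order_acquire)
#endif

/* Hypothetical shared variables, zero-initialized. */
static atomic_int ready;
static atomic_int limit;
static atomic_int data;

/* Stores value to data if the condition holds; returns 1 if it stored. */
static int publish_if_ready(int value)
{
	/* Earlier load feeding the condition: plain READ_ONCE suffices. */
	int lim = READ_ONCE(limit);
	/* Last load feeding the condition: use READ_ONCE_FOR_COND. */
	int r = READ_ONCE_FOR_COND(ready);

	if (r && value < lim) {
		/* The important store in the branch uses WRITE_ONCE. */
		WRITE_ONCE(data, value);
		return 1;
	}
	return 0;
}
```

Note that a C11 relaxed load is not guaranteed to behave exactly like the
kernel's volatile-based READ_ONCE with respect to compiler
transformations, so this only shows the shape of the usage, not a proof
of the ordering.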

Of course, this is not optimal on non-TSO architectures.  That's why I
called it a cop-out.  But at least it is simple and easy.

Alan Stern