Re: Litmus test for question from Al Viro

From: Will Deacon
Date: Mon Oct 05 2020 - 05:12:57 EST

Next message: Manivannan Sadhasivam: "[PATCH v3 0/5] Add PCIe support for SM8250 SoC"
Previous message: Jon Hunter: "Re: [Patch 1/2] cpufreq: tegra194: get consistent cpuinfo_cur_freq"
In reply to: Will Deacon: "Re: Litmus test for question from Al Viro"
Next in thread: Paul E. McKenney: "Re: Litmus test for question from Al Viro"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Oct 05, 2020 at 09:20:03AM +0100, Will Deacon wrote:
> On Sun, Oct 04, 2020 at 10:38:46PM -0400, Alan Stern wrote:
> > On Sun, Oct 04, 2020 at 04:31:46PM -0700, Paul E. McKenney wrote:
> > > Nice simple example! How about like this?
> > >
> > > Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > commit c964f404eabe4d8ce294e59dda713d8c19d340cf
> > > Author: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> > > Date: Sun Oct 4 16:27:03 2020 -0700
> > >
> > > manual/kernel: Add a litmus test with a hidden dependency
> > >
> > > This commit adds a litmus test that has a data dependency that can be
> > > hidden by control flow. In this test, both the taken and the not-taken
> > > branches of an "if" statement must be accounted for in order to properly
> > > analyze the litmus test. But herd7 looks only at individual executions
> > > in isolation, so fails to see the dependency.
> > >
> > > Signed-off-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > >
> > > diff --git a/manual/kernel/crypto-control-data.litmus b/manual/kernel/crypto-control-data.litmus
> > > new file mode 100644
> > > index 0000000..6baecf9
> > > --- /dev/null
> > > +++ b/manual/kernel/crypto-control-data.litmus
> > > @@ -0,0 +1,31 @@
> > > +C crypto-control-data
> > > +(*
> > > + * LB plus crypto-control-data plus data
> > > + *
> > > + * Result: Sometimes
> > > + *
> > > + * This is an example of OOTA and we would like it to be forbidden.
> > > + * The WRITE_ONCE in P0 is both data-dependent and (at the hardware level)
> > > + * control-dependent on the preceding READ_ONCE. But the dependencies are
> > > + * hidden by the form of the conditional control construct, hence the
> > > + * name "crypto-control-data". The memory model doesn't recognize them.
> > > + *)
> > > +
> > > +{}
> > > +
> > > +P0(int *x, int *y)
> > > +{
> > > +	int r1;
> > > +
> > > +	r1 = 1;
> > > +	if (READ_ONCE(*x) == 0)
> > > +		r1 = 0;
> > > +	WRITE_ONCE(*y, r1);
> > > +}
> > > +
> > > +P1(int *x, int *y)
> > > +{
> > > +	WRITE_ONCE(*x, READ_ONCE(*y));
> > > +}
> > > +
> > > +exists (0:r1=1)
> >
> > Considering the bug in herd7 pointed out by Akira, we should rewrite P1 as:
> >
> > P1(int *x, int *y)
> > {
> > int r2;
> >
> > r = READ_ONCE(*y);
>
> (r2?)
>
> > WRITE_ONCE(*x, r2);
> > }
> >
> > Other than that, this is fine.
>
> But yes, modulo the typo, I agree that this rewrite is much better than the
> proposal above. The definition of control dependencies on arm64 (per the Arm
> ARM [1]) isn't entirely clear that it provides order if the WRITE is
> executed on both paths of the branch, and I believe there are ongoing
> efforts to try to tighten that up. I'd rather keep _that_ topic separate
> from the "bug in herd" topic to avoid extra confusion.

Ah, now I see that you're changing P1 here, not P0. So I'm now nervous
about claiming that this is a bug in herd without input from Jade or Luc,
as it does unfortunately tie into the definition of control dependencies
and it could be a deliberate choice.

Jade, Luc: apparently herd doesn't emit a control dependency edge from
the READ_ONCE() to the WRITE_ONCE() in the following:

P0(int *x, int *y)
{
	int r1;

	r1 = 1;
	if (READ_ONCE(*x) == 0)
		r1 = 0;
	WRITE_ONCE(*y, r1);
}

Is that deliberate?

Setting the arm64 architecture aside for one moment, I think the Linux
memory model would very much like the control dependency to exist in this
case. Documenting the unexpected outcome is one thing, but I think it would
be much better to do it in a way where users can reason about whether or not
they're falling into this trap rather than warning them that the results may
be unreliable, which is not likely to build confidence in the tool.
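
To make the hidden dependency concrete, here is a minimal single-threaded
sketch in plain C. It is not part of the litmus test and says nothing about
what herd7 emits. The READ_ONCE()/WRITE_ONCE() definitions are simplified
stand-ins (an assumption for this illustration, not the kernel's macros), and
p0_data() and forms_agree() are hypothetical helpers. The sketch only shows
that the value P0 stores to y is a function of the value it read from x,
whether or not the source spells that out with a branch:

```c
/*
 * Simplified stand-ins for the kernel macros (assumption: volatile
 * accesses are enough for this single-threaded illustration).
 */
#define READ_ONCE(x)	 (*(const volatile int *)&(x))
#define WRITE_ONCE(x, v) (*(volatile int *)&(x) = (v))

/* P0 as written in the litmus test: the store follows the "if" on both paths. */
int p0_branch(int *x, int *y)
{
	int r1;

	r1 = 1;
	if (READ_ONCE(*x) == 0)
		r1 = 0;
	WRITE_ONCE(*y, r1);
	return r1;
}

/* Hypothetical branch-free form: the same value as an explicit data dependency. */
int p0_data(int *x, int *y)
{
	int r1;

	r1 = (READ_ONCE(*x) != 0);
	WRITE_ONCE(*y, r1);
	return r1;
}

/* Returns 1 if both forms store and return the same value for a given *x. */
int forms_agree(int xval)
{
	int x = xval, ya = -1, yb = -1;
	int ra = p0_branch(&x, &ya);
	int rb = p0_data(&x, &yb);

	return ra == rb && ya == yb;
}
```

Calling forms_agree() with zero and non-zero values returns 1 in each case,
which is why one would like the model to order the store after the load in
the branching form as well.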

Will
