Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head

From: Marco Elver
Date: Mon Feb 17 2020 - 13:38:17 EST


On Mon, 17 Feb 2020 at 19:23, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@xxxxxxxxxx wrote:
> > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > >
> > > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > > single potential store outside of rcu_tasks_kthread.
> > >
> > > This data race was reported by KCSAN. Not appropriate for backporting
> > > due to failure being unlikely.
> >
> > What failure is possible here? AFAICT this is (again) one of them
> > load-complare-against-constant-discard patterns that are impossible to
> > mess up.
>
> You mean that because we are only testing for NULL, so load/store tearing of
> rcu_tasks_cbs_head is not an issue right?
>
> I agree. Even with invented stores, worst case we have a false-wakeup and go
> right back to sleep. Or, we read a partial rcu_tasks_cbs_head, and then go
> acquire the lock and read the whole thing correctly under lock.
>
> I wonder if we can teach KCSAN to actually ignore this kind of situation so
> we don't need to employ READ_ONCE() for no reason. Basically ask it to not
> bother if the read was only NULL-testing. +Marco since it is KCSAN related.

This came up before. It requires somehow making the compiler tell us
what type of operation we're doing and in what context:
https://lore.kernel.org/lkml/CANpmjNNZQsatHexXHm4dXvA0na6r9xMgVD5R+-8d7VXEBRi32w@xxxxxxxxxxxxxx/

In particular:

> > This particular rule relies on semantic analysis that is beyond what
> > the TSAN instrumentation currently supports. Right now we support GCC
> > and Clang; changing the compiler probably means we'd end up with only
> > one (probably Clang), and many more years before the change has
> > propagated to the majority of used compiler versions. It'd be good if
> > we can do this purely as a change in the kernel's codebase.

Load/store tearing might not be an issue, but we also have to be aware
of things like load fusing, e.g. in a loop.

That being said, there may be ways to get similar results without yet
changing the compiler.

Thanks,
-- Marco

> thanks,
>
> - Joel
>
>
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > ---
> > > kernel/rcu/update.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > index 6c4b862..a27df76 100644
> > > --- a/kernel/rcu/update.c
> > > +++ b/kernel/rcu/update.c
> > > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> > > rhp->func = func;
> > > raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> > > needwake = !rcu_tasks_cbs_head;
> > > - *rcu_tasks_cbs_tail = rhp;
> > > + WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> > > rcu_tasks_cbs_tail = &rhp->next;
> > > raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> > > /* We can't create the thread unless interrupts are enabled. */
> > > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> > > /* If there were none, wait a bit and start over. */
> > > if (!list) {
> > > wait_event_interruptible(rcu_tasks_cbs_wq,
> > > - rcu_tasks_cbs_head);
> > > + READ_ONCE(rcu_tasks_cbs_head));
> > > if (!rcu_tasks_cbs_head) {
> > > WARN_ON(signal_pending(current));
> > > schedule_timeout_interruptible(HZ/10);
> > > --
> > > 2.9.5
> > >