Re: [PATCH] 9p/client: fix data race on req->status

From: Christian Schoenebeck
Date: Mon Dec 12 2022 - 08:41:08 EST


On Friday, December 9, 2022 10:12:41 PM CET Dominique Martinet wrote:
> Christian Schoenebeck wrote on Fri, Dec 09, 2022 at 02:45:51PM +0100:
> > > > What about p9_tag_alloc()?
> > >
> > > I think that one's ok: it happens during the allocation before the
> > > request is enqueued in the idr, so it should be race-free by defition.
> > >
> > > tools/memory-model/Documentation/access-marking.txt says
> > > "Initialization-time and cleanup-time accesses" should use plain
> > > C-language accesses, so I stuck to that.
> >
> > When it is allocated then it is safe, but the object may also come from a pool
> > here. It's probably not likely to cause an issue here, just saying.
>
> If it comes from the pool then it is gated by the refcount... But that
> would require a similar barrier indeed (init stuff, wmb, init refcount
> // get req + check refcount, rmb, read stuff e.g. tag); just a
> write_once would not help.
>
> For the init side I assume unlocking c->lock acts as a write barrier
> after tag is set, which is conveniently the last step, but we'd need a
> read barrier here in tag lookup:
> --------
> diff --git a/net/9p/client.c b/net/9p/client.c
> index fef6516a0639..68585ad9003c 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -363,6 +363,7 @@ struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag)
> */
> if (!p9_req_try_get(req))
> goto again;
> + smp_rmb();
> if (req->tc.tag != tag) {
> p9_req_put(c, req);
> goto again;
> --------
>
> OTOH this cannot happen with a normal server (a req should only be looked
> up after it has been sent to the server and comes back, which involves a
> few round trip and a few locks in the recv paths for tcp); but if syzbot
> tries hard enough I guess that could be hit...
> I don't have a strong opinion on this: I don't think anything really bad
> can happen here as long as the refcount is correct (status is read under
> lock when it matters before extra decrements of the refcount, and writes
> to the buffer itself are safe from a memory pov), even if it's obviously
> not correct strictly speaking.
> (And I have no way of measuring what impact that extra barrier would have
> tbh; for virtio at least lookup is actually never used...)

Yeah agreed, this was more of a theoretical issue. With the other memory
barrier patch posted by you already:

Reviewed-by: Christian Schoenebeck <linux_oss@xxxxxxxxxxxxx>