Re: [PATCH] net: 9p: fix possible refcount leak in p9_read_work() and recv_done()

From: Hangyu Hua
Date: Mon Jul 11 2022 - 23:24:48 EST


On 2022/7/11 15:39, asmadeus@xxxxxxxxxxxxx wrote:
Hangyu Hua wrote on Mon, Jul 11, 2022 at 02:59:07PM +0800:
A ref got in p9_tag_lookup needs to be put when functions enter the
error path.

Fix this by adding p9_req_put in error path.

I wish it was that simple.

Did you actually observe a leak? >
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 8f8f95e39b03..c4ccb7b9e1bf 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -343,6 +343,7 @@ static void p9_read_work(struct work_struct *work)
p9_debug(P9_DEBUG_ERROR,
"No recv fcall for tag %d (req %p), disconnecting!\n",
m->rc.tag, m->rreq);
+ p9_req_put(m->rreq);
m->rreq = NULL;
err = -EIO;
goto error;
@@ -372,6 +373,8 @@ static void p9_read_work(struct work_struct *work)
"Request tag %d errored out while we were reading the reply\n",
m->rc.tag);
err = -EIO;
+ p9_req_put(m->rreq);
+ m->rreq = NULL;
goto error;
}
spin_unlock(&m->client->lock);


for tcp, we still have that request in m->req_list, so we should be
calling p9_client_cb which will do the p9_req_put in p9_conn_cancel.

If you do it here, you'll get a refcount overflow and use after free.



That's a little weird. If you are right, the three return paths of this function are inconsistent with the handling of refcount.

static void p9_read_work(struct work_struct *work)
{
...
if ((m->rreq) && (m->rc.offset == m->rc.capacity)) {
p9_debug(P9_DEBUG_TRANS, "got new packet\n");
m->rreq->rc.size = m->rc.offset;
spin_lock(&m->client->lock);
if (m->rreq->status == REQ_STATUS_SENT) {
list_del(&m->rreq->req_list);
p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD); <---- [1]
} else if (m->rreq->status == REQ_STATUS_FLSHD) {
/* Ignore replies associated with a cancelled request. */
p9_debug(P9_DEBUG_TRANS,
"Ignore replies associated with a cancelled request\n"); <---- [2]
} else {
spin_unlock(&m->client->lock);
p9_debug(P9_DEBUG_ERROR,
"Request tag %d errored out while we were reading the reply\n",
m->rc.tag);
err = -EIO;
goto error; <---- [3]
}
spin_unlock(&m->client->lock);
m->rc.sdata = NULL;
m->rc.offset = 0;
m->rc.capacity = 0;
p9_req_put(m->rreq); <---- [4]
m->rreq = NULL;
}
...
error:
p9_conn_cancel(m, err); <---- [5]
clear_bit(Rworksched, &m->wsched);
}

There are three return paths here, [1] and [2] and [3].

[1]: m->rreq will be put twice in [1] and [4]. And m->rreq will be deleted from the m->req_list in [1].

[2]: m->rreq will be put in [4]. And m->rreq will not be deleted from m->req_list.

[3]: m->rreq will be put in [5]. And m->rreq will be deleted from the m->req_list in [5].

If p9_tag_lookup keep the refcount of req which is in m->req_list. There will be a double put in return path [1] and a potential UAF in return path [2]. And this also means a req in m->req_list without getting refcount before p9_tag_lookup.

static void p9_write_work(struct work_struct *work)
{
...
list_move_tail(&req->req_list, &m->req_list);

m->wbuf = req->tc.sdata;
m->wsize = req->tc.size;
m->wpos = 0;
p9_req_get(req);
...
}

But if you check out p9_write_work, a refcount already get after list_move_tail. We don't need to rely on p9_tag_lookup to keep a list's refcount. Whatsmore, code comments in p9_tag_alloc also proves that the refcount get by p9_tag_lookup is a temporary refcount.

So i still think there may be a refcount leak.

diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 88e563826674..82b5d6894ee2 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -317,6 +317,7 @@ recv_done(struct ib_cq *cq, struct ib_wc *wc)
/* Check that we have not yet received a reply for this request.
*/
if (unlikely(req->rc.sdata)) {
+ p9_req_put(req);
pr_err("Duplicate reply for request %d", tag);
goto err_out;
}

This one isn't as clear cut, I see that they put the client in a
FLUSHING state but nothing seems to acton on it... But if this happens
we're already in the use after free realm -- it means rc.sdata was
already set so the other thread could be calling p9_client_cb anytime if
it already hasn't, and yet another thread will then do the final ref put
and free this.
We shouldn't free this here as that would also be an overflow. The best
possible thing to do at this point is just to stop using that pointer.


But p9_tag_lookup have a lock inside. Doesn't this mean p9_tag_lookup won't return a freed req? Otherwise we should fix the lock to avoid falling into the use after free realm.

Thanks,
Hangyu

If you actually run into a problem with these refcounts (should get a
warning on umount that something didn't get freed) then by all mean
let's look further into it, but please don't send such patches without
testing the error paths you're "fixing" -- I'm pretty sure a reproducer
to hit these paths would bark errors in dmesg as refcount has an
overflow check.

--
Dominique