Re: [PATCH v2] nvme-multipath: Early exit if no path is available

From: Chao Leng
Date: Thu Jan 28 2021 - 20:19:31 EST




On 2021/1/28 17:23, Hannes Reinecke wrote:
On 1/28/21 10:18 AM, Chao Leng wrote:


On 2021/1/28 15:58, Daniel Wagner wrote:
On Thu, Jan 28, 2021 at 09:31:30AM +0800, Chao Leng wrote:
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
       }
       for (ns = nvme_next_ns(head, old);
-         ns != old;
+         ns && ns != old;
nvme_round_robin_path is only called when "old" is not NULL.
nvme_next_ns should not return NULL when "old" is not NULL.
It seems unnecessary to add the check for "ns".

The problem is when we enter nvme_round_robin_path() and there is no
path available. In this case the initialization ns = nvme_next_ns(head,
old) could return a NULL pointer.
"old" should not be NULL, so there is at least one path, which is "old".
It is impossible for nvme_next_ns(head, old) to return NULL.

No. list_next_or_null_rcu()/list_first_or_null_rcu() will return NULL when the end of the list is reached.
Although list_next_or_null_rcu()/list_first_or_null_rcu() may return
NULL, nvme_next_ns(head, old) assumes that "old" is in "head", so
nvme_next_ns(head, old) should not return NULL. If "old" is not in
"head", nvme_next_ns(head, old) will behave abnormally.
So there must be another bug that causes nvme_next_ns(head, old) to
behave abnormally.

I reviewed the code around head->list and head->current_path and found
2 bugs that may cause nvme_next_ns(head, old) to behave abnormally:
First, I have already sent a patch, see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengchao@xxxxxxxxxx/
Second, in nvme_ns_remove, list_del_rcu is called before
nvme_mpath_clear_current_path. This may cause "old" to be deleted from
"head" while "old" is still in use. I'm not sure whether there is any
other consideration here; I will check it and try to fix it.
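
To make the second case concrete, here is a small userspace sketch of
how I understand nvme_next_ns(head, old) to behave. It is not the
kernel code: struct list_node, struct ns, struct ns_head, next_ns() and
the other helpers are simplified stand-ins that I made up for
illustration, and there is no RCU involved. The only assumption carried
over from the kernel is that list_del_rcu() unlinks an entry but leaves
its next pointer intact.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical, reduced versions of nvme_ns and nvme_ns_head. */
struct ns {
	int id;
	struct list_node siblings;
};

struct ns_head {
	struct list_node list;
};

#define node_to_ns(ptr) \
	((struct ns *)((char *)(ptr) - offsetof(struct ns, siblings)))

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry but keep entry->next, as list_del_rcu() is assumed to. */
static void list_del_keep_next(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = NULL;	/* the kernel stores a poison value here */
}

/* Models list_next_or_null_rcu(): NULL once the list head is reached. */
static struct ns *next_or_null(struct ns_head *head, struct ns *ns)
{
	struct list_node *next = ns->siblings.next;

	return next != &head->list ? node_to_ns(next) : NULL;
}

/* Models list_first_or_null_rcu(): NULL when the list is empty. */
static struct ns *first_or_null(struct ns_head *head)
{
	struct list_node *first = head->list.next;

	return first != &head->list ? node_to_ns(first) : NULL;
}

/* Models nvme_next_ns(): the next sibling, wrapping to the first one. */
static struct ns *next_ns(struct ns_head *head, struct ns *old)
{
	struct ns *next = next_or_null(head, old);

	return next ? next : first_or_null(head);
}

/* Returns what next_ns() yields for a stale "old" once all paths are gone. */
struct ns *demo_stale_old(void)
{
	struct ns_head head;
	struct ns a = { .id = 1 }, b = { .id = 2 };

	list_init(&head.list);
	list_add_tail(&a.siblings, &head.list);
	list_add_tail(&b.siblings, &head.list);
	assert(next_ns(&head, &a) == &b);	/* normal round-robin step */
	assert(next_ns(&head, &b) == &a);	/* wraps to the first entry */

	list_del_keep_next(&b.siblings);
	list_del_keep_next(&a.siblings);
	return next_ns(&head, &a);		/* "old" is no longer in "head" */
}
```

In this model, as long as "old" is still linked into "head", next_ns()
always finds an entry (at worst "old" itself). Only when "old" has been
unlinked and no other entry remains does demo_stale_old() see a NULL
result, which is the case I suspect the removal ordering can produce.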

Cheers,

Hannes