Re: [PATCH v2] nvme-tcp: Check if request has started before processing it

From: Sagi Grimberg
Date: Tue May 11 2021 - 14:16:15 EST

On 5/9/21 4:30 AM, Hannes Reinecke wrote:
On 5/8/21 1:22 AM, Sagi Grimberg wrote:

Well, that would require a modification to the CQE specification, no?
fmds was not amused when I proposed that :-(

Why would that require a modification to the CQE? It's just using, say,
the 4 most significant bits of the command_id as a running sequence...

I think Hannes was under the impression that the counter proposal wasn't
part of the "command_id". The host can encode whatever it wants in that
value, and the controller just has to return the same value.

Yea, maybe something like this?
--
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e6612971f4eb..7af48827ea56 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1006,7 +1006,7 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
                return BLK_STS_IOERR;
        }

-       cmd->common.command_id = req->tag;
+       cmd->common.command_id = nvme_cid(req);
        trace_nvme_setup_cmd(req, cmd);
        return ret;
}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 05f31a2c64bb..96abfb0e2ddd 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -158,6 +158,7 @@ enum nvme_quirks {
struct nvme_request {
        struct nvme_command     *cmd;
        union nvme_result       result;
+       u8                      genctr;
        u8                      retries;
        u8                      flags;
        u16                     status;
@@ -497,6 +498,48 @@ struct nvme_ctrl_ops {
        int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
};

+/*
+ * nvme command_id is constructed as such:
+ * | xxxx | xxxxxxxxxxxx |
+ *   gen    request tag
+ */
+#define nvme_cid_install_genctr(gen)           (((gen) & 0xf) << 12)
+#define nvme_genctr_from_cid(cid)              (((cid) & 0xf000) >> 12)
+#define nvme_tag_from_cid(cid)                 ((cid) & 0xfff)
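+
+/*
+ * The hunk is truncated in the archive; presumably a helper along
+ * these lines paired with the nvme_setup_cmd() change above, with
+ * nvme_req(rq)->genctr bumped once per dispatch (assumption):
+ */
+static inline u16 nvme_cid(struct request *rq)
+{
+        return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
+}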
+
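(For completeness: the completion path then has to decode the cid and
verify the generation before trusting the tag. A sketch, assuming a
blk_mq_tag_to_rq() lookup; the nvme_find_rq() name is illustrative:)
--
static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
                u16 command_id)
{
        u8 genctr = nvme_genctr_from_cid(command_id);
        u16 tag = nvme_tag_from_cid(command_id);
        struct request *rq;

        rq = blk_mq_tag_to_rq(tags, tag);
        if (unlikely(!rq)) {
                pr_err("could not locate request for tag %#x\n", tag);
                return NULL;
        }
        /* a stale or corrupted cid carries the wrong generation */
        if (unlikely((nvme_req(rq)->genctr & 0xf) != genctr)) {
                pr_err("request %#x genctr mismatch (got %#x expected %#x)\n",
                       tag, genctr, nvme_req(rq)->genctr & 0xf);
                return NULL;
        }
        return rq;
}
--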

That is a good idea, but we should make sure to limit the number of commands that can be queued to a controller, too.

We take the minimum of what the host supports and what the controller
supports anyway.

As per the spec a controller can support a full 16 bits' worth of outstanding commands per queue, and if we limit that arbitrarily from the stack we'll need to cap the number of requests a controller or fabrics driver can ask for.

NVMF_MAX_QUEUE_SIZE is already 1024, which fits in the 12 bits (4096 tags) that remain for the tag once 4 bits go to the genctr; you are right that we also need:
--
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 92e03f15c9f6..66a4a7f7c504 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -60,6 +60,7 @@ MODULE_PARM_DESC(sgl_threshold,
                 "Use SGLs when average request segment size is larger or equal to "
                 "this size. Use 0 to disable SGLs.");

+#define NVME_PCI_MAX_QUEUE_SIZE 4096
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp);
 static const struct kernel_param_ops io_queue_depth_ops = {
         .set = io_queue_depth_set,
@@ -68,7 +69,7 @@ static const struct kernel_param_ops io_queue_depth_ops = {

 static unsigned int io_queue_depth = 1024;
 module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
-MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2 and <= 4096");

 static int io_queue_count_set(const char *val, const struct kernel_param *kp)
 {
@@ -164,6 +165,9 @@ static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
         if (ret != 0 || n < 2)
                 return -EINVAL;

+        if (n > NVME_PCI_MAX_QUEUE_SIZE)
+                return -EINVAL;
+
         return param_set_uint(val, kp);
 }

--
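
(As a stand-alone sanity check of the bit layout, here is a purely
illustrative userspace snippet, not part of the patch: the macros
round-trip tag and generation, and a cid carrying a stale generation
fails the comparison.)
--
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define nvme_cid_install_genctr(gen)    (((gen) & 0xf) << 12)
#define nvme_genctr_from_cid(cid)       (((cid) & 0xf000) >> 12)
#define nvme_tag_from_cid(cid)          ((cid) & 0xfff)

int main(void)
{
        uint16_t tag = 0x2a;    /* blk-mq tag, must fit in 12 bits */
        uint8_t gen = 5;        /* per-request generation counter */
        uint16_t cid = nvme_cid_install_genctr(gen) | tag;

        /* round trip: both fields are recoverable from the cid */
        assert(nvme_tag_from_cid(cid) == tag);
        assert(nvme_genctr_from_cid(cid) == (gen & 0xf));

        /* a completion carrying the previous generation maps to the
         * same tag but fails the genctr comparison */
        uint16_t stale = nvme_cid_install_genctr(gen - 1) | tag;
        assert(nvme_tag_from_cid(stale) == tag);
        assert(nvme_genctr_from_cid(stale) != nvme_genctr_from_cid(cid));

        printf("cid=%#x tag=%#x gen=%d\n", cid, nvme_tag_from_cid(cid),
               nvme_genctr_from_cid(cid));
        return 0;
}
--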