Re: [PATCH] Urgent bug fix causing Apple SSDs to not work.

From: Sagi Grimberg
Date: Sat Sep 25 2021 - 16:34:34 EST


From: Aditya Garg <gargaditya08@xxxxxxxx>
Date: Fri, 24 Sep 2021 15:36:45 +0530
Subject: [PATCH] Revert nvme to 5.14.5 to fix incompatibility arised in Apple SSDs.
Fixes: e7006de6c238 (nvme: code command_id with a genctr for use-after-free validation)

I think we need to hear more about the problem than just revert a
commit like this randomly. That commit has already been picked up for
-stable,

What are the exact symptoms, and which Apple SSD is this?

I do find this:

https://lore.kernel.org/all/cjJiSFV77WM51ciS8EuBcdeBcv9T83PUB-Kw3yi8PuC_LwrrUUnQ3w5RC1PbKvSYE72KryXp3wOJhv4Ov_WWIe2gKWOOo5uwuUjbbFA8HDM=@protonmail.com/

which instead of a revert has an actual patch. Can you try that one?

Keith Busch replied to that one, saying that the Apple SSD might not
be spec compliant, but hey, what else is new? If we start demanding
that hardware comply with specs, we'd have to scrap the whole notion
of working in the real world. Plus it would be very hypocritical of
us, since we ignore all specs when we deem them too limiting (whether
they be language specs, POSIX OS specs, or whatever).

Right, we have a lot of quirks for the apple controllers, what's one
more? :)

Could the following patch be tried? I'm basing this off the 'lspci' from
Orlando, but I'm assuming the previous model has the same limitation,
too.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7efb31b87f37..f0787233557f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -979,6 +979,7 @@ EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
{
struct nvme_command *cmd = nvme_req(req)->cmd;
+ struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
blk_status_t ret = BLK_STS_OK;
if (!(req->rq_flags & RQF_DONTPREP)) {
@@ -1027,7 +1028,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
return BLK_STS_IOERR;
}
- nvme_req(req)->genctr++;
+ if (!(ctrl->quirks & NVME_QUIRK_SKIP_CID_GEN))
+ nvme_req(req)->genctr++;
cmd->common.command_id = nvme_cid(req);
trace_nvme_setup_cmd(req, cmd);
return ret;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9871c0c9374c..b49761d30df7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -86,6 +86,12 @@ enum nvme_quirks {
*/
NVME_QUIRK_NO_DEEPEST_PS = (1 << 5),
+ /*
+ * The controller requires the command_id value be be limited to the
+ * queue depth.
+ */
+ NVME_QUIRK_SKIP_CID_GEN = (1 << 6),
+
/*
* Set MEDIUM priority on SQ creation
*/
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b82492cd7503..d9f22ed68185 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3369,7 +3369,10 @@ static const struct pci_device_id nvme_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
.driver_data = NVME_QUIRK_SINGLE_VECTOR |
NVME_QUIRK_128_BYTES_SQES |
- NVME_QUIRK_SHARED_TAGS },
+ NVME_QUIRK_SHARED_TAGS ,
+ NVME_QUIRK_SKIP_CID_GEN },
+ { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2006),
+ .driver_data = NVME_QUIRK_SKIP_CID_GEN },
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
{ 0, }
--


Exactly the patch that I was going to propose. The only downside
is that we need to access the controller in the hot-path...