Re: [PATCH v3 1/1] libata: Add ATA_HORKAGE_NONCQ_ON_AMD for Samsung 860 and 870 SSD.

From: Damien Le Moal
Date: Mon Aug 30 2021 - 17:58:09 EST


On 2021/08/30 23:43, Kate Hsuan wrote:
> Many users are reporting that the Samsung 860 and 870 SSD are having
> various issues when combined with AMD SATA controllers and only
> completely disabling NCQ helps to avoid these issues.
>
> Entire disabling NCQ for Samsung 860/870 SSD will cause I/O performance
> drop. In this case, a flag ATA_HORKAGE_NONCQ_ON_AMD is introduced to

With "Entire disabling NCQ...", did you mean "Always disabling NCQ"?
If I understand this issue correctly, the explanation should be something like:

Always disabling NCQ for Samsung 860/870 SSDs regardless of the host SATA
adapter vendor will cause I/O performance degradation with well-behaved
adapters. To limit the performance impact to AMD adapters, introduce the
ATA_HORKAGE_NO_NCQ_ON_AMD flag to force-disable NCQ only for these adapters.

> used to perform an additional check for these SSDs. If it finds it's
> parent ATA controller is AMD, the NCQ will be disabled. Otherwise, the
> NCQ is kept to enable.
>
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201693
> Signed-off-by: Kate Hsuan <hpa@xxxxxxxxxx>
> ---
> drivers/ata/libata-core.c | 24 ++++++++++++++++++++++--
> include/linux/libata.h | 1 +
> 2 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index c861c93d1e84..36c62f758b73 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -2190,6 +2190,8 @@ static int ata_dev_config_ncq(struct ata_device *dev,
> char *desc, size_t desc_sz)
> {
> struct ata_port *ap = dev->link->ap;
> + struct pci_dev *pcidev = NULL;
> + struct device *parent_dev = NULL;
> int hdepth = 0, ddepth = ata_id_queue_depth(dev->id);
> unsigned int err_mask;
> char *aa_desc = "";
> @@ -2204,6 +2206,22 @@ static int ata_dev_config_ncq(struct ata_device *dev,
> snprintf(desc, desc_sz, "NCQ (not used)");
> return 0;
> }
> +
> + if (dev->horkage & ATA_HORKAGE_NONCQ_ON_AMD) {
> + for (parent_dev = dev->tdev.parent; parent_dev != NULL;
> + parent_dev = parent_dev->parent) {
> + if (dev_is_pci(parent_dev)) {
> + pcidev = to_pci_dev(parent_dev);
> + if (pcidev->vendor == PCI_VENDOR_ID_AMD) {
> + snprintf(desc, desc_sz,
> + "NCQ (not used)");
> + return 0;
> + }
> + break;
> + }
> + }

It would be really nice to move this hunk into a small helper, something like:

static bool ata_dev_check_adapter(struct ata_device *dev,
				  unsigned short vendor_id)

The "if" code block then becomes:

	if ((dev->horkage & ATA_HORKAGE_NONCQ_ON_AMD) &&
	    ata_dev_check_adapter(dev, PCI_VENDOR_ID_AMD)) {
		snprintf(desc, desc_sz, "NCQ (not used)");
		return 0;
	}
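
To illustrate the logic such a helper would carry, here is a rough user-space
model of the parent walk from the hunk above. It is only a sketch and not
kernel code: struct model_device, model_check_adapter() and
MODEL_PCI_VENDOR_ID_AMD are hypothetical stand-ins for struct device,
dev_is_pci()/to_pci_dev() and PCI_VENDOR_ID_AMD, so that the snippet compiles
on its own. As in the patch, only the first PCI ancestor is checked.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's struct device / struct pci_dev.
 * Only the fields needed to model the parent walk are present.
 */
struct model_device {
	struct model_device *parent;	/* next device up the hierarchy */
	bool is_pci;			/* models dev_is_pci() */
	unsigned short vendor;		/* models pci_dev->vendor */
};

/* Assumed to mirror PCI_VENDOR_ID_AMD (0x1022). */
#define MODEL_PCI_VENDOR_ID_AMD 0x1022

/*
 * Walk up from the device and report whether the first PCI ancestor
 * found has the given vendor ID. Returns false if there is no PCI
 * ancestor at all.
 */
static bool model_check_adapter(const struct model_device *dev,
				unsigned short vendor_id)
{
	const struct model_device *parent;

	for (parent = dev->parent; parent != NULL; parent = parent->parent) {
		if (parent->is_pci)
			return parent->vendor == vendor_id;
	}

	return false;
}
```

In the real helper, the loop would start from dev->tdev.parent and use
dev_is_pci() and to_pci_dev() exactly as the hunk above does.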


> + }
> +
> if (ap->flags & ATA_FLAG_NCQ) {
> hdepth = min(ap->scsi_host->can_queue, ATA_MAX_QUEUE);
> dev->flags |= ATA_DFLAG_NCQ;
> @@ -3971,9 +3989,11 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
> { "Samsung SSD 850*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
> ATA_HORKAGE_ZERO_AFTER_TRIM, },
> { "Samsung SSD 860*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
> - ATA_HORKAGE_ZERO_AFTER_TRIM, },
> + ATA_HORKAGE_ZERO_AFTER_TRIM |
> + ATA_HORKAGE_NONCQ_ON_AMD, },
> { "Samsung SSD 870*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
> - ATA_HORKAGE_ZERO_AFTER_TRIM, },
> + ATA_HORKAGE_ZERO_AFTER_TRIM |
> + ATA_HORKAGE_NONCQ_ON_AMD, },
> { "FCCT*M500*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
> ATA_HORKAGE_ZERO_AFTER_TRIM, },
>
> diff --git a/include/linux/libata.h b/include/linux/libata.h
> index 860e63f5667b..42e16114e91f 100644
> --- a/include/linux/libata.h
> +++ b/include/linux/libata.h
> @@ -426,6 +426,7 @@ enum {
> ATA_HORKAGE_NOTRIM = (1 << 24), /* don't use TRIM */
> ATA_HORKAGE_MAX_SEC_1024 = (1 << 25), /* Limit max sects to 1024 */
> ATA_HORKAGE_MAX_TRIM_128M = (1 << 26), /* Limit max trim size to 128M */
> + ATA_HORKAGE_NONCQ_ON_AMD = (1 << 27), /* Disable NCQ on AMD chipset */

Please add a "_" after "NO", similarly to ATA_HORKAGE_NO_NCQ_TRIM:

ATA_HORKAGE_NO_NCQ_ON_AMD = (1 << 27), /* Disable NCQ on AMD chipset */

>
> /* DMA mask for user DMA control: User visible values; DO NOT
> renumber */
>


--
Damien Le Moal
Western Digital Research