[PATCH v10 0/6] CXL Poison List Retrieval & Tracing

From: alison . schofield
Date: Tue Mar 21 2023 - 22:12:27 EST


From: Alison Schofield <alison.schofield@xxxxxxxxx>

Changes in v10:

Patch 2/6 cxl/trace: Add TRACE support for CXL media-error records
- Rename the cxl_poison trace event field 'length' to 'dpa_length' (Jonathan)
The dpa_length is the DPA length as reported by the device.
- Add a 'type' field to the cxl_poison TRACE_EVENT, and define and
use type 'CXL_POISON_TRACE_LIST'. This is in preparation for adding
more cxl_poison_trace_type values such as CXL_POISON_TRACE_(INJECT | CLEAR).
- Use continuations in TP_printk to match the file convention.

Patch 4/6 cxl/region: Provide region info to the cxl_poison trace event
- Add Jonathan's Reviewed-by (for real) and Tested-by tags (Jonathan)
- Tidy blank line at return in trigger_poison_list_store() (Jonathan)

Patches 1,3,5,6: no changes.

Cover Letter:
- Updated the example cxl_poison TRACE_EVENTs to show type=LIST

Link to v9:
https://lore.kernel.org/linux-cxl/cover.1679284567.git.alison.schofield@xxxxxxxxx/

End Changelog

Add support for retrieving device poison lists and storing the returned
error records as kernel trace events.

The handling of the poison list is guided by the CXL 3.0 Specification
Section 8.2.9.8.4.1. [1]

Example trigger:
$ echo 1 > /sys/bus/cxl/devices/mem0/trigger_poison_list
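The same trigger can be issued from a script. The following is a minimal sketch, not part of this series: the helper name trigger_poison_list() and its sysfs_root parameter are made up for illustration, and it assumes the attribute accepts a bare "1" just as the echo above does. Writing to the real attribute typically requires root.

```python
from pathlib import Path


def trigger_poison_list(memdev, sysfs_root="/sys/bus/cxl/devices"):
    """Write "1" to <sysfs_root>/<memdev>/trigger_poison_list.

    Hypothetical helper: sysfs_root is a parameter only so that the
    function can be pointed at a scratch directory for testing.
    """
    attr = Path(sysfs_root) / memdev / "trigger_poison_list"
    # Equivalent in intent to: echo 1 > .../trigger_poison_list
    attr.write_text("1")
    return attr
```

Usage: trigger_poison_list("mem0") asks mem0 for its poison list; the resulting records are then expected to show up as cxl_poison trace events.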

Example Trace Events:

Poison found in a PMEM Region:
cxl_poison: memdev=mem0 host=cxl_mem.0 serial=0 type=LIST region=region11 region_uuid=d96e67ec-76b0-406f-8c35-5b52630dcad1 hpa=0xf100000000 dpa=0x70000000 dpa_length=0x40 source=Injected flags= overflow_time=0

Poison found in a RAM Region:
cxl_poison: memdev=mem0 host=cxl_mem.0 serial=0 type=LIST region=region2 region_uuid=00000000-0000-0000-0000-000000000000 hpa=0xf010000000 dpa=0x0 dpa_length=0x40 source=Injected flags= overflow_time=0

Poison found in an unmapped DPA resource:
cxl_poison: memdev=mem3 host=cxl_mem.3 serial=3 type=LIST region= region_uuid=00000000-0000-0000-0000-000000000000 hpa=0xffffffffffffffff dpa=0x40000000 dpa_length=0x40 source=Injected flags= overflow_time=0
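For anyone post-processing these events, the key=value layout above is easy to split in userspace. The following is a rough sketch that assumes the field layout shown in the three examples (no whitespace inside values, possibly empty 'region' and 'flags' fields); parse_cxl_poison() and is_unmapped() are illustrative names, not kernel or tooling APIs. Treating an empty region together with an all-ones hpa as "unmapped" is inferred from the third example only.

```python
import re

# Value seen in the "unmapped DPA resource" example above.
UNMAPPED_HPA = 0xFFFFFFFFFFFFFFFF

# key=value pairs; the value may be empty (e.g. "region=" or "flags=").
_FIELD_RE = re.compile(r"(\w+)=(\S*)")
_HEX_FIELDS = ("hpa", "dpa", "dpa_length")


def parse_cxl_poison(line):
    """Return the event fields as a dict, or None if not a cxl_poison line."""
    # Anything before "cxl_poison:" (task, CPU, timestamp) is ignored.
    _, sep, payload = line.partition("cxl_poison:")
    if not sep:
        return None
    fields = dict(_FIELD_RE.findall(payload))
    for name in _HEX_FIELDS:
        if name in fields:
            fields[name] = int(fields[name], 16)
    return fields


def is_unmapped(record):
    """True when the record carries no region and an all-ones HPA."""
    return record.get("region", "") == "" and record.get("hpa") == UNMAPPED_HPA
```

Fed the third example line, parse_cxl_poison() yields a record for which is_unmapped() is true; the two region-backed examples keep their region name and a real hpa.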


[1]: https://www.computeexpresslink.org/download-the-specification


Alison Schofield (6):
  cxl/mbox: Add GET_POISON_LIST mailbox command
  cxl/trace: Add TRACE support for CXL media-error records
  cxl/memdev: Add trigger_poison_list sysfs attribute
  cxl/region: Provide region info to the cxl_poison trace event
  cxl/trace: Add an HPA to cxl_poison trace events
  tools/testing/cxl: Mock support for Get Poison List

 Documentation/ABI/testing/sysfs-bus-cxl |  14 +++
 drivers/cxl/core/core.h                 |  15 ++++
 drivers/cxl/core/mbox.c                 |  75 ++++++++++++++++
 drivers/cxl/core/memdev.c               | 108 ++++++++++++++++++++++++
 drivers/cxl/core/region.c               |  63 ++++++++++++++
 drivers/cxl/core/trace.c                |  94 +++++++++++++++++++++
 drivers/cxl/core/trace.h                | 101 ++++++++++++++++++++++
 drivers/cxl/cxlmem.h                    |  72 +++++++++++++++-
 drivers/cxl/mem.c                       |  36 ++++++++
 drivers/cxl/pci.c                       |   4 +
 tools/testing/cxl/test/mem.c            |  42 +++++++++
 11 files changed, 623 insertions(+), 1 deletion(-)


base-commit: e686c32590f40bffc45f105c04c836ffad3e531a
--
2.37.3