[PATCH RFCv2 16/16] edac: Add an error scope logic

From: Mauro Carvalho Chehab
Date: Sat Jan 28 2012 - 10:36:14 EST


This patch is currently incomplete. The idea is to change the
EDAC error calls to take a scope variable. The scope will be
used when providing the error traces to userspace and to
increment a per-location error counter.

Signed-off-by: Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
---
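Note (not part of the commit): a minimal userspace sketch of the counter
propagation described in the kernel-doc comment in the diff, under stated
assumptions. Only the enum is taken from this patch. The structs
demo_mc_counts and demo_err_location and the function demo_count_error()
are hypothetical names made up for illustration and are not EDAC core
APIs. The csrow-only scope is not modeled, and the csrow/channel scope is
assumed to identify a single DIMM.

```c
/* Scope values copied from the enum added by this patch. */
enum hw_event_error_scope {
	HW_EVENT_SCOPE_MC,
	HW_EVENT_SCOPE_MC_BRANCH,
	HW_EVENT_SCOPE_MC_CHANNEL,
	HW_EVENT_SCOPE_MC_CSROW,
	HW_EVENT_SCOPE_MC_CSROW_CHANNEL,
};

/* Topology of the example: 1 MC, 2 branches, 2 channels, 4 DIMMs. */
#define DEMO_NR_BRANCHES	2
#define DEMO_NR_CHANNELS	2
#define DEMO_NR_DIMMS		4

/* Hypothetical per-location counters (not the real EDAC structs). */
struct demo_mc_counts {
	unsigned int mc;
	unsigned int branch[DEMO_NR_BRANCHES];
	unsigned int channel[DEMO_NR_CHANNELS];
	unsigned int dimm[DEMO_NR_DIMMS];
};

/*
 * Hypothetical error location. Only the fields covered by the
 * reported scope are read, so the others may hold any value.
 */
struct demo_err_location {
	int branch;
	int channel;
	int dimm;
};

/*
 * Increment the counter for the reported entity and for every
 * entity above it. Returns 0 on success, -1 for a scope that
 * this sketch does not model.
 */
int demo_count_error(struct demo_mc_counts *c,
		     enum hw_event_error_scope scope,
		     const struct demo_err_location *loc)
{
	switch (scope) {
	case HW_EVENT_SCOPE_MC_CSROW_CHANNEL:
		/* Assumption: csrow + channel pins down one DIMM. */
		c->dimm[loc->dimm]++;
		/* fall through */
	case HW_EVENT_SCOPE_MC_CHANNEL:
		c->channel[loc->channel]++;
		/* fall through */
	case HW_EVENT_SCOPE_MC_BRANCH:
		c->branch[loc->branch]++;
		/* fall through */
	case HW_EVENT_SCOPE_MC:
		c->mc++;
		return 0;
	default:
		/* HW_EVENT_SCOPE_MC_CSROW is left out of this sketch. */
		return -1;
	}
}
```

With zeroed counters, calling demo_count_error() with
HW_EVENT_SCOPE_MC_CHANNEL and a location of branch 0, channel 0 bumps
the channel 0, branch 0 and MC counters once each and leaves all DIMM
counters at zero, which matches the example in the comment.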
include/linux/edac.h | 27 +++++++++++++++++++++++++++
1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5876675..879116e 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -72,6 +72,33 @@ enum hw_event_mc_err_type {
HW_EVENT_ERR_FATAL,
};

+/**
+ * enum hw_event_error_scope - scope of a memory error
+ * @HW_EVENT_SCOPE_MC: error can be anywhere inside the MC
+ * @HW_EVENT_SCOPE_MC_BRANCH: error can be on any DIMM inside the branch
+ * @HW_EVENT_SCOPE_MC_CHANNEL: error can be on any DIMM inside the MC channel
+ * @HW_EVENT_SCOPE_MC_CSROW: error can be on any DIMM inside the csrow
+ * @HW_EVENT_SCOPE_MC_CSROW_CHANNEL: error is on a specific DIMM
+ *
+ * Depending on the error detection algorithm, the memory topology and even
+ * the MC capabilities, some errors can't be attributed to a single DIMM, but
+ * only to a group of memory sockets. Depending on where the error occurs, the
+ * EDAC core will increment the error count for that entity and for the
+ * entities above it. For example, on a system with 1 memory controller,
+ * 2 branches, 2 MC channels and 4 DIMMs, if an error happens at channel 0,
+ * the error counts for channel 0, for branch 0 and for memory controller 0
+ * will be incremented. The DIMM error counts won't be incremented because,
+ * in this example, the driver can't be sure on which DIMM the error
+ * actually occurred.
+ */
+enum hw_event_error_scope {
+ HW_EVENT_SCOPE_MC,
+ HW_EVENT_SCOPE_MC_BRANCH,
+ HW_EVENT_SCOPE_MC_CHANNEL,
+ HW_EVENT_SCOPE_MC_CSROW,
+ HW_EVENT_SCOPE_MC_CSROW_CHANNEL,
+};
+
/* memory types */
enum mem_type {
MEM_EMPTY = 0, /* Empty csrow */
--
1.7.8
