Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64

From: Aristeu Rozanski
Date: Thu Apr 30 2009 - 10:47:57 EST


(adding Ben Woodard, Mauro C. Chehab to Cc list)

> > Kconfig, mce code delivers needed error info to edac which, in turn,
> > goes and decodes the error/does the mapping to DIMM blocks/supplies DRAM
> > error injection facility for testing purposes and similar things. That
> > way you have both and they don't overlap in functionality.
> You can do that, but it's redundant because mcelog can do this
> already. I had some conversations with existing EDAC users
> recently and they seem to only care about the resulting output,
> so just querying from mcelog is fine.
what about using the same EDAC interface? for lots of memory controllers, even
on architectures other than x86, the EDAC interface is available. it sounds
inconsistent to force users to handle special cases in their scripts
just because _optional_ sharing of error information from the mce code is not
available.
how about SW/HW scrubbing?
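
to illustrate the "same interface" point, here is a rough sketch of the kind of
script users tend to write against the EDAC sysfs tree. it assumes the usual
layout under /sys/devices/system/edac/mc, with one mcN directory per memory
controller and ce_count/ue_count files in each; read_edac_counts is a
made-up helper name, not part of any existing tool. the same loop would work
unchanged for any controller that has an EDAC driver:

```python
import glob
import os


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Collect corrected/uncorrected error counts per memory controller.

    Assumes the usual EDAC sysfs layout: <root>/mcN/ce_count and
    <root>/mcN/ue_count. A counter that is missing or unreadable is
    reported as None instead of raising.
    """
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(root, "mc[0-9]*"))):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(mc_dir, name)) as f:
                    entry[name] = int(f.read().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[os.path.basename(mc_dir)] = entry
    return counts


if __name__ == "__main__":
    # Print one line per controller found on this machine (possibly none).
    for mc, values in read_edac_counts().items():
        print(mc, values)
```

the root path is a parameter only so the helper can be pointed at a test
directory; on a machine without an EDAC driver loaded it simply returns an
empty dict.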

> The only issue is that mcelog needs to get the DIMM data. In many
> cases it can do so from SMBIOS output, if not a suitable interface
> would need to be provided by the kernel.
that can already be done in EDAC and in this driver

> > By the way, I think there's a similar attempt/proposal of letting mce
> > and edac talk to each other from Red Hat so I think this could be a
> There was a fairly dubious patch floating around I think, but it
> had a couple of problems.
and what if those problems are solved? a patch like that would make it possible
to have EDAC support for both AMD64 and Nehalem and wouldn't hurt the
performance of people who choose not to use EDAC.

--
Aristeu
