Re: [RFC PATCH 00/21 v2] amd64_edac: EDAC module for AMD64

From: Mauro Carvalho Chehab
Date: Thu Apr 30 2009 - 14:39:21 EST


On Thu, 30 Apr 2009, Andi Kleen wrote:

> > Kconfig, mce code delivers needed error info to edac which, in turn,
> > goes and decodes the error/does the mapping to DIMM blocks/supplies DRAM
> > error injection facility for testing purposes and similar things. That
> > way you have both and they don't overlap in functionality.

> You can do that, but it's redundant because mcelog can do this
> already. I had some conversations with existing EDAC users
> recently and they seem to only care about the resulting output,
> so just querying from mcelog is fine.
> The only issue is that mcelog needs to get the DIMM data. In many
> cases it can do so from SMBIOS output, if not a suitable interface
> would need to be provided by the kernel.

From what I've heard from the existing EDAC users, they have several
concerns about whether mcelog could be a viable replacement for their EDAC usage, due to performance issues, including the need to access SMBIOS in order to get such information.

Also, the EDAC interface is already established and, as pointed out by Doug, it is very useful in cluster environments, where memory failures are a big issue and need to be solved as soon as possible.

EDAC solves this issue very well and works on a wider range of designs than mcelog. So, there's no reason to deprecate it or to reject patches adding EDAC interfaces to other chips.

On the other hand, mcelog is also useful in different scenarios. So, they are not competing technologies, but complementary ones.

So, assuming that both EDAC and mcelog are needed, the proper design for those chipsets where the memory controller is integrated with other log functions (like AMD64 and Nehalem) seems to be to build a single kernel layer that retrieves the error logs from the hardware and allows access to the same data via both the mcelog and EDAC userspace APIs.
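
To make the idea a bit more concrete, here is a rough sketch of such a shared layer. It is plain userspace C, not kernel code, and only illustrates the fan-out: one place receives each error record read from the hardware and hands the same record to every registered consumer. All names here (hw_err_record, err_layer_register, err_layer_report, count_errors) are hypothetical and do not correspond to existing kernel symbols, and the record fields are illustrative only.

```c
#include <stddef.h>

/* Hypothetical decoded error record; the fields are illustrative only. */
struct hw_err_record {
	unsigned int  node;      /* memory controller / node id */
	unsigned long address;   /* failing physical address, if known */
	int           corrected; /* 1 = corrected error, 0 = uncorrected */
};

/* A consumer is anything that wants to see every record,
 * e.g. an mcelog-style or an EDAC-style front end. */
typedef void (*err_consumer_fn)(const struct hw_err_record *rec, void *priv);

#define MAX_CONSUMERS 4

static struct {
	err_consumer_fn fn;
	void *priv;
} consumers[MAX_CONSUMERS];
static size_t nr_consumers;

/* Register a consumer; returns 0 on success, -1 on bad input or full table. */
int err_layer_register(err_consumer_fn fn, void *priv)
{
	if (!fn || nr_consumers >= MAX_CONSUMERS)
		return -1;
	consumers[nr_consumers].fn = fn;
	consumers[nr_consumers].priv = priv;
	nr_consumers++;
	return 0;
}

/* Called once per error read from the hardware: passes the same record
 * to all consumers and returns how many were notified. */
size_t err_layer_report(const struct hw_err_record *rec)
{
	size_t i;

	for (i = 0; i < nr_consumers; i++)
		consumers[i].fn(rec, consumers[i].priv);
	return nr_consumers;
}

/* Toy consumer standing in for either front end: it only counts errors. */
struct err_counts {
	unsigned long ce; /* corrected errors seen */
	unsigned long ue; /* uncorrected errors seen */
};

void count_errors(const struct hw_err_record *rec, void *priv)
{
	struct err_counts *c = priv;

	if (rec->corrected)
		c->ce++;
	else
		c->ue++;
}
```

With two instances of the toy consumer registered, each call to err_layer_report() updates both sets of counters from the same record, which is the property that matters here: the hardware is read once and neither front end has to know about the other.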

Cheers,
Mauro