Re: [PATCH] RAS: Add a tracepoint for reporting memory controller events

From: Borislav Petkov
Date: Thu May 31 2012 - 15:26:14 EST


On Thu, May 31, 2012 at 06:14:56PM +0000, Luck, Tony wrote:
> > Ok, thus the dynamic granularity. But we're going to end up reporting
> > rank and row too so that it can be matched to the DIMM. I consider
> > physical address a bonus in such cases and it is only of importance to
> > those who like to replace single DRAM chips or single MOSFET transistors
> > :-) :-) :-).
>
> Perhaps you don't really need to replace your 32GB DIMM because a single
> bit is stuck (one bad bit out of > 300 billion). With the physical address
> we can tell Linux to try to stop using the page that contains the stuck
> bit - most of the time it can do that.

Yes, ok, that is a good example for using the physical address.
Provided, of course, the reported granularity still keeps us within the
page that contained the error. But I think you said earlier that most of
the errors are reported with 4K granularity, so all is fine.
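To make the granularity condition concrete, here is a rough sketch of the check I have in mind. It is not taken from the patch: the helper names, the `lsb` parameter (number of low address bits that are not valid, so the reported region is 2^lsb bytes and naturally aligned) and the fixed 4K page size are all assumptions for illustration only.

```c
#include <stdint.h>

/* Assumption: 4K pages, i.e. 12 offset bits. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: returns 1 if an error reported with 'lsb'
 * invalid low-order address bits still pins down a single 4K page.
 * A naturally aligned region of 2^lsb bytes fits in one page
 * exactly when lsb <= SKETCH_PAGE_SHIFT.
 */
static int error_within_one_page(unsigned int lsb)
{
	return lsb <= SKETCH_PAGE_SHIFT;
}

/*
 * Hypothetical helper: page frame number containing the reported
 * physical address. Only meaningful when error_within_one_page()
 * holds for the reported granularity.
 */
static uint64_t error_pfn(uint64_t paddr)
{
	return paddr >> SKETCH_PAGE_SHIFT;
}
```

With 4K granularity (lsb of 12) the check still passes, so the page containing the stuck bit can be identified; with anything coarser, the address no longer tells us which single page to stop using.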

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551