Re: [PATCH 0/2] Migrate data off physical pages with corrected memory errors (Version 7)

From: Alex Williamson
Date: Mon Jul 21 2008 - 15:15:54 EST


On Sun, 2008-07-20 at 12:39 -0500, Russ Anderson wrote:
> On Sat, Jul 19, 2008 at 12:37:11PM +0200, Andi Kleen wrote:
> > If you really wanted to do this you probably should hook it up
> > to mcelog's (or the IA64 equivalent) DIMM database
>
> Is there an IA64 equivalent? I've looked at the x86_64 mcelog,
> but have not found an IA64 version.

There's a bit in the SAL error record that can tell you when the
platform thinks the page should be deallocated. In the section header
(B2.2), ERROR_RECOVERY_INFO, bit 3 "Error threshold exceeded". If you
use this bit, then it's a platform decision. If you want pages to be
deallocated on the first hit, then have your SAL always set that bit. I
believe HP systems do implement this bit in SAL using some kind of
heuristics.
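
As a rough illustration of the check described above, a minimal sketch
in C might look like the following. Only the bit position (bit 3 of
ERROR_RECOVERY_INFO) comes from the text; the macro name, the helper
name, and the idea of passing the field as a bare byte are assumptions
made for the example, not the kernel's actual interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit 3 of the section header's ERROR_RECOVERY_INFO field:
 * "Error threshold exceeded". The macro name is hypothetical.
 */
#define ERI_THRESHOLD_EXCEEDED (1u << 3)

/*
 * Hypothetical helper: returns true when the platform (SAL) has
 * flagged that the error threshold was exceeded, i.e. when it
 * thinks the page should be deallocated. The caller is assumed to
 * have already extracted the ERROR_RECOVERY_INFO byte from the
 * section header of the SAL error record.
 */
static bool page_should_be_deallocated(uint8_t error_recovery_info)
{
    return (error_recovery_info & ERI_THRESHOLD_EXCEEDED) != 0;
}
```

With a check like this the OS carries no policy of its own: a SAL that
always sets the bit gives deallocation on the first hit, while a SAL
that applies heuristics decides when the threshold has been reached.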

Alex

--
Alex Williamson HP Open Source & Linux Org.
