Re: [PATCH] [0/16] POISON: Intro

From: Andi Kleen
Date: Wed Apr 08 2009 - 02:13:37 EST


On Tue, Apr 07, 2009 at 10:15:42PM -0700, Andrew Morton wrote:
> On Tue, 7 Apr 2009 17:09:56 +0200 (CEST) Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>
> > Upcoming Intel CPUs have support for recovering from some memory errors. This
> > requires the OS to declare a page "poisoned", kill the processes associated
> > with it and avoid using it in the future. This patchkit implements
> > the necessary infrastructure in the VM.
>
> If the page is clean then we can just toss it and grab a new one from
> backing store without killing anyone.
>
> Does the patchset do that?

Yes. But it only really works for shared mmap; anonymous and private
pages are nearly always dirty.

You can also disable the early kill and request a kill only
on access.

It also does some other tricks: for a dirty file page it just triggers
an IO error (although I must admit the dirty handling is rather
tricky, and I would appreciate very careful review of that part).
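
As a rough illustration of the cases described above, here is a
simplified userspace sketch. It is not code from the patchkit: the
enum, the function name and its parameters are all hypothetical, and
it assumes every page that is not file backed ends in a kill, which
glosses over details the real dirty handling has to deal with.

```c
#include <stdbool.h>

/* Hypothetical action names, for illustration only. */
enum poison_action {
    ACT_DROP_AND_REFETCH,  /* clean file page: toss it, reread from backing store */
    ACT_IO_ERROR,          /* dirty file page: report an IO error */
    ACT_KILL_EARLY,        /* kill the mapping processes right away */
    ACT_KILL_ON_ACCESS     /* kill only when the poisoned page is touched */
};

/*
 * Pick a recovery action for a poisoned page.
 * file_backed: page belongs to a file mapping (assumption: shared mmap).
 * dirty:       page has modifications not yet written back.
 * early_kill:  policy switch; false means "only kill on access".
 */
enum poison_action choose_poison_action(bool file_backed, bool dirty,
                                        bool early_kill)
{
    if (file_backed && !dirty)
        return ACT_DROP_AND_REFETCH;
    if (file_backed && dirty)
        return ACT_IO_ERROR;
    /* Anonymous or private data: the contents cannot be recovered. */
    return early_kill ? ACT_KILL_EARLY : ACT_KILL_ON_ACCESS;
}
```

For example, choose_poison_action(true, false, true) yields
ACT_DROP_AND_REFETCH, so nobody is killed for a clean shared file page.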

A few other known recovery tricks are not implemented yet
(like handling free memory[1]), but will be over time.

-Andi

[1] I didn't consider that one high priority since production
systems with long uptime shouldn't have much free memory.

--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.