Re: [PATCH 2/3] HWPOISON: undo memory error handling for dirty pagecache

From: Naoya Horiguchi
Date: Fri Aug 10 2012 - 20:59:10 EST


Hi Andi,

On Fri, Aug 10, 2012 at 04:09:48PM -0700, Andi Kleen wrote:
> Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> writes:
>
> > Current memory error handling on dirty pagecache has a bug that user
> > processes who use corrupted pages via read() or write() can't be aware
> > of the memory error and result in discarding dirty data silently.
> >
> > The following patch is to improve handling/reporting memory errors on
> > this case, but as a short term solution I suggest that we should undo
> > the present error handling code and just leave errors for such cases
> > (which expect the 2nd MCE to panic the system) to ensure data consistency.
>
> Not sure that's the right approach. It's not worse than any other IO
> errors isn't it?

Right, in the current situation both memory errors and other IO errors
can lose data in the same manner.
I thought that for real mission-critical systems (which I think the
HWPOISON feature targets) closing a dangerous path is better than
waiting for someone to solve the problem in a more generic manner.
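
For illustration only (not part of the patch), here is a minimal userspace
sketch of the generic IO-error case being compared against. A buffered
write() normally returns once the data is in pagecache, so a later writeback
failure is visible only to a process that calls fsync() and checks the
result. The helper name write_and_sync is hypothetical, and the sketch
assumes the failure is reported as an errno from fsync():

```python
import errno
import os


def write_and_sync(path, data):
    """Write data to path and fsync it.

    Returns None if fsync() succeeded, otherwise the errno it reported
    (for example errno.EIO when writeback of a dirty page failed).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        try:
            # Writeback errors for dirty pagecache surface here, not at write().
            os.fsync(fd)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        os.close(fd)


def describe(err):
    """Turn the result of write_and_sync() into a short message."""
    if err is None:
        return "data reached stable storage"
    if err == errno.EIO:
        return "IO error: dirty data may have been lost"
    return "fsync failed: " + os.strerror(err)
```

A process that only calls write() and then close() never reaches the fsync()
check, which is the silent-loss case discussed in this thread.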

But if we start with Fengguang's approach first, as in your reply to
patch 3, this patch is not necessary.

Naoya