Re: [RFC Patch 0/2] mm: Add parameters to make kernel behavior at memory error on dirty cache selectable

From: Naoya Horiguchi
Date: Thu Apr 11 2013 - 11:23:20 EST


On Thu, Apr 11, 2013 at 03:49:16PM +0200, Andi Kleen wrote:
> > As a result, if the dirty cache includes user data, the data is lost,
> > and data corruption occurs if an application uses old data.
>
> The application cannot use old data, the kernel code kills it if it
> would do that. And if it's IO data there is an EIO triggered.
>
> iirc the only concern in the past was that the application may miss
> the asynchronous EIO because it's cleared on any fd access.
>
> This is a general problem not specific to memory error handling,
> as these asynchronous IO errors can happen due to other reason
> (bad disk etc.)
>
> If you're really concerned about this case I think the solution
> is to make the EIO more sticky so that there is a higher chance
> than it gets returned. This will make your data much more safe,
> as it will cover all kinds of IO errors, not just the obscure memory
> errors.

I'm interested in this topic. In the previous discussion, I was told that
we can't expect user applications to change their behavior when they get
EIO, so globally changing EIO's stickiness is not a great approach.
I'm working on a new pagecache-tag-based mechanism to solve this,
but it needs more time and discussion.
So I guess Tanino-san is suggesting, as a quick solution, that we give up
on dirty pagecache errors.
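
For illustration only, here is a minimal userspace sketch of the point where
an application could notice the asynchronous EIO discussed above. It assumes
the error is reported through fsync(); flush_and_check() is a hypothetical
helper name, not something from the patch set, and a real application would
need its own recovery policy.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush fd to stable storage and report any error.
 * Returns 0 on success, or a negative errno value on failure.
 */
static int flush_and_check(int fd)
{
    if (fsync(fd) == 0)
        return 0;

    int err = errno;  /* save errno before any further library call */

    if (err == EIO)
        fprintf(stderr, "fsync: EIO, dirty data may have been lost\n");
    else
        fprintf(stderr, "fsync: %s\n", strerror(err));

    return -err;
}
```

An application that ignores this return value, or never calls fsync() at all,
has no chance to see the error, which is why we can't rely on applications
reacting to EIO.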

Thanks,
Naoya