Re: [PATCH, RFC/T] Fix handling of write failures to swap devices

From: Richard Purdie
Date: Wed Nov 01 2006 - 04:33:18 EST


On Wed, 2006-11-01 at 16:36 +1100, Nick Piggin wrote:
> Also note that the current code *should* "gracefully" handle the failure.
> In that, it will not reclaim the page on a write error, so it isn't going
> to cause a data loss...
>
> It's just that it currently results in unswappable pages.
>
> Handling it more gracefully by allowing the page to be retried with another
> swap entry is OK I guess, but given the added complexity, I'm not even sure
> it is worthwhile.
>
> Perhaps we should just do the ClearPageError in the try_to_unuse path,
> because the sysadmin should take down that swap device on failure. So if a
> new device is added, we want to be able to unpin the failed pages.

As I see it, ClearPageError in the try_to_unuse path is the correct
thing to do as once we've marked the swap page as bad, it can't retry to
the same swap page. There is nothing special about the failed pages
beyond that point so we don't need them marked.

Also, if we remove the error flag, the page is still probably on the
inactive list so other existing code is likely to take care of trying a
new write to a another swap entry? I didn't add code for this case for
that reason.

Regards,

Richard

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/