Re: end_swap_bio_write error handling

From: Richard Purdie
Date: Thu Aug 31 2006 - 04:03:41 EST


On Thu, 2006-08-31 at 10:58 +0900, KAMEZAWA Hiroyuki wrote:
> Now, swap-write-failure-fixup.patch is merged in -mm kernel.
> ==
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/mm-swap-write-failure-fixup.patch
> ==
> error message comes and a page turns to be dirty again.

+ if (!uptodate) {
SetPageError(page);
+ /*
+ * We failed to write the page out to swap-space.
+ * Re-dirty the page in order to avoid it being reclaimed.
+ * Also print a dire warning that things will go BAD (tm)
+ * very quickly.
+ *
+ * Also clear PG_reclaim to avoid rotate_reclaimable_page()
+ */
+ set_page_dirty(page);
+ printk(KERN_ALERT "Write-error on swap-device (%d:%d)\n",
+ imajor(bio->bi_bdev->bd_inode),
+ iminor(bio->bi_bdev->bd_inode));
+ ClearPageReclaim(page);

I'm not 100% convinced this will help as if you SetPageError, it will
still end up killing off the processes involved. Removing the
SetPageError gives much more stable results in my testing. I was
wondering how to stop it repeatedly trying to write to the particular
swap file sector. ClearPageReclaim() doesn't appear to help much as
rotate_reclaimable_page() does check if a page is dirty.

Ideally, we should remap the page to a new swap sector so we can mark
the existing one as bad. The easiest way to do that might be to have the
page move out of the PageSwapCache although I've not worked out how to
do that yet.

Richard



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/