Re: [PATCH] page->flags corruption fix

From: Hugh Dickins
Date: Wed Oct 08 2003 - 10:48:29 EST


On Wed, 8 Oct 2003, Rik van Riel wrote:
>
> Though I suspect it's gotten worse since 2.4.14 or so, where
> we moved the final lru_cache_del() into __free_pages_ok() and
> the fact that anonymous pages are on the lru lists.

I agree both of those make it all more fragile;
but I still don't quite see where it's broken.

The existing shortcuts are normally dealing with either a freed
page or a just allocated, not yet published, page. And you're
right that the anon->swap case is an exception, less obvious.

> It's quite possible that one CPU adds the page to the swap
> cache, while another CPU moves the page around on the inactive
> list. At that point both CPUs could be fiddling around with
> the page->flags simultaneously.

Don't both of those TryLockPage (in one case holding page_table_lock,
in the other case holding pagemap_lru_lock, to hold the page while
doing so)?

> In fact, this has been observed in heavy stress testing by
> Matt Domsch and Robert Hentosh...

Perhaps Matt could add description of what they observed, in support
of the patch. I'm not yet convinced that it's necessarily the fix.

Hugh

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/