Re: Pagecache: find_or_create_page does not call a proper pageallocator function

From: Andrew Morton
Date: Tue Apr 24 2007 - 15:50:39 EST


On Tue, 24 Apr 2007 12:34:53 -0700 (PDT) Christoph Lameter <clameter@xxxxxxx> wrote:

> > Not as metadata, no. But someone (let's hope only root, though I may
> > be wrong on that) can map any part of the block device into userspace.
>
> Concurrent access to a block device by a filesystem and the user? That
> cannot go over well. If one just reads then I would expect that a copy
> of the metadata becomes available to the user. Also you cannot migrate
> pages that have multiple references (which is the case here if the
> filesystem uses the page cache for the metadata) unless the user has
> special priviledges and uses special command options.
>
> A page that has references that cannot be accounted for by page migration
> is never migrated. I would assume that the filesystem at minimum takes a
> refcount on the page used for metadata.
>
> If the filesystem would not take a refcount then it would already be in
> trouble because the page may then be evicted at any time.

No, think of the following scenario:

- file I/O causes a read of an ext2 file's bitmap. The bitmap is
brought into /dev/hda1's pagecache using !__GFP_HIGHMEM

- references are released against that page and it's now just clean
reclaimable pagecache

- someone (say, an online filesystem checker or something) mmaps
/dev/hda1 and reads that page.

- migration comes alnog and migrates that page into highmem

- file I/O causes a read of that bitmap again. We find it in
/dev/hda's pagecache.

Here's set_bh_page().

void set_bh_page(struct buffer_head *bh,
struct page *page, unsigned long offset)
{
bh->b_page = page;
BUG_ON(offset >= PAGE_SIZE);
if (PageHighMem(page))
/*
* This catches illegal uses and preserves the offset:
*/
bh->b_data = (char *)(0 + offset);
else
bh->b_data = page_address(page) + offset;
}

- ext2 now tries to access the bits in the bitmap via page->bh->b_data

- game over
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/