Re: [PATCH aio-next] aio: fix race in ring buffer page lookup introduced by page migration support

From: Gu Zheng
Date: Mon Sep 09 2013 - 20:51:58 EST


Hi Ben, Al,

On 09/10/2013 12:02 AM, Benjamin LaHaise wrote:

> Hi Al, Gu,
>
> I've added this patch to my tree at git://git.kvack.org/~bcrl/aio-next.git
> to fix the get_user_pages() issue introduced by Gu's changes in the page
> migration patch. Thanks Al for spotting this.

Thanks very much for spotting and fixing this issue.

Best regards,
Gu

>
> -ben
>
> commit d6c355c7dabcd753a75bc77d150d36328a355267
> Author: Benjamin LaHaise <bcrl@xxxxxxxxx>
> Date: Mon Sep 9 11:57:59 2013 -0400
>
> aio: fix race in ring buffer page lookup introduced by page migration support
>
> Prior to the introduction of page migration support in "fs/aio: Add support
> to aio ring pages migration" / 36bc08cc01709b4a9bb563b35aa530241ddc63e3,
> mapping of the ring buffer pages was done via get_user_pages() while
> retaining mmap_sem held for write. This avoided possible races with userland
> racing an munmap() or mremap(). The page migration patch, however, switched
> to using mm_populate() to prime the page mapping. mm_populate() cannot be
> called with mmap_sem held.
>
> Instead of dropping the mmap_sem, revert to the old behaviour and simply
> drop the use of mm_populate() since get_user_pages() will cause the pages to
> get mapped anyways. Thanks to Al Viro for spotting this issue.
>
> Signed-off-by: Benjamin LaHaise <bcrl@xxxxxxxxx>
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 6e26755..f4a27af 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -307,16 +307,25 @@ static int aio_setup_ring(struct kioctx *ctx)
>  		aio_free_ring(ctx);
>  		return -EAGAIN;
>  	}
> -	up_write(&mm->mmap_sem);
> -
> -	mm_populate(ctx->mmap_base, populate);
>
>  	pr_debug("mmap address: 0x%08lx\n", ctx->mmap_base);
> +
> +	/* We must do this while still holding mmap_sem for write, as we
> +	 * need to be protected against userspace attempting to mremap()
> +	 * or munmap() the ring buffer.
> +	 */
>  	ctx->nr_pages = get_user_pages(current, mm, ctx->mmap_base, nr_pages,
>  				       1, 0, ctx->ring_pages, NULL);
> +
> +	/* Dropping the reference here is safe as the page cache will hold
> +	 * onto the pages for us. It is also required so that page migration
> +	 * can unmap the pages and get the right reference count.
> +	 */
>  	for (i = 0; i < ctx->nr_pages; i++)
>  		put_page(ctx->ring_pages[i]);
>
> +	up_write(&mm->mmap_sem);
> +
>  	if (unlikely(ctx->nr_pages != nr_pages)) {
>  		aio_free_ring(ctx);
>  		return -EAGAIN;

