Re: [RFC] splice() and readahead interaction

From: Andrew Morton
Date: Mon May 07 2007 - 17:55:37 EST


On Sat, 5 May 2007 05:04:29 -0400
"Fengguang Wu" <fengguang.wu@xxxxxxxxx> wrote:

> Readahead logic somehow fails to populate the page range with data.
> It can be because
> 1) the readahead routine is not always called in the following lines of
> fs/splice.c:
> if (!loff || nr_pages > 1)
> page_cache_readahead(mapping, &in->f_ra, in, index,
> nr_pages);
> 2) even when called, page_cache_readahead() won't guarantee the pages are
> there. It won't submit readahead I/O for pages already in the radix tree,
> or when (ra_pages == 0), or after 256 cache hits.
>
> In your case, it is probably because of the retried reads, which lead to
> excessive cache hits and eventually disable readahead.
>
> And that _one_ failure of readahead blocks the whole read process.
> The application receives EAGAIN and retries the read, but
> __generic_file_splice_read() refuses to make progress:
> - in the previous invocation, it allocated a blank page and inserted it
> into the radix tree, but never had the chance to start I/O for it: the test
> of SPLICE_F_NONBLOCK comes before that.
> - in the retried invocation, the readahead code will neither get out of the
> cache hit mode, nor submit I/O for an already existing page.
>
> The attached patch should fix the critical splice bug. Sorry for not being
> able to test it locally for now - I'm at home and running knoppix. And the
> readahead bug will be fixed by the upcoming on-demand readahead patch. I
> should be back to submit it in about a week.
>
> Thank you,
> Fengguang Wu
>
>
> [splice-nonblock-fix.patch text/x-patch (506B)]
> --- linux-2.6.21.1/fs/splice.c.old 2007-05-05 04:40:38.000000000 -0400
> +++ linux-2.6.21.1/fs/splice.c 2007-05-05 04:41:59.000000000 -0400
> @@ -378,10 +378,11 @@
> * If in nonblock mode then dont block on waiting
> * for an in-flight io page
> */
> - if (flags & SPLICE_F_NONBLOCK)
> - break;
> -
> - lock_page(page);
> + if (flags & SPLICE_F_NONBLOCK) {
> + if (TestSetPageLocked(page))
> + break;
> + } else
> + lock_page(page);
>
> /*
> * page was truncated, stop here. if this isn't the

So.. afaik we're awaiting testing results for this change?
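
For anyone following along, the stall described above and the effect of the
patch can be modeled in user space. This is only a rough sketch, not kernel
code: struct model_page, enum pass_result and the model_*() helpers are
hypothetical names made up for illustration, the flags merely imitate
PG_locked and PG_uptodate, and the real function does considerably more
(truncation checks, pipe buffer setup) than is shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page; not kernel code. */
struct model_page {
	bool locked;		/* imitates PG_locked: held while I/O is in flight */
	bool uptodate;		/* imitates PG_uptodate */
	bool io_started;	/* set once a read has been submitted */
};

/* What one nonblocking pass did with a page (illustrative only). */
enum pass_result { PASS_READY, PASS_SKIPPED, PASS_IO_STARTED };

/* Imitates TestSetPageLocked(): set the lock, return the previous state. */
static bool model_test_set_locked(struct model_page *p)
{
	bool was_locked = p->locked;

	p->locked = true;
	return was_locked;
}

/* Imitates I/O completion: the page becomes uptodate and is unlocked. */
static void model_complete_io(struct model_page *p)
{
	if (p->io_started) {
		p->uptodate = true;
		p->locked = false;
	}
}

/*
 * One SPLICE_F_NONBLOCK pass over a single page.
 * patched == false: bail out before any I/O can be started (old code).
 * patched == true:  try the lock; only bail out if someone else holds it.
 */
static enum pass_result model_splice_pass(struct model_page *p, bool patched)
{
	if (p->uptodate)
		return PASS_READY;
	if (!patched)
		return PASS_SKIPPED;	/* blank page never gets its I/O */
	if (model_test_set_locked(p))
		return PASS_SKIPPED;	/* I/O already in flight, do not wait */
	p->io_started = true;		/* we own the lock: submit the read */
	return PASS_IO_STARTED;
}
```

With patched == false, repeated passes over a blank page keep returning
PASS_SKIPPED and io_started stays false, which corresponds to the endless
EAGAIN retries in the report. With patched == true, the first pass starts the
read, a second pass skips the in-flight page without blocking, and a pass
after model_complete_io() finds the page ready.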
