[PATCH RFC] mm/readahead: improve randread performance with readahead disabled

From: Yu Kuai
Date: Tue Jul 01 2025 - 07:16:04 EST


From: Yu Kuai <yukuai3@xxxxxxxxxx>

We have a workload of random 4k-128k reads on an HDD. From iostat we
observed that the average request size was 256k+ and bandwidth was
100MB/s+; this is because readahead wastes a lot of disk bandwidth.
Hence we disabled readahead, and performance from the user side was
indeed much better (2x+); however, from iostat we observed that the
request size was just 4k and bandwidth was only around 40MB/s.

We then ran a simple dd test and found that if readahead is disabled,
page_cache_sync_ra() forces reads of one page at a time. This really
doesn't make sense, because we can simply issue a request of the
user-requested size to the disk.
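
The behaviour can be reproduced along the following lines (device name
and sizes here are illustrative, not from the original report):

	# disable readahead on the backing device
	echo 0 > /sys/block/sda/queue/read_ahead_kb
	# buffered 128k reads; without this patch, iostat shows 4k requests
	dd if=/dev/sda of=/dev/null bs=128k count=1024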

Fix this problem by removing the one-page-at-a-time limit from
page_cache_sync_ra(), so that random read workloads can get better
performance with readahead disabled.
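
With the patch applied, the resulting flow in page_cache_sync_ra() is
roughly the following (a simplified sketch, not the literal kernel
source; force_page_cache_ra() still caps and splits the I/O against the
backing device's limits):

	if (blk_cgroup_congested()) {
		if (!ractl->file)
			return;
		/* congested: fall back to one page so the read still progresses */
		req_count = 1;
		do_forced_ra = true;
	} else if (!ra->ra_pages) {
		if (!ractl->file)
			return;
		/* readahead disabled: keep the caller's full req_count */
		do_forced_ra = true;
	}

	if (do_forced_ra) {
		force_page_cache_ra(ractl, req_count);
		return;
	}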

PS: I'm not sure if I'm missing anything, so this version is an RFC.
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
mm/readahead.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 20d36d6b055e..1df85ccba575 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -561,13 +561,21 @@ void page_cache_sync_ra(struct readahead_control *ractl,
 	 * Even if readahead is disabled, issue this request as readahead
 	 * as we'll need it to satisfy the requested range. The forced
 	 * readahead will do the right thing and limit the read to just the
-	 * requested range, which we'll set to 1 page for this case.
+	 * requested range.
 	 */
-	if (!ra->ra_pages || blk_cgroup_congested()) {
+	if (blk_cgroup_congested()) {
 		if (!ractl->file)
 			return;
+		/*
+		 * If the cgroup is congested, ensure to do at least 1 page of
+		 * readahead to make progress on the read.
+		 */
 		req_count = 1;
 		do_forced_ra = true;
+	} else if (!ra->ra_pages) {
+		if (!ractl->file)
+			return;
+		do_forced_ra = true;
 	}
 
 	/* be dumb */
--
2.39.2