Re: [PATCH 00/33] Adaptive read-ahead V12
From: Andrew Morton
Date: Thu May 25 2006 - 11:44:29 EST
Wu Fengguang <wfg@xxxxxxxxxxxxxxxx> wrote:
> This is the 12th release of the adaptive readahead patchset.
> It has been tested with a wide range of applications over the past
> six months, and has been polished up considerably.
> Please consider it for inclusion in -mm tree.
> Performance benefits
> Besides file servers and desktops, it was recently found to benefit
> postgresql databases a lot.
> I explained to pgsql users how the patch may help their db performance:
> HOW IT WORKS
> In adaptive readahead, the context based method may be of particular
> interest to postgresql users. It works by peeking into the file cache
> and checking whether any history pages are present or accessed. In this
> way it can detect almost all forms of sequential / semi-sequential read
> patterns, e.g.
> - parallel / interleaved sequential scans on one file
> - sequential reads across file open/close
> - mixed sequential / random accesses
> - sparse / skimming sequential read
> It also has methods to detect some less common cases:
> - reading backward
> - seeking all over reading N pages
> WAYS TO BENEFIT FROM IT
> As we know, postgresql relies on the kernel to do proper readahead.
> The adaptive readahead might help performance in the following cases:
> - concurrent sequential scans
> - sequential scan on a fragmented table
> (some DBs suffer from this problem, not sure for pgsql)
> - index scan with clustered matches
> - index scan on majority rows (in case the planner goes wrong)
> And received positive responses:
> [QUOTE from Michael Stone]
> I've got one DB where the VACUUM ANALYZE generally takes 11M-12M ms;
> with the patch the job took 1.7M ms. Another VACUUM that normally takes
> between 300k-500k ms took 150k. Definitely a promising addition.
> [QUOTE from Michael Stone]
> >I'm thinking about it, we're already using a fixed read-ahead of 16MB
> >using blockdev on the stock Redhat 2.6.9 kernel, it would be nice to
> >not have to set this so we may try it.
> FWIW, I never saw much performance difference from doing that. Wu's
> patch, OTOH, gave a big boost.
> [QUOTE: odbc-bench with Postgresql 7.4.11 on dual Opteron]
> Base kernel:
> Transactions per second: 92.384758
> Transactions per second: 99.800896
> After read-ahead, vm.readahead_ratio = 100:
> Transactions per second: 105.461952
> Transactions per second: 105.458664
> vm.readahead_ratio = 100 ; vm.readahead_hit_rate = 1:
> Transactions per second: 113.055367
> Transactions per second: 124.815910
These are nice-looking numbers, but one wonders. If optimising readahead
makes this much difference to postgresql performance then postgresql should
be doing the readahead itself, rather than relying upon the kernel's
ability to guess what the application will be doing in the future. Because
surely the database can do a better job of that than the kernel.
That would involve using posix_fadvise(POSIX_FADV_RANDOM) to disable kernel
readahead and then using posix_fadvise(POSIX_FADV_WILLNEED) to launch
application-level readahead.
Has this been considered or attempted?
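
A minimal sketch of what such application-level readahead might look like
for a plain sequential scan. This is not taken from postgresql or from the
patchset: the function name scan_with_prefetch, the window parameter and
the block size are hypothetical, and only posix_fadvise() and pread() are
real interfaces. Hint failures are ignored, since the advice is advisory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose posix_fadvise() and pread() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read fd sequentially from offset 0 in `block`-sized
 * chunks while keeping WILLNEED hints `window` bytes ahead of the reader.
 * Returns the number of bytes read, or -1 if pread() fails.
 */
long long scan_with_prefetch(int fd, char *buf, size_t block, off_t window)
{
    off_t pos = 0;      /* next offset the application will read  */
    off_t hinted = 0;   /* end of the range already hinted so far */
    long long total = 0;

    assert(buf != NULL && block > 0 && window > 0);

    /* Ask the kernel not to apply its own readahead heuristics. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);

    for (;;) {
        /* Hint only the part of the window not yet requested. */
        if (hinted < pos + window) {
            (void)posix_fadvise(fd, hinted, pos + window - hinted,
                                POSIX_FADV_WILLNEED);
            hinted = pos + window;
        }

        ssize_t n = pread(fd, buf, block, pos);
        if (n < 0)
            return -1;
        if (n == 0)     /* end of file */
            break;
        pos += n;
        total += n;
    }
    return total;
}
```

A real database would presumably drive the WILLNEED hints from its own
plan (for instance, the block numbers an index scan is about to visit)
rather than from a fixed window; the window here only stands in for that
knowledge.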