Re: [RFC] kernel facilities for cache prefetching

From: Wu Fengguang
Date: Wed May 03 2006 - 00:10:27 EST


On Tue, May 02, 2006 at 08:55:06AM -0700, Linus Torvalds wrote:
> Doing prefetching on a physical block basis is simply not a valid
> approach, for several reasons:

Sorry, my introduction was misleading. I'll try to explain it in more
detail.

DATA ACQUISITION

/proc/filecache provides an interface to query the cached pages of any
file. This information is expressed in tuples of <idx, len>, which
more specifically means <mapping-offset, pages>.
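
For a prefetcher, the page tuples usually have to be turned into byte
ranges. A minimal sketch, assuming the interface prints one tuple per
line as two whitespace-separated decimal numbers (the exact output
format is not fixed here) and a 4096-byte page; the function name
tuples_to_bytes is made up for illustration:

```shell
# Convert "<idx> <len>" page tuples on stdin into "<byte offset> <byte length>".
# Assumption: one tuple per line, decimal numbers; other lines are skipped.
# PAGE_SIZE defaults to 4096; `getconf PAGESIZE` reports the real value.
tuples_to_bytes() {
    awk -v ps="${PAGE_SIZE:-4096}" '
        NF >= 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            printf "%d %d\n", $1 * ps, $2 * ps
        }'
}
```

With this, 'cat /proc/filecache | tuples_to_bytes' would give ranges
that can be handed to whatever readahead call the user-land tool uses.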

Normally one should use 'echo' to set up two parameters before doing
'cat':
	@file
		the filename;
		use 'ALL' to get a list of all cached files
	@mask
		only show the pages with non-zero (page-flags & @mask);
		for simplicity, use '0' to show all present pages
		(0 is taken as ~0)

Normally, one should first get the file list using param 'file ALL',
and then iterate through all the files and pages of interest with
params 'file filename' and 'mask pagemask'.

The param 'mask' acts as a filter for different users: it allows
sysadmins to see where their memory goes, and the prefetcher to ignore
pages from false readahead.

One can use 'mask hex(PG_active|PG_referenced|PG_mapped)' to show only
accessed pages (here PG_mapped is a fake flag), and use
'mask hex(PG_dirty)' to show only dirtied pages. Here hex(X) stands for
the value of X written in hex.
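
As a rough sketch, such a mask can be assembled in the shell from flag
bit numbers. The bit numbers used below are placeholders only: the real
page-flag positions depend on the kernel version, and PG_mapped is the
fake flag mentioned above. mask_from_bits is a hypothetical helper name.

```shell
# Build a hex mask from a list of bit numbers given as arguments.
mask_from_bits() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%x\n' "$mask"
}

# Placeholder bit positions, NOT the real PG_* values:
# mask_from_bits 2 6 20
```

The resulting string would then be written as 'mask <value>' to
/proc/filecache.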

One can use
	$ echo "file /sbin/init" > /proc/filecache
	$ echo "mask 0" > /proc/filecache
	$ cat /proc/filecache
to get an idea which pages of /sbin/init are currently cached.
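
The set-parameters-then-cat sequence, and the iteration over many
files, can be wrapped in small helpers. This is only a sketch: the
names query_cached and dump_all are invented, the list of file names is
assumed to be already available one per line (the format of the
'file ALL' listing is not specified here), and FILECACHE is made
overridable so the helpers can be tried on a kernel without the patch.

```shell
# Path of the proposed interface; override FILECACHE to experiment elsewhere.
FILECACHE=${FILECACHE:-/proc/filecache}

# query_cached FILE [MASK]: set both parameters, then print the result.
query_cached() {
    echo "file $1" > "$FILECACHE" || return 1
    echo "mask ${2:-0}" > "$FILECACHE" || return 1
    cat "$FILECACHE"
}

# dump_all MASK: read file names on stdin, one per line, and print the
# cached ranges of each file under a "# name" header line.
dump_all() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        printf '# %s\n' "$name"
        query_cached "$name" "$1" || return 1
    done
}
```

For example, 'dump_all 0 < filelist' would record every present page of
every listed file.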

In the proposal, I used the following example, which proved to be
rather misleading:
	$ echo "file /dev/hda1" > /proc/filecache
	$ cat /proc/filecache
The intention of that example was to show that filesystem dir/inode
buffer status -- which is the key data for user-land pre-caching --
can also be retrieved through this interface.

So the proposed solution is to
- prefetch normal files on the virtual mapping level
- prefetch fs dir/inode buffers on a physical block basis

I/O SUBMISSION
How can we avoid unnecessary seeks when prefetching at the virtual
mapping level? The answer is to leave this job to the I/O elevators.
What we should do is present the elevators with most of the readahead
requests before too many requests have been submitted to the disk
drivers.
The proposed scheme is to:
	1) (first things first)
	   issue all readahead requests for filesystem buffers
	2) (in background, often blocked)
	   issue all readahead requests for normal files
	   -) make sure the above requests are of really _low_ priority
	3) regular system boot continues
	4) promote the priority of any request that is now demanded by
	   legacy programs
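
A very coarse user-land approximation of step 2 could look like the
sketch below. It is not the proposed mechanism: it reads whole files
rather than the recorded page ranges, and it covers neither step 1 nor
step 4, which need the kernel support described in this mail. It
assumes the ionice tool (idle class, '-c 3') is installed and falls
back to normal priority otherwise; prefetch_files and low_prio are
made-up names.

```shell
# Run a command in the idle I/O class when ionice is available.
if command -v ionice >/dev/null 2>&1; then
    low_prio() { ionice -c 3 "$@"; }
else
    low_prio() { "$@"; }
fi

# prefetch_files: read every regular file named on stdin once, at low
# priority, and print how many files were read.
prefetch_files() {
    count=0
    while IFS= read -r f; do
        if [ -f "$f" ]; then
            low_prio cat -- "$f" > /dev/null && count=$((count + 1))
        fi
    done
    echo "$count"
}

# Run in the background so that the regular boot continues (step 3):
# prefetch_files < filelist &
```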

In the scheme, most work is done by user-land tools. The required
kernel support is minimal and general-purpose:
- a /proc/filecache interface
- the ability to promote I/O priority on demanded pages

With this approach we avoid the complicated OSX bootcache solution,
which is based on physical blocks and needs special handling in the
kernel, exactly what Linus is against.

Thanks,
Wu