Re: [RFC] - Some notions that I would like comments on

Chuck Lever (cel@monkey.org)
Fri, 16 Jul 1999 11:30:22 -0400 (EDT)

Next message: Andrea Arcangeli: "[patch] fix for some minor bh race"
Previous message: Luca Montecchiani: "[2.0.3x] Problem: block on freelist at 0109e7b0 isn't free"

On Fri, 16 Jul 1999, Jamie Lokier wrote:
> Chuck Lever wrote:
> > > > I mean, is "readaround" for block @ 128k-192k triggered by
> > > > reading/paging within block @ 64k-128k, or is it triggered by the first
> > > > read with 128k-192k?
> > >
> > > No, not yet: it's something which we'll probably do eventually.
> > > However, on any vaguely modern hardware, the track buffers on the disk
> > > itself will keep filling once you've submitted one IO. Accessing the
> > > next cluster will take a latency hit on the CPU, but we still get the
> > > full disk bandwidth overall.
> >
> > why trigger a page-in for the next cluster? doubling the cluster size
> > might give the same behavior.
>
> I don't see how it will. Doubling the cluster size just halves the
> number of times when we get the latency hit of a synchronous read of the
> first page in a cluster.

as i understand it, when a page fault occurs and the requested page isn't
already in the page cache, the whole cluster is read in. however, the
read operations are non-blocking -- after all the reads are scheduled,
filemap_nopage waits for the specific requested page.
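
a rough toy model of that behavior, in python rather than kernel C. everything here is an assumption for illustration: the 4k page size, the idea that clusters are aligned to their own size, and the names cluster_pages and handle_fault, none of which exist in the kernel. it only shows the shape of "schedule the whole cluster, block on one page".

```python
PAGE_SIZE = 4096  # assumed page size (4k)

def cluster_pages(fault_offset, cluster_size):
    """Page-aligned offsets of every page in the (assumed aligned) cluster
    that contains fault_offset."""
    start = (fault_offset // cluster_size) * cluster_size
    return list(range(start, start + cluster_size, PAGE_SIZE))

def handle_fault(fault_offset, cluster_size, page_cache):
    """Toy fault handler: schedule reads for every uncached page in the
    cluster, then report the single page the caller would block on."""
    requested = (fault_offset // PAGE_SIZE) * PAGE_SIZE
    scheduled = [p for p in cluster_pages(fault_offset, cluster_size)
                 if p not in page_cache]
    # the real reads are asynchronous; this model just records them as cached
    page_cache.update(scheduled)
    return scheduled, requested

cache = set()
scheduled, waited_on = handle_fault(70 * 1024, 64 * 1024, cache)
```

with a 64k cluster, a fault at offset 70k schedules the sixteen pages of the 64k-128k cluster and waits only on the page holding offset 70k.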

so, if you want a page fault to trigger the next cluster too, an easy way
to do that with the current code base is to schedule all the reads for the
current cluster, then schedule all the reads for the next cluster, then
wait for the requested page. that's almost identical to doubling the
cluster size.
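
to make "almost identical" concrete, here is a small sketch that compares the byte range covered by the two approaches. it assumes clusters are aligned to their own size; the helper names are made up for this example.

```python
KB = 1024

def cluster_range(offset, cluster_size):
    """[start, end) of the aligned cluster containing offset (alignment assumed)."""
    start = (offset // cluster_size) * cluster_size
    return start, start + cluster_size

def current_plus_next(offset, cluster_size):
    """Range covered when the fault also schedules the following cluster."""
    start, end = cluster_range(offset, cluster_size)
    return start, end + cluster_size

def doubled_cluster(offset, cluster_size):
    """Range covered when the cluster size is simply doubled."""
    return cluster_range(offset, 2 * cluster_size)

# fault in the first half of a 128k-aligned region: both cover 128k-256k
same = (current_plus_next(130 * KB, 64 * KB), doubled_cluster(130 * KB, 64 * KB))

# fault in the second half: the ranges diverge
differ = (current_plus_next(200 * KB, 64 * KB), doubled_cluster(200 * KB, 64 * KB))
```

under these assumptions the two schemes read exactly the same bytes whenever the fault lands in the first half of the doubled cluster, and differ only when it lands in the second half.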

however, if the cluster size is 128k and the requested page is in the
second half of the cluster, then you've "read behind." on the other hand,
by triggering the next cluster you get 128k of potentially more
interesting data, since fresh data is more likely to be ahead of the
current page request.
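
a back-of-the-envelope sketch of that trade-off. it averages, over every page-aligned fault offset in one 128k region, how many of the bytes read lie behind the faulting page and how many lie at or ahead of it. the 4k page size, aligned clusters, and uniformly spread faults are all assumptions, and the function names are hypothetical.

```python
KB = 1024
PAGE = 4 * KB  # assumed page size

def window(offset, cluster, trigger_next):
    """[start, end) of the data read for a fault at offset."""
    start = (offset // cluster) * cluster
    end = start + cluster * (2 if trigger_next else 1)
    return start, end

def mean_behind_ahead(cluster, trigger_next, span=128 * KB):
    """Average bytes behind / at-or-ahead of the faulting page, over all
    page-aligned fault offsets in [0, span)."""
    faults = range(0, span, PAGE)
    behind = ahead = 0
    for off in faults:
        start, end = window(off, cluster, trigger_next)
        behind += off - start
        ahead += end - off  # counts the faulting page itself as "ahead"
    n = len(faults)
    return behind / n, ahead / n

doubled = mean_behind_ahead(128 * KB, trigger_next=False)
next_too = mean_behind_ahead(64 * KB, trigger_next=True)
```

both schemes read 128k per fault in this model; the difference is only in how much of it sits behind the faulting page versus ahead of it.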

i may have missed your point, though.

- Chuck Lever

-- corporate: <chuckl@netscape.com> personal: <chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project: http://www.citi.umich.edu/projects/linux-scalability/