Re: Buffer cache hints

Richard Gooch (rgooch@atnf.csiro.au)
Sat, 7 Sep 1996 17:22:53 +1000


Linus Torvalds writes:
>
> On Sat, 7 Sep 1996, Richard Gooch wrote:
> >
> > I do use mmap() sometimes, but I still have to swap the bytes (I
> > just use mmap() and bcopy() instead of read() to read from the
> > file). It would be so nice if the data was in host-natural form, but
> > alas, no.
> > I have one file which is 30 MBytes, and my disc rattles like crazy for
> > a few minutes before all the data has been read and swap-copied into
> > VM. If it wasn't for the unnecessary paging, this would take 15 to 20
> > seconds with my machine with 64 MBytes of RAM.
>
> You'd still be better off with mmap + byte swap in place, than with read
> + byte swap. Rationale:
>
> With "read(large-area)" + "massage(large-area)", you end up swapping things
> out _twice_. When you do the read, the data in the beginning of the read
> buffer gets swapped out when the kernel has to copy the data to the end of
> the read buffer, and then when you do the byte-order stuff it has to be
> swapped in again (and the end of the read buffer gets swapped out).
>
> If you do a mmap(MAP_PRIVATE, PROT_READ|PROT_WRITE), the kernel won't actually
> read the data until you need it, so it will be read just once, and then
> directly massaged without hitting swap in between. The kernel will start
> swapping out the (massaged) pages by the time you've reached the end, but
> you'd still have "won" one swap-out.

This is possible under i386_Linux, but my code has to work on other
platforms too, where I don't just byte-swap, but also resize (i.e. if
sizeof (float) == 8). It could make things very messy.
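
A minimal sketch of the mmap-and-swap-in-place idea quoted above,
assuming the simple case only: 32-bit words, a file size that is a
multiple of 4, and no resizing. The names swap32() and map_and_swap()
are hypothetical and are not taken from either poster's code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical helper: map `path` privately and byte-swap its 32-bit
 * words in place.  Pages are faulted in from the file only when they
 * are first touched by the loop.  MAP_PRIVATE means the swapped data
 * lives in copy-on-write pages; the file on disk is not modified.
 * Returns the mapping, or NULL on failure; *nwords gets the word count.
 */
static uint32_t *map_and_swap(const char *path, size_t *nwords)
{
    struct stat st;
    uint32_t *w;
    size_t n, i;
    void *p;
    int fd;

    assert(nwords != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size == 0 || st.st_size % 4 != 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    w = p;
    n = (size_t)st.st_size / 4;
    for (i = 0; i < n; i++)
        w[i] = swap32(w[i]);
    *nwords = n;
    return w;
}
```

The caller would release the region with munmap(w, n * 4) when done.
This does not cover the resizing case mentioned above, where the
output no longer fits in the mapped pages.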

> Also, if you _know_ that you'll then use the data in some specific sequence,
> you can try to minimize this swapping stage by doing the byte swap in
> reverse: that way when you have byte-swapped all the data you're likely to
> have the start of the data buffer in memory (because that's the part you
> touched the latest). NOTE: this only makes sense if you know that the swap
> is a problem, because generally it's slower going backwards than forward if
> there are no swap effects.

It's possible to do this (a fair bit of stuffing around, though),
but this scheme might reduce performance with other operating systems
where the MM is different.
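
For illustration, the reverse-order pass suggested above might look
like the sketch below, assuming 32-bit words already in memory (from
mmap() or read()). swap_words_backward() is a hypothetical name. As
noted in the quoted text, this is only worth trying when swapping is
known to be the bottleneck.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical helper: byte-swap n words, walking from the end of the
 * buffer towards the start, so that the start of the buffer is the
 * most recently touched part when the pass finishes.
 */
static void swap_words_backward(uint32_t *w, size_t n)
{
    size_t i;

    for (i = n; i-- > 0; )
        w[i] = swap32(w[i]);
}
```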

> You can do the same thing with read (read in small chunks and do the data
> massage in small chunks), but it's generally easier with mmap. And it's a lot
> more likely that you'll see a mmap cache hint in the future than a buffer
> cache hint..

In fact, I already do it in chunks (4 MBytes at the moment): read a
bit, massage a bit, and then on to the next chunk. This improved things
considerably, but there is still room to go.
Well, since my code "knows" that Linux supports mmap(), I could cope
with a mmap-only hint, since my "reads" are really bcopy() from the
mmaped region when the file is mmaped.
I do wonder: how much harder is it to have such a hint for the buffer
cache?
What is the hope of getting a mmap hint in the near future?
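
The chunked scheme described above (read a bit, massage a bit) can be
sketched roughly as follows. This assumes 32-bit words and a plain
read() loop; read_full() and load_swapped() are hypothetical names,
and the chunk size is a parameter (4 MBytes would be 1048576 words).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Read up to len bytes, retrying on short reads; returns bytes read or -1. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t r = read(fd, (char *)buf + done, len - done);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            break;                      /* end of file */
        done += (size_t)r;
    }
    return (ssize_t)done;
}

/*
 * Hypothetical helper: read up to nwords words from fd into dst, one
 * chunk at a time, byte-swapping each chunk right after it is read so
 * that it is massaged while it is still likely to be resident.
 * Returns the number of whole words stored, or -1 on a read error.
 */
static long load_swapped(int fd, uint32_t *dst, size_t nwords,
                         size_t chunk_words)
{
    size_t done = 0;

    assert(chunk_words > 0);
    while (done < nwords) {
        size_t want = nwords - done;
        size_t words, i;
        ssize_t got;

        if (want > chunk_words)
            want = chunk_words;
        got = read_full(fd, dst + done, want * sizeof *dst);
        if (got < 0)
            return -1;
        words = (size_t)got / sizeof *dst;
        for (i = 0; i < words; i++)
            dst[done + i] = swap32(dst[done + i]);
        done += words;
        if ((size_t)got < want * sizeof *dst)
            break;                      /* file ended early */
    }
    return (long)done;
}
```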

Regards,

Richard....