Re: Block Device Caching

From: FabF
Date: Fri Jul 02 2004 - 07:58:54 EST


On Fri, 2004-07-02 at 14:01, Markus Schaber wrote:
> Hi, Helge,
>
> On Fri, 02 Jul 2004 13:15:35 +0200
> Helge Hafting <helge.hafting@xxxxxxx> wrote:
> > > [Caching problems]
> > >Any Ideas?
> > >
> > Not much. "bear" may have decided to drop cache in one case but not
> > in the other - that is valid although strange. Take a look at what
> > block size the device uses. A mounted device uses the fs blocksize, a
> > partition(or entire disk) may be set to some different blocksize. (I
> > believe it might be the classical 512 byte) Caching blocks
> > with a size different from the page size (usually 4k) has extra
> > overhead and could lead to extra memory pressure and different caching
> > decisions.
> >
> > So I recommend retesting bear with a good blocksize, such as 4k, on
> > the device. That also gives somewhat better performance in general
> > than 512-byte blocks. There is a syscall for changing the block size
> > of a device.
> >
> > (Block size may also be changed by the following hack: mount a fs
> > like ext2 on the device, then umount it. Of course you need to use
> > mkfs so this can't be done on a device that contains data.
> > "mount" sets the device block size to the size used by the fs.
> > After umount, the device is free to be opened directly again, but it
> > retains the new blocksize.)
>
> This sounds reasonable, and so I did some tests wr/t mounting and memory
> pressure:
>
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775776k used, 3177848k free, 458060k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31812k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> frist run needed 12.374819 seconds, this is 80.809263 MBytes/Sec
> second run needed 12.004670 seconds, this is 83.300915 MBytes/Sec
> third run needed 11.898035 seconds, this is 84.047492 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775648k used, 3177976k free, 457904k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31900k cached
>
> bear:~/readcachetest# mount /dev/daten/testing /testinmount/ -t ext2
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775720k used, 3177904k free, 457944k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31860k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> frist run needed 12.199237 seconds, this is 81.972340 MBytes/Sec
> second run needed 12.028633 seconds, this is 83.134966 MBytes/Sec
> third run needed 12.454502 seconds, this is 80.292251 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 1125432k used, 2828192k free, 808008k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31792k cached
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 1000
> frist run needed 11.985626 seconds, this is 83.433272 MBytes/Sec
> second run needed 3.682361 seconds, this is 271.564901 MBytes/Sec
> third run needed 3.638563 seconds, this is 274.833774 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 2150128k used, 1803496k free, 808736k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 1055688k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> frist run needed 12.418232 seconds, this is 80.526761 MBytes/Sec
> second run needed 12.381597 seconds, this is 80.765026 MBytes/Sec
> third run needed 12.185159 seconds, this is 82.067046 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 2149232k used, 1804392k free, 807784k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 1055756k cached
>
> bear:~/readcachetest# umount /testinmount/
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775728k used, 3177896k free, 457872k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31728k cached
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 1000
> frist run needed 12.171772 seconds, this is 82.157306 MBytes/Sec
> second run needed 11.842943 seconds, this is 84.438471 MBytes/Sec
> third run needed 12.167059 seconds, this is 82.189131 MBytes/Sec
>
> bear:~/readcachetest# top | head -n 8 | tail -n 2
> Mem: 3953624k total, 775656k used, 3177968k free, 457876k buffers
> Swap: 1048568k total, 0k used, 1048568k free, 31860k cached
>
> So files from the mounted fs get cached, but not direct reads from the
> block device, although the lvm volume is mounted, and enough memory is
> available.
>
> I currently suspect this to be a problem with the cciss controller, see
> my other mail to the list...
>
> > You may also want to test with a smaller count. Reading directly from
> > pagecache should be noticeably faster than reading from the cache on
> > the raid controller, although either obviously is way faster than
> > reading from actual disks.
>
> bear:~/readcachetest# ./readcache /dev/daten/testing 100
> frist run needed 1.202330 seconds, this is 83.171841 MBytes/Sec
> second run needed 0.349686 seconds, this is 285.970842 MBytes/Sec
> third run needed 0.346644 seconds, this is 288.480401 MBytes/Sec
>
> bear:~/readcachetest# mount /dev/daten/testing /testinmount/ -t ext2
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 100
> frist run needed 1.205797 seconds, this is 82.932699 MBytes/Sec
> second run needed 0.367004 seconds, this is 272.476594 MBytes/Sec
> third run needed 0.362783 seconds, this is 275.646874 MBytes/Sec
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 100
> frist run needed 0.363097 seconds, this is 275.408500 MBytes/Sec
> second run needed 0.363002 seconds, this is 275.480576 MBytes/Sec
> third run needed 0.362852 seconds, this is 275.594457 MBytes/Sec
>
> bear:~/readcachetest# umount /testinmount/
>
> bear:~/readcachetest# mount /dev/daten/testing /testinmount/ -t ext2
>
> bear:~/readcachetest# ./readcache /testinmount/test.dump 100
> frist run needed 1.189227 seconds, this is 84.088235 MBytes/Sec
> second run needed 0.367456 seconds, this is 272.141426 MBytes/Sec
> third run needed 0.363139 seconds, this is 275.376646 MBytes/Sec
>
> Hmm, I'm somehow irritated now, as 275 MB/Sec seems rather low for reads
> from the page cache on a 2.8Ghz Xeon machine (even my laptop gets about 400
> MB/Sec with this, and Skate even reaches 3486.385664 MBytes/Sec.)
>
>
> No idea...
> Markus.
Did you played with hdparm -m <device> ?

Regards,
FabF

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/