Large write = large latency for small writes

From: David Rees
Date: Tue Mar 17 2009 - 16:19:32 EST


There is a fairly active bug 12309 "Large I/O operations result in
slow performance and high iowait times" [1] which has gotten rather
large an unwieldy due to the number of participants and confusion due
to what appears to be multiple bugs in either the schedule or IO
request merging layers.

I encountered this bug when seeing some horrible latency on a NFS
client while the server was performing heavy IO [2] and initially
thought it was a NFS bug, but later found that NFS only makes it worse
because writes over NFS are synchronous by default.

I have a simple test case which demonstrates the huge increase in
write latency that occurs for small writes when a large disk
saturating write is also in progress [3]:

dd if=/dev/zero of=/tmp/bigfile bs=1M count=10000 conv=fdatasync &
sleep 10
time dd if=/dev/zero of=/tmp/smallfile bs=4k count=1 conv=fdatasync

On a handful of systems I have access to, it took anywhere from 6-45
seconds for the small write to complete. Others in the bug have
reproduced this across a number of filesystems (ext3, reiserfs, ext4).
xfs in particular seems to handle this test case better than the
others. As do systems which can sustain high write speeds.

The only way I've been able to reduce the latency to acceptable levels
is to drop vm.dirty_background_ratio and vm.dirty_ratio to 1 and 2
respectively - but even then on some systems the small write still
takes 5-7 seconds.

The systems I've been testing on are all Fedora 9 or 10 systems with
the latest kernels (basically 2.6.27.19), ext3 filesystems and various
amounts of CPU, memory and disk subsystems (but nothing too new or
fast).

Any ideas? How much latency is acceptable with the test case?

-Dave

[1] http://bugzilla.kernel.org/show_bug.cgi?id=12309
[2] http://marc.info/?l=linux-nfs&m=123697692631683&w=2
[3] http://bugzilla.kernel.org/show_bug.cgi?id=12309#c249
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/