Re: Higher than expected disk write(2) latency

From: Jeff Moyer
Date: Wed Jul 02 2008 - 14:16:22 EST


Martin Sustrik <sustrik@xxxxxxxxxx> writes:

> Hi Roger,
>
>>> Fair enough. That explains the behaviour. Would AIO help here? If
>>> we are able to enqueue next write before the first one is finished,
>>> it can start writing it immediately without waiting for a
>>> revolution.
>>
>> If you could get them queued at the disk level, the things to watch
>> would be whether the disk can queue requests at all (and whether the
>> controller and driver support it), how many requests the disk can
>> queue, and how large each of them can be. If they aren't queued at
>> the disk, there is a chance that the machine cannot get the data to
>> the disk fast enough for that next sector.
>>
>> I have always avoided fully sync operations, as things *ALWAYS* got
>> really, really slow because of all the requirements needed to make
>> sure the data always got to disk correctly on an unexpected crash.
>> In the type of applications I typically dealt with, if the machine
>> crashed, the data currently being output was known to be incomplete
>> and generally useless, so things were rerun.
>>
>> Depending on your application you could always get a small fast
>> solid state device (no seek or RPM issues), and use it to keep a
>> journal that could be replayed on an unexpected crash...and then
>> just use various syncs to force things to disk at various points.
>
> We've tried AIO and the results are quite disappointing. If you open
> the file with O_SYNC, the latencies are the same as with sync I/O -
> each write takes 8.3ms (7200rpm disk).

I thought you were doing I/O to the underlying block device. If so,
there's no need to open with O_SYNC. You do, however, need to open the
device with O_DIRECT and align your buffers (and buffer lengths)
properly.
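
As a rough illustration of the alignment requirement, here is a minimal
sketch of a single O_DIRECT write. The helper names (dio_round_up,
dio_write_block) and the 4096-byte alignment are assumptions made for
the example; the real requirement depends on the device's logical block
size, which you would normally query rather than hard-code.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment: 4096 is a multiple of both 512-byte and 4 KiB sectors. */
#define DIO_ALIGN 4096

/* Round len up to the next multiple of DIO_ALIGN. */
static size_t dio_round_up(size_t len)
{
    return (len + DIO_ALIGN - 1) / DIO_ALIGN * DIO_ALIGN;
}

/*
 * Hypothetical helper: write len bytes of data at an aligned offset,
 * zero-padding the tail so that the buffer address, the buffer length
 * and the file offset are all multiples of DIO_ALIGN.
 * Returns the number of bytes written (the padded length), or -1 with
 * errno set.
 */
static ssize_t dio_write_block(const char *path, const void *data,
                               size_t len, off_t offset)
{
    void *buf = NULL;
    size_t padded = dio_round_up(len);

    if (len == 0 || offset % DIO_ALIGN != 0) {
        errno = EINVAL;                    /* reject unaligned requests early */
        return -1;
    }
    int rc = posix_memalign(&buf, DIO_ALIGN, padded);
    if (rc != 0) {
        errno = rc;                        /* posix_memalign returns the error */
        return -1;
    }
    memset(buf, 0, padded);
    memcpy(buf, data, len);

    int fd = open(path, O_WRONLY | O_DIRECT);
    if (fd < 0) {
        int saved = errno;
        free(buf);
        errno = saved;
        return -1;
    }
    ssize_t n = pwrite(fd, buf, padded, offset);
    int saved = errno;
    close(fd);
    free(buf);
    errno = saved;
    return n;
}
```

A call such as dio_write_block("/dev/sdX", rec, reclen, 0) would then
bypass the page cache; the device path here is only a placeholder.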

Which AIO interface are you using, libaio or librt? How many I/Os are
you queueing to the device? You may want to take a look at aio-stress.c
as a way to test your device (this uses libaio, the in-kernel AIO
interface).
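
To make the queueing question concrete, here is a minimal sketch that
submits several sequential writes in one go and then waits for all of
them. It calls the in-kernel AIO system calls directly through
syscall(2) instead of the libaio wrappers, only so that the example has
no library dependency; this is not how aio-stress.c is written. The
helper names (prep_pwrite, write_batch) and the batch limit of 64 are
assumptions for illustration. The file descriptor is assumed to have
been opened with O_DIRECT and the buffer to be suitably aligned, since
submission may otherwise block.

```c
#define _GNU_SOURCE
#include <linux/aio_abi.h>   /* struct iocb, struct io_event, IOCB_CMD_PWRITE */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill one control block describing a write of len bytes at offset off. */
static void prep_pwrite(struct iocb *cb, int fd, void *buf,
                        size_t len, long long off)
{
    memset(cb, 0, sizeof(*cb));
    cb->aio_lio_opcode = IOCB_CMD_PWRITE;
    cb->aio_fildes = (uint32_t)fd;
    cb->aio_buf = (uint64_t)(uintptr_t)buf;
    cb->aio_nbytes = len;
    cb->aio_offset = off;
}

/*
 * Hypothetical helper: queue n back-to-back writes of blk bytes each
 * with a single io_submit call, then wait for every submitted request.
 * Returns the number of writes that transferred a full block, or -1 if
 * the arguments are invalid or nothing could be submitted.
 */
static int write_batch(int fd, char *buf, size_t blk, int n)
{
    enum { MAX_BATCH = 64 };               /* assumed upper bound on queue depth */
    struct iocb cbs[MAX_BATCH];
    struct iocb *ptrs[MAX_BATCH];
    struct io_event events[MAX_BATCH];
    aio_context_t ctx = 0;                 /* must be zero before io_setup */

    if (n < 1 || n > MAX_BATCH)
        return -1;
    if (syscall(SYS_io_setup, n, &ctx) < 0)
        return -1;

    for (int i = 0; i < n; i++) {
        prep_pwrite(&cbs[i], fd, buf + (size_t)i * blk, blk,
                    (long long)i * (long long)blk);
        ptrs[i] = &cbs[i];
    }

    int ok = -1;
    long submitted = syscall(SYS_io_submit, ctx, (long)n, ptrs);
    if (submitted > 0) {
        /* Block until all submitted requests have completed. */
        long done = syscall(SYS_io_getevents, ctx, submitted, submitted,
                            events, NULL);
        ok = 0;
        for (long i = 0; i < done; i++)
            if (events[i].res == (long long)blk)
                ok++;
    }
    syscall(SYS_io_destroy, ctx);
    return ok;
}
```

The point of the design is that all n requests are handed to the kernel
before any of them completes, so the block layer has a chance to keep
the device busy. Whether that avoids the lost revolution still depends
on the queueing support in the disk, controller and driver.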

Cheers,

Jeff