Re: Excessive stall times on ext4 in 3.9-rc2

From: Theodore Ts'o
Date: Wed Apr 10 2013 - 09:13:37 EST


On Wed, Apr 10, 2013 at 11:56:08AM +0100, Mel Gorman wrote:
> During major activity there is likely to be "good" behaviour
> with stalls roughly every 30 seconds roughly corresponding to
> dirty_expire_centiseconds. As you'd expect, the flusher thread is stuck
> when this happens.
>
> 237 ? 00:00:00 flush-8:0
> [<ffffffff811a35b9>] sleep_on_buffer+0x9/0x10
> [<ffffffff811a35ee>] __lock_buffer+0x2e/0x30
> [<ffffffff8123a21f>] do_get_write_access+0x43f/0x4b0

If we're stalling on lock_buffer(), that implies that the buffer was
being written, and that for some reason the write was taking a very
long time to complete.

It might be worthwhile to put a timestamp in struct dm_crypt_io, and
record the time when a particular I/O encryption/decryption gets
queued to the kcryptd workqueues, and the time when it finally squirts
out the other end.
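
To make the timestamp idea concrete, here is a rough userspace sketch of
the bookkeeping, not actual dm-crypt code. The struct and helper names
(crypt_io_timing, crypt_io_mark_queued, and so on) are made up for
illustration; in the kernel the fields would presumably live in struct
dm_crypt_io and be filled in with something like ktime_get() rather than
clock_gettime().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the timing fields one might add to
 * struct dm_crypt_io; nothing here is taken from dm-crypt itself. */
struct crypt_io_timing {
	uint64_t queued_ns;    /* when the I/O was handed to the workqueue */
	uint64_t completed_ns; /* when the crypto work finished; 0 if pending */
};

/* Monotonic clock in nanoseconds (userspace analogue of a kernel clock). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Call at the point where the I/O is queued for encryption/decryption. */
static void crypt_io_mark_queued(struct crypt_io_timing *t)
{
	assert(t != NULL);
	t->queued_ns = now_ns();
	t->completed_ns = 0;
}

/* Call at the point where the processed I/O comes back out. */
static void crypt_io_mark_completed(struct crypt_io_timing *t)
{
	assert(t != NULL);
	t->completed_ns = now_ns();
}

/* Time spent between queueing and completion, in nanoseconds.
 * Returns 0 if the I/O has not completed yet. */
static uint64_t crypt_io_latency_ns(const struct crypt_io_timing *t)
{
	assert(t != NULL);
	if (t->completed_ns < t->queued_ns)
		return 0;
	return t->completed_ns - t->queued_ns;
}
```

Logging crypt_io_latency_ns() whenever it exceeds some threshold would
show whether the long waits line up with time spent sitting in the
kcryptd queues.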

Something else that might be worth trying is to add WQ_HIGHPRI to the
flags used for the kcryptd workqueues and see if that makes a
difference.

- Ted