Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)

From: Corrado Zoccolo
Date: Tue Nov 03 2009 - 13:35:39 EST

Hi Jens,
Jeff did some testing of this patchset on his NCQ-enabled SSD (the
30GB OCZ Vertex).
The test suite contained various scenarios with multiple competing
workloads, and was run on the for-2.6.33 and cfq-2.6.33 branches.

Max latencies were reduced in most cases, and we also saw bandwidth
improvements in some scenarios, especially for multiple random
readers, either alone or competing with writes.
With 2 random readers, aggregate bw increased from 48356 to 74205,
and for 4 random readers vs 1 seq writer:
* aggregate reader bw increased from 35242 to 56400
* writer bandwidth increased from 33269 to 55127
* maximum latency on read decreased from 535 to 324
* maximum latency on writes decreased from 22243 to 1153
It's a win on all measures.
The effect of increasing the number of readers to 32 (latency_test_2.fio)
is even more visible (max read latency reduced from 3305 to 268,
aggregate read BW increased from 32894 to 164571).
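
For a quick sanity check of the figures above, here is a minimal Python
sketch that turns each quoted "from X to Y" pair into a relative change.
The dictionary keys and the helper name percent_change are made up for
illustration; the numbers are copied from the text and keep whatever
units the test suite reported.

```python
# (before, after) pairs copied from the figures quoted in the mail.
# Labels are illustrative; units are whatever the test suite reported.
results = {
    "2 random readers: aggregate bw": (48356, 74205),
    "4 readers vs 1 writer: reader bw": (35242, 56400),
    "4 readers vs 1 writer: writer bw": (33269, 55127),
    "4 readers vs 1 writer: max read latency": (535, 324),
    "4 readers vs 1 writer: max write latency": (22243, 1153),
    "32 readers: max read latency": (3305, 268),
    "32 readers: aggregate read bw": (32894, 164571),
}


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


for name, (before, after) in results.items():
    change = percent_change(before, after)
    print(f"{name}: {before} -> {after} ({change:+.1f}%)")
```

A positive value is an improvement for the bandwidth rows, a negative
value is an improvement for the latency rows.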

The only case where I see an increased max latency is for 2 random
readers vs 1 seq reader (results before the patchset first, then
with it):

randomread.0: read_bw = 15,418K
randomread.1: read_bw = 15,399K
seqread: read_bw = 409K
0: read_bw = 31226
0: read_lat_max = 11.589
0: read_lat_avg = 3.22366666666667

randomread.0: read_bw = 10,065K
randomread.1: read_bw = 10,067K
seqread: read_bw = 101M
0: read_bw = 121132
0: read_lat_max = 303
0: read_lat_avg = 0.282333333333333

but here the increased latency is paid back by a large increase in
sequential read BW (the max latency is, btw, experienced by the seq
reader, so I think it is a fair behaviour).
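
To put that trade-off in numbers, a small Python sketch comparing the two
result blocks above. It assumes the block with the lower max latency is
the run before the patchset (the text describes the latency as
increased), and the variable names are made up for illustration.

```python
# Aggregate figures copied from the two result blocks in the mail.
# Assumption: the lower-latency block is the run before the patchset.
before = {"read_bw": 31226, "read_lat_max": 11.589}
after = {"read_bw": 121132, "read_lat_max": 303}

# How many times larger each metric is after the patchset.
bw_ratio = after["read_bw"] / before["read_bw"]
lat_ratio = after["read_lat_max"] / before["read_lat_max"]

print(f"aggregate read bw: x{bw_ratio:.2f}")
print(f"max read latency:  x{lat_ratio:.2f}")
```

Both ratios come out above 1: the aggregate read bandwidth and the max
read latency both grow in this scenario.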

Jeff observed that the for-2.6.33 numbers were worse than his baseline
runs, probably due to changed hw_tag detection.
My patchset is much less sensitive to hw_tag on SSDs (since there are
far fewer situations in which it would idle), so my numbers are

