Re: [PATCH 02/20] blkio: Change CFQ to use CFS like queue timestamps

From: Vivek Goyal
Date: Wed Nov 04 2009 - 17:26:08 EST

On Wed, Nov 04, 2009 at 10:18:15PM +0100, Corrado Zoccolo wrote:
> Hi Vivek,
> On Wed, Nov 4, 2009 at 12:43 AM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> > o Previously CFQ had one service tree where queues of all three prio classes
> >  were being queued. One side effect of this time stamping approach is that
> >  now the single tree approach might not work and we need to keep separate
> >  service trees for the three prio classes.
> >
> Single service tree is no longer true in cfq for-2.6.33.
> Now we have a matrix of service trees, with first dimension being the
> priority class, and second dimension being the workload type
> (synchronous idle, synchronous no-idle, async).
> You can have a look at the series: .
> It may have other interesting influences on your work, such as the idle
> introduced at the end of the synchronous no-idle tree, which provides
> fairness also for seeky or high-think-time queues.

Thanks. I am looking at your patches right now. I have one question about
the following commit.

commit a6d44e982d3734583b3b4e1d36921af8cfd61fc0
Author: Corrado Zoccolo <czoccolo@xxxxxxxxx>
Date: Mon Oct 26 22:45:11 2009 +0100

cfq-iosched: enable idling for last queue on priority class

cfq can disable idling for queues in various circumstances.
When workloads of different priorities are competing, if the higher
priority queue has idling disabled, lower priority queues may steal
its disk share. For example, in a scenario with an RT process
performing seeky reads vs a BE process performing sequential reads,
on an NCQ enabled hardware, with low_latency unset,
the RT process will dispatch only the few pending requests every full
slice of service for the BE process.

The patch solves this issue by always performing idle on the last
queue at a given priority class > idle. If the same process, or one
that can pre-empt it (so at the same priority or higher), submits a
new request within the idle window, the lower priority queue won't
dispatch, saving the disk bandwidth for higher priority ones.

Note: this doesn't touch the non_rotational + NCQ case (no hardware
to test if this is a benefit in that case).
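
To make the rule in the commit message above concrete, here is a toy model
of the idling decision as I read it. It is only a sketch, not the
cfq-iosched code: the Queue class, the should_idle() helper and the
RT/BE/IDLE constants are hypothetical names, and "last queue" is assumed to
mean the only busy queue left in its priority class.

```python
from dataclasses import dataclass

# Priority classes, highest first. Illustrative constants, not kernel identifiers.
RT, BE, IDLE = 0, 1, 2


@dataclass
class Queue:
    name: str
    prio_class: int
    idling_enabled: bool  # decision already made on other grounds (e.g. seekiness)


def should_idle(queue, busy_queues):
    """Return True if the scheduler should wait on `queue` before moving on.

    Assumption: "last queue" means the only busy queue in its priority class.
    """
    if queue.idling_enabled:
        return True
    if queue.prio_class == IDLE:
        # The commit message restricts the rule to classes above idle.
        return False
    same_class = [q for q in busy_queues if q.prio_class == queue.prio_class]
    return len(same_class) == 1 and same_class[0] is queue


rt_seeky = Queue("rt-seeky", RT, idling_enabled=False)
be_seq = Queue("be-sequential", BE, idling_enabled=True)
print(should_idle(rt_seeky, [rt_seeky, be_seq]))
```

In the RT seeky reader vs. BE sequential reader scenario from the commit
message, the RT queue is the only one in its class, so this model idles on
it even though idling was otherwise disabled for that queue.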

I am not able to understand the logic of idling on the last queue in a
prio class. This whole patch series seems to be about low latencies. So why
would somebody not set "low_latency" in the IO scheduler? And if somebody
sets "low_latency", then we will enable idling on the random seeky reader
also. So the problem will not exist.

On top of that, even if we don't idle for the RT reader, we will always
preempt the BE reader immediately and get the disk. The only side effect
is that on rotational media, the disk head might have moved, bringing the
overall throughput down.

So my concern is that with this idling on the last queue, we are targeting
a fairness issue for random seeky readers with think time within 8ms.
That can easily be solved by setting low_latency=1. Why are we going
to this length then?
