Re: performance "regression" in cfq compared to anticipatory, deadline and noop

From: Jens Axboe
Date: Wed May 14 2008 - 04:26:49 EST


On Wed, May 14 2008, Daniel J Blueman wrote:
> > > > > > > > On Sun, 2008-05-11 at 14:14 +0100, Daniel J Blueman wrote:
> > > > > > > > > I've been experiencing this for a while also; an almost 50% regression
> > > > > > > > > is seen for single-process (i.e. synchronous) reads if slice_idle is
> > > > > > > > > 1ms or more (e.g. the default of 8) [1], which seems phenomenal.
> > > > > > > > >
> > > > > > > > > Jens, is this the expected price to pay for optimal busy-spindle
> > > > > > > > > scheduling, a design issue, a bug, or am I missing something totally?
> [snip]
> > > They seem to start out the same, but then CFQ gets interrupted by a
> > > timer unplug (which is also odd) and after that the request size drops.
> > > On most devices you don't notice, but some are fairly picky about
> > > request sizes. The end result is that CFQ has an average dispatch
> > > request size of 142KB, where AS is more than double that at 306KB. I'll
> > > need to analyze the data and look at the code a bit more to see WHY this
> > > happens.
> >
> > Here's a test patch; I think we get into this situation due to CFQ being
> > a bit too eager to start queueing again. It's not tested yet, as I'll
> > need to spend some testing time on this, but I'd appreciate some feedback
> > on whether it changes the situation! The final patch will be a little
> > more involved.
> >
> > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > index b399c62..ebd8ce2 100644
> > --- a/block/cfq-iosched.c
> > +++ b/block/cfq-iosched.c
> > @@ -1775,18 +1775,8 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
> >
> > cic->last_request_pos = rq->sector + rq->nr_sectors;
> >
> > - if (cfqq == cfqd->active_queue) {
> > - /*
> > - * if we are waiting for a request for this queue, let it rip
> > - * immediately and flag that we must not expire this queue
> > - * just now
> > - */
> > - if (cfq_cfqq_wait_request(cfqq)) {
> > - cfq_mark_cfqq_must_dispatch(cfqq);
> > - del_timer(&cfqd->idle_slice_timer);
> > - blk_start_queueing(cfqd->queue);
> > - }
> > - } else if (cfq_should_preempt(cfqd, cfqq, rq)) {
> > + if ((cfqq != cfqd->active_queue) &&
> > + cfq_should_preempt(cfqd, cfqq, rq)) {
> > /*
> > * not the active queue - expire current slice if it is
> > * idle and has expired it's mean thinktime or this new queue
>
> I find this does address the issue (both with 64KB-stride dd and
> hdparm -t; presumably the requests are now getting merged). Tested on
> 2.6.26-rc2 on Ubuntu HH 8.04 x86-64, with slice_idle defaulting to 8
> and AHCI on ICH9; disk is ST3320613AS.
>
> Blktrace profiles from 'dd if=/dev/sda of=/dev/null bs=64k count=1000' are at:
>
> http://quora.org/blktrace-profiles.tar.bz2

Goodie! I think the patch below is better: we do want to schedule the
queue immediately, but we do not want to interrupt the queuer. So just
kick the workqueue handling of the queue instead of entering the
dispatcher directly. Can you test this one as well? Thanks!
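
For reference, kblockd_schedule_work() just punts to the per-queue work
item that cfq sets up at init time; its handler then reenters the
dispatcher from process context rather than from the I/O submission
path. From memory, the existing handler in cfq-iosched.c looks roughly
like this (context only, not part of the patch):

/*
 * Unplug work handler (sketch from memory): restart the queue from
 * process context, under the queue lock, instead of calling
 * blk_start_queueing() directly from the queuer.
 */
static void cfq_kick_queue(struct work_struct *work)
{
	struct cfq_data *cfqd =
		container_of(work, struct cfq_data, unplug_work);
	struct request_queue *q = cfqd->queue;
	unsigned long flags;

	spin_lock_irqsave(q->queue_lock, flags);
	blk_start_queueing(q);
	spin_unlock_irqrestore(q->queue_lock, flags);
}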

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index f4e1006..e8c1941 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1107,7 +1107,6 @@ static int cfq_dispatch_requests(struct request_queue *q, int force)

cfq_clear_cfqq_must_dispatch(cfqq);
cfq_clear_cfqq_wait_request(cfqq);
- del_timer(&cfqd->idle_slice_timer);

dispatched += __cfq_dispatch_requests(cfqd, cfqq, max_dispatch);
}
@@ -1769,15 +1768,9 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
cic->last_request_pos = rq->sector + rq->nr_sectors;

if (cfqq == cfqd->active_queue) {
- /*
- * if we are waiting for a request for this queue, let it rip
- * immediately and flag that we must not expire this queue
- * just now
- */
if (cfq_cfqq_wait_request(cfqq)) {
- cfq_mark_cfqq_must_dispatch(cfqq);
del_timer(&cfqd->idle_slice_timer);
- blk_start_queueing(cfqd->queue);
+ kblockd_schedule_work(&cfqd->unplug_work);
}
} else if (cfq_should_preempt(cfqd, cfqq, rq)) {
/*
@@ -1787,7 +1780,7 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
*/
cfq_preempt_queue(cfqd, cfqq);
cfq_mark_cfqq_must_dispatch(cfqq);
- blk_start_queueing(cfqd->queue);
+ kblockd_schedule_work(&cfqd->unplug_work);
}
}
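
By the way, the average dispatch sizes mentioned above fall straight out
of the blktrace profiles: just average the "+ nr_sectors" field of the D
(dispatch) events in a blkparse dump. A standalone sketch of such a
filter (illustrative only, assuming blkparse's default one-line-per-event
output; not a tool from this thread):

#include <stdio.h>
#include <string.h>

/*
 * Average the size of dispatch (D) events from a blkparse dump read on
 * stdin, e.g. lines like "8,0 3 1 0.000000000 697 D R 223490 + 8 [dd]".
 * Sizes are in 512-byte sectors; lines that don't match are skipped.
 */
int main(void)
{
	char line[256], act[8], rwbs[8];
	unsigned int maj, min, cpu;
	unsigned long seq, pid, nsect, nreq = 0;
	unsigned long long sector, sectors = 0;
	double t;

	while (fgets(line, sizeof(line), stdin)) {
		if (sscanf(line, "%u,%u %u %lu %lf %lu %7s %7s %llu + %lu",
			   &maj, &min, &cpu, &seq, &t, &pid, act, rwbs,
			   &sector, &nsect) != 10)
			continue;
		if (!strcmp(act, "D")) {
			sectors += nsect;
			nreq++;
		}
	}
	if (nreq)
		printf("%lu dispatches, avg %.1fKB\n",
		       nreq, sectors * 512.0 / 1024 / nreq);
	return 0;
}

Feeding it "blkparse -i <trace>" output for each scheduler should
reproduce the per-scheduler averages.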


--
Jens Axboe
