Re: tiobench read 50% regression with 2.6.30-rc1

From: Jeff Moyer
Date: Wed Apr 15 2009 - 00:07:33 EST


Jens Axboe <jens.axboe@xxxxxxxxxx> writes:

> On Fri, Apr 10 2009, Zhang, Yanmin wrote:
>> On Thu, 2009-04-09 at 11:57 +0200, Jens Axboe wrote:
>> > On Thu, Apr 09 2009, Zhang, Yanmin wrote:
>> > > Compared with 2.6.29's results, tiobench (read) shows about a 50% regression
>> > > with 2.6.30-rc1 on all my machines. I bisected it down to the patch below.
>> > >
>> > > b029195dda0129b427c6e579a3bb3ae752da3a93 is first bad commit
>> > > commit b029195dda0129b427c6e579a3bb3ae752da3a93
>> > > Author: Jens Axboe <jens.axboe@xxxxxxxxxx>
>> > > Date: Tue Apr 7 11:38:31 2009 +0200
>> > >
>> > > cfq-iosched: don't let idling interfere with plugging
>> > >
>> > > When CFQ is waiting for a new request from a process, currently it'll
>> > > immediately restart queuing when it sees such a request. This doesn't
>> > > work very well with streamed IO, since we then end up splitting IO
>> > > that would otherwise have been merged nicely. For a simple dd test,
>> > > this causes 10x as many requests to be issued as we should have.
>> > > Normally this goes unnoticed due to the low overhead of requests
>> > > at the device side, but some hardware is very sensitive to request
>> > > sizes and there it can cause big slow downs.
>> > >
>> > >
>> > >
>> > > Command to start the testing:
>> > > #tiotest -k0 -k1 -k3 -f 80 -t 32
>> > >
>> > > It's a multi-threaded program that starts 32 threads. Every thread does I/O
>> > > on its own 80MB file.
>> The files should be created before the test, and please drop the page cache
>> with "echo 3 >/proc/sys/vm/drop_caches" before testing.
>>
>> >
>> > It's not a huge surprise that we regressed there. I'll get this fixed up
>> > next week. Can I talk you into trying to change the 'quantum' sysfs
>> > variable for the drive? It's in /sys/block/xxx/queue/iosched, where xxx
>> > is your drive(s). It's set to 4; if you could try progressively larger
>> > settings and retest, that would help get things started.
>> I tried 4, 8, 16, 64, and 128 and didn't see any difference in the results.
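
(For reference, a sketch of how those quantum values would be set, assuming
the disk under test shows up as sda; the CFQ tunables live under
queue/iosched for each block device:)

    echo 8 > /sys/block/sda/queue/iosched/quantum
    cat /sys/block/sda/queue/iosched/quantum    # verify the new setting
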
>
> Can you try with this patch?
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index a4809de..66f00e5 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -1905,10 +1905,17 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
> * Remember that we saw a request from this process, but
> * don't start queuing just yet. Otherwise we risk seeing lots
> * of tiny requests, because we disrupt the normal plugging
> - * and merging.
> + * and merging. If the request is already larger than a single
> + * page, let it rip immediately. For that case we assume that
> + * merging is already done.
> */
> - if (cfq_cfqq_wait_request(cfqq))
> + if (cfq_cfqq_wait_request(cfqq)) {
> + if (blk_rq_bytes(rq) > PAGE_CACHE_SIZE) {
> + del_timer(&cfqd->idle_slice_timer);
> + blk_start_queueing(cfqd->queue);
> + }
> cfq_mark_cfqq_must_dispatch(cfqq);
> + }
> } else if (cfq_should_preempt(cfqd, cfqq, rq)) {
> /*
> * not the active queue - expire current slice if it is

I tested this using iozone to read a file from an NFS client. The
iozone command line was:
iozone -s 2000000 -r 64 -f /mnt/test/testfile -i 1 -w
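
Each data point was collected roughly as follows. This is a sketch, not a
record of the exact run: the device name sda, the mount point, and the
cache-dropping step are my assumptions.

    # on the NFS server: pick the scheduler and the nfsd thread count
    echo cfq > /sys/block/sda/queue/scheduler    # or 'deadline'
    rpc.nfsd 4                                   # 1, 2, 4 or 8 threads
    echo 3 > /proc/sys/vm/drop_caches            # start with a cold cache

    # on the client: sequential read/re-read (-i 1) of the ~2GB file with a
    # 64KB record size (-r 64), keeping the test file between runs (-w)
    iozone -s 2000000 -r 64 -f /mnt/test/testfile -i 1 -w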

The numbers in the nfsd's row represent the number of nfsd threads. I
included numbers for the deadline scheduler as well for comparison.

v2.6.29

nfsd's   |     1 |     2 |     4 |      8
---------+-------+-------+-------+-------
cfq      | 91356 | 66391 | 61942 |  51674
deadline | 43207 | 67436 | 96289 | 107784

2.6.30-rc1

nfsd's   |     1 |     2 |     4 |     8
---------+-------+-------+-------+------
cfq      | 43127 | 22354 | 20858 | 21179
deadline | 43732 | 68059 | 76659 | 83231

2.6.30-rc1 + cfq fix

nfsd's   |      1 |      2 |     4 |     8
---------+--------+--------+-------+------
cfq      | 114602 | 102280 | 43479 | 43160

As you can see, for 1 and 2 threads, the patch *really* helps out. We
still don't get back the performance for 4 and 8 nfsd threads, though.
It's interesting to note that the deadline scheduler regresses for 4 and
8 threads, as well. I think we've still got some digging to do.

I'll try the cfq close cooperator patches next.

Cheers,
Jeff