Re: Slow file transfer speeds with CFQ IO scheduler in some cases

From: Jens Axboe
Date: Tue Nov 11 2008 - 04:36:18 EST


On Mon, Nov 10 2008, Jeff Moyer wrote:
> "Vitaly V. Bursov" <vitalyb@xxxxxxxxxxxxx> writes:
>
> > Jens Axboe wrote:
> >> On Mon, Nov 10 2008, Vitaly V. Bursov wrote:
> >>> Jens Axboe wrote:
> >>>> On Mon, Nov 10 2008, Jeff Moyer wrote:
> >>>>> Jens Axboe <jens.axboe@xxxxxxxxxx> writes:
> >>>>>
> >>>>>> http://bugzilla.kernel.org/attachment.cgi?id=18473&action=view
> >>>>> Funny, I was going to ask the same question. ;) The reason Jens wants
> >>>>> you to try this patch is that nfsd may be farming off the I/O requests
> >>>>> to different threads which are then performing interleaved I/O. The
> >>>>> above patch tries to detect this and allow cooperating processes to get
> >>>>> disk time instead of waiting for the idle timeout.
> >>>> Precisely :-)
> >>>>
> >>>> The only reason I haven't merged it yet is because of worry of extra
> >>>> cost, but I'll throw some SSD love at it and see how it turns out.
> >>>>
> >>> Sorry, but I get "oops" same moment nfs read transfer starts.
> >>> I can get directory list via nfs, read files locally (not
> >>> carefully tested, though)
> >>>
> >>> Dumps captured via netconsole, so these may not be completely accurate
> >>> but hopefully will give a hint.
> >>
> >> Interesting, strange how that hasn't triggered here. Or perhaps the
> >> version that Jeff posted isn't the one I tried. Anyway, search for:
> >>
> >> RB_CLEAR_NODE(&cfqq->rb_node);
> >>
> >> and add a
> >>
> >> RB_CLEAR_NODE(&cfqq->prio_node);
> >>
> >> just below that. It's in cfq_find_alloc_queue(). I think that should fix
> >> it.
> >>
> >
> > Same problem.
> >
> > I did make clean; make -j3; sync; on (2 times) patched kernel and it went OK
> > but It won't boot anymore with cfq with same error...
> >
> > Switching cfq io scheduler at runtime (booting with "as") appears to work with
> > two parallel local dd reads.
>
> Strange, I can't reproduce a failure. I'll keep trying. For now, these
> are the results I see:
>
> [root@maiden ~]# mount megadeth:/export/cciss /mnt/megadeth/
> [root@maiden ~]# dd if=/mnt/megadeth/file1 of=/dev/null bs=1M
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 26.8128 s, 40.0 MB/s
> [root@maiden ~]# umount /mnt/megadeth/
> [root@maiden ~]# mount megadeth:/export/cciss /mnt/megadeth/
> [root@maiden ~]# dd if=/mnt/megadeth/file1 of=/dev/null bs=1M
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 23.7025 s, 45.3 MB/s
> [root@maiden ~]# umount /mnt/megadeth/
>
> Here is the patch, with the suggestion from Jens to switch the cfqq to
> the right priority tree when the priority is changed.

I don't see the issue here either. Vitaly, are you using any openvz
kernel patches? IIRC, they patch cfq so it could just be that your cfq
version is incompatible with Jeff's patch.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/