Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler

From: Jens Axboe
Date: Fri Oct 28 2016 - 10:07:44 EST


On 10/27/2016 04:27 PM, Linus Walleij wrote:
> On Thu, Oct 27, 2016 at 11:08 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
>> blk-mq has evolved to support a variety of devices, there's nothing
>> special about mmc that can't work well within that framework.

> There is. Read mmc_queue_thread() in drivers/mmc/card/queue.c
>
> This repeatedly calls req = blk_fetch_request(q), starting one request
> and then getting the next one off the queue, including reading
> a few NULL requests off the end of the queue (to satisfy the
> semantics of its state machine).
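For reference, the loop in question looks roughly like this; a
simplified sketch, with the driver's mqrq bookkeeping reduced to a
hypothetical prev_req_pending flag:

    static int mmc_queue_thread(void *d)
    {
        struct mmc_queue *mq = d;
        struct request_queue *q = mq->queue;

        do {
            struct request *req;

            spin_lock_irq(q->queue_lock);
            set_current_state(TASK_INTERRUPTIBLE);
            /* returns NULL once the queue runs empty */
            req = blk_fetch_request(q);
            spin_unlock_irq(q->queue_lock);

            if (req || mq->prev_req_pending) {
                /*
                 * Issue even a NULL request, so the state machine
                 * can post-process the previous one.
                 */
                set_current_state(TASK_RUNNING);
                mq->issue_fn(mq, req);
            } else {
                if (kthread_should_stop()) {
                    set_current_state(TASK_RUNNING);
                    break;
                }
                schedule();    /* sleep until new I/O is queued */
            }
        } while (1);

        return 0;
    }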

> It then preprocesses each request by essentially calling .pre() and
> .post() hooks all the way down to the driver, flushing its mapped
> sglist from CPU to DMA device memory (not a problem on x86 and
> other DMA-coherent archs, but a big win on the incoherent ones).
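MMC hosts already have .pre_req()/.post_req() hooks in mmc_host_ops for
exactly this; the win comes from overlapping the cache maintenance of
dma_map_sg()/dma_unmap_sg() for request N+1 with request N still
executing on the card. In outline (illustrative helpers, signatures
simplified from the real host ops):

    /*
     * Run for the *next* request while the current one is in flight.
     * On a non-coherent arch, dma_map_sg() is where the CPU->device
     * cache flush happens.
     */
    static void example_pre_req(struct mmc_host *host, struct mmc_data *data)
    {
        enum dma_data_direction dir = (data->flags & MMC_DATA_WRITE) ?
                                      DMA_TO_DEVICE : DMA_FROM_DEVICE;
        int mapped;

        mapped = dma_map_sg(mmc_dev(host), data->sg, data->sg_len, dir);
        /* the driver stashes 'mapped' for its DMA descriptor setup */
    }

    /*
     * Run after completion: device->CPU invalidate for reads; close
     * to free on DMA-coherent archs like x86.
     */
    static void example_post_req(struct mmc_host *host, struct mmc_data *data)
    {
        enum dma_data_direction dir = (data->flags & MMC_DATA_WRITE) ?
                                      DMA_TO_DEVICE : DMA_FROM_DEVICE;

        dma_unmap_sg(mmc_dev(host), data->sg, data->sg_len, dir);
    }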

> In the attempt that was posted recently this is achieved by lying and
> saying the HW queue is two items deep, then eating requests off that
> queue and calling pre/post on them.
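In blk-mq terms that amounts to registering a tag set shaped like the
following (a sketch of the technique, not the posted patch itself;
mmc_mq_ops is a stand-in for the driver's queue_rq() implementation):

    /*
     * Advertise a 2-deep "hardware" queue so blk-mq hands the driver
     * a second request to run pre hooks on while the first is in
     * flight. The depth is a software trick, not the card's.
     */
    static struct blk_mq_tag_set mmc_mq_tag_set = {
        .ops            = &mmc_mq_ops,
        .nr_hw_queues   = 1,
        .queue_depth    = 2,
        .numa_node      = NUMA_NO_NODE,
        .flags          = BLK_MQ_F_SHOULD_MERGE,
    };

    /* ... blk_mq_alloc_tag_set(&mmc_mq_tag_set) followed by
     * blk_mq_init_queue(&mmc_mq_tag_set) then sets up the queue. */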

> But as there actually exist MMC cards with command queueing, this
> would become hopeless to handle; the hw queue depth has to reflect
> the real depth. What we need is for the block core to call pre/post
> hooks on each request.

The "only" thing that doesn't work well after that is that CFQ is no
longer in action, which will have interesting effects on MMC throughput
in any fio-like stress test as it is mostly single-hw-queue.

That will cause you pain with any IO scheduler that has more complex
state, like CFQ and BFQ... I looked at the code but I don't quite get
why it is handling requests like that. Care to expand? Is it a
performance optimization? It looks fairly convoluted for some reason. I
would imagine that latency would be one of the more important aspects
for mmc, yet the driver has a context switch for each sync IO.

--
Jens Axboe