Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing

From: Nicholas A. Bellinger
Date: Thu Aug 15 2013 - 22:13:26 EST

On Thu, 2013-08-15 at 18:23 +0200, Alexander Gordeev wrote:
> On Fri, Aug 09, 2013 at 01:17:37PM -0700, Nicholas A. Bellinger wrote:
> > On Fri, 2013-08-09 at 21:15 +0200, Alexander Gordeev wrote:
> > Mmmm, I'm able to reproduce over here with ahci + scsi-mq, and it
> > appears to be a bug related with using sdev->sdev_md_req.queue_depth=1,
> > that ends up causing the blkdev_issue_flush() to wait forever because
> > blk_mq_wait_for_tags() never ends up getting the single tag back for the
> It turns out this way - blkdev_issue_flush() claims the only tag, submits
> the bio and waits for the completion. But because blk_mq_make_request()
> does not mark any context in blk_mq_hw_ctx::ctx_map (nor enslists the request
> into blk_mq_ctx::rq_list) it never gets processed from blk_mq_work_fn->
> __blk_mq_run_hw_queue() and blkdev_issue_flush() waits endlessly. All other
> requests are just waiting for the tag availability as result.

Ok, here's a bit better idea of what is going on now..

The problem is that blkdev_issue_flush() -> blk_mq_make_request() ->
__blk_mq_alloc_request() allocates the first tag, which calls
blk_insert_flush() -> blk_flush_complete_seq() -> blk_flush_kick() ->
mq_flush_work() -> blk_mq_alloc_request() to allocate a second tag for
the struct request that actually gets dispatched into scsi-mq as a

I'm not exactly sure why this double tag usage of struct request is
occurring, but AFAICT it does happen for every flush, and is not
specific to the blkdev_issue_flush() codepath.. I'm sure that Jens can
fill us in on that bit. ;)

So, assuming that this double tag usage is necessary and not a bug,
perhaps using a reserved tag for the first tag (eg: the one that's not
dispatched into scsi_mq_queue_rq) makes sense..?

I'm playing with a patch to do this, but am currently getting hung-up on
what appear to be some separate blk-mq reserved_tags > 0 bugs, the first
of which is passing queue_depth=1 + reserved_tags=1 is broken, and
results in tags->nr_free = 0.

Here's the quick fix:

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 6718007..ffdf686 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -470,7 +470,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
* Rest of the tags start at the queue list
tags->nr_free = 0;
- while (nr_tags - tags->nr_reserved) {
+ while (nr_tags) {
tags->freelist[tags->nr_free] = tags->nr_free +

Anyways, before digging further into reserved tags logic, Jens, what are
your thoughts for addressing this special queue_depth=1 case for libata
+ the like..?


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at