Re: [PATCH v3 1/5] blk-mq-sched: introduce high level elevator lock

From: Yu Kuai
Date: Mon Aug 11 2025 - 02:17:58 EST


Hi,

在 2025/08/11 12:34, Damien Le Moal 写道:
On 8/11/25 13:25, Yu Kuai wrote:
Hi,

在 2025/08/11 11:53, Damien Le Moal 写道:
On 8/11/25 10:01, Yu Kuai wrote:
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 55a0fd105147..1a2da5edbe13 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -113,7 +113,14 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
if (budget_token < 0)
break;
- rq = e->type->ops.dispatch_request(hctx);
+ if (blk_queue_sq_sched(q)) {
+ elevator_lock(e);
+ rq = e->type->ops.dispatch_request(hctx);
+ elevator_unlock(e);

I do not think this is safe for bfq since bfq uses the irqsave/irqrestore spin
lock variant. If it is safe, this needs a big comment block explaining why
and/or the rules regarding the scheduler use of this lock.

It's correct, however, this patch doesn't change bfq yet, and it's like:

elevator_lock
spin_lock_irq(&bfqd->lock)
spin_unlock_irq(&bfqd->lock)
elevator_unlock

Patch 3 remove bfqd->lock and convert this to:

elevator_lock_irq
elevator_unlock_irq.

I do not understand. Since q->elevator->lock is already taken here, without IRQ
disabled, how can bfq_dispatch_request method again take this same lock with IRQ
disabled ? That cannot possibly work.

Looks like there is still misunderstanding somehow :( After patch 3,
bfq_dispatch_work doesn't grab any lock, elevator lock is held before
calling into dispatch method.

Before:

elevator_lock
bfq_dispatch_request
spin_lock_irq(&bfqd->lock)
spin_unlock_irq(&bfqd->lock)
elevator_unlock

After:
elevator_lock_irq
bfq_dispatch_request
elevator_unlock_irq

Ah, yes, I see it now.

But that is a nasty change that affects *all* schedulers, even those that do not
need to disable IRQs because they are not using the lock in their completion
path, e.g. mq-deadline. So I do not think that is acceptable.


OK, I did some benchmark and didn't found any performance regression, so
I decided use the irq version for deadline. Perhaps I didn't found such
test.

I can add a flag in elevator_queue->flags in addition like following to
fix this:

diff --git a/block/elevator.h b/block/elevator.h
index 81f7700b0339..f04b9b2ae344 100644
--- a/block/elevator.h
+++ b/block/elevator.h
@@ -123,6 +123,7 @@ struct elevator_queue {
#define ELEVATOR_FLAG_REGISTERED 0
#define ELEVATOR_FLAG_DYING 1
#define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT 2
+#define ELEVATOR_FLAG_LOCK_AT_COMP 3

#define elevator_lock(e) spin_lock(&(e)->lock)
#define elevator_unlock(e) spin_unlock(&(e)->lock)
@@ -134,6 +135,22 @@ struct elevator_queue {
spin_unlock_irqrestore(&(e)->lock, flags)
#define elevator_lock_assert_held(e) lockdep_assert_held(&(e)->lock)

+static inline void elevator_dispatch_lock(struct elevator_queue *e)
+{
+ if (test_bit(ELEVATOR_FLAG_LOCK_AT_COMP, &e->flags))
+ elevator_lock_irq(e);
+ else
+ elevator_lock(e);
+}
+
+static inline void elevator_dispatch_unlock(struct elevator_queue *e)
+{
+ if (test_bit(ELEVATOR_FLAG_LOCK_AT_COMP, &e->flags))
+ elevator_unlock_irq(e);
+ else
+ elevator_unlock(e);
+}
+

Thanks,
Kuai