[RFC PATCH v3 2/3] blk-mq: Freeze and quiesce all queues for tagset in elevator_exit()

From: John Garry
Date: Fri Mar 05 2021 - 10:20:03 EST


A use-after-free may occur if blk_mq_queue_tag_busy_iter() is run on a
queue when another queue associated with the same tagset is switching IO
scheduler:

BUG: KASAN: use-after-free in bt_iter+0xa0/0x120
Read of size 8 at addr ffff0410285e7e00 by task fio/2302

CPU: 24 PID: 2302 Comm: fio Not tainted 5.12.0-rc1-11925-g29a317e228d9 #747
Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
Call trace:
dump_backtrace+0x0/0x2d8
show_stack+0x18/0x68
dump_stack+0x124/0x1a0
print_address_description.constprop.13+0x68/0x30c
kasan_report+0x1e8/0x258
__asan_load8+0x9c/0xd8
bt_iter+0xa0/0x120
blk_mq_queue_tag_busy_iter+0x348/0x5d8
blk_mq_in_flight+0x80/0xb8
part_stat_show+0xcc/0x210
dev_attr_show+0x44/0x90
sysfs_kf_seq_show+0x120/0x1c0
kernfs_seq_show+0x9c/0xb8
seq_read_iter+0x214/0x668
kernfs_fop_read_iter+0x204/0x2c0
new_sync_read+0x1ec/0x2d0
vfs_read+0x18c/0x248
ksys_read+0xc8/0x178
__arm64_sys_read+0x44/0x58
el0_svc_common.constprop.1+0xc8/0x1a8
do_el0_svc+0x90/0xa0
el0_svc+0x24/0x38
el0_sync_handler+0x90/0xb8
el0_sync+0x154/0x180

blk_mq_queue_tag_busy_iter() already takes a reference on its queue's
usage counter when called, and that queue cannot be frozen to switch IO
scheduler until all references are dropped. This ensures that
blk_mq_queue_tag_busy_iter() sees no stale references to IO scheduler
requests of its own queue.

However, nothing stops blk_mq_queue_tag_busy_iter() from being run for
another queue associated with the same tagset, where it may see stale IO
scheduler requests belonging to the switching queue after they have been
freed.

To stop this from happening, freeze and quiesce all other queues associated
with the tagset while the elevator is exited.

Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
---

I think that this patch is what Bart suggested:
https://lore.kernel.org/linux-block/c0d127a9-9320-6e1c-4e8d-412aa9ea9ca6@xxxxxxx/
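
For illustration only, the ordering described in the commit message can be
modelled in userspace as below. This is a single-threaded sketch, not kernel
code: every toy_* name is hypothetical, the "frozen" flag stands in for both
freeze and quiesce, and the switching queue itself is assumed to have been
frozen by the caller already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 4

/* Hypothetical stand-in for a request queue. */
struct toy_queue {
	bool frozen;		/* true while frozen and quiesced */
	bool sched_rqs_valid;	/* false once scheduler requests are freed */
};

/* Hypothetical stand-in for a tagset shared by several queues. */
struct toy_tagset {
	struct toy_queue *queues[TOY_MAX_QUEUES];
	size_t nr_queues;
};

/*
 * Models the tag iterator: it only walks the shared tags if it can take
 * a usage reference, which fails while the queue is frozen.
 */
static bool toy_busy_iter(const struct toy_queue *q)
{
	return !q->frozen;
}

/* Freeze or unfreeze every queue in the set except @q. */
static void toy_set_siblings_frozen(struct toy_tagset *set,
				    const struct toy_queue *q, bool frozen)
{
	for (size_t i = 0; i < set->nr_queues; i++)
		if (set->queues[i] != q)
			set->queues[i]->frozen = frozen;
}

/*
 * Freeze siblings, free the scheduler requests of @q, unfreeze siblings.
 * Returns how many siblings could still have iterated at the moment the
 * requests were freed; with this ordering that is always 0.
 */
static size_t toy_elevator_exit(struct toy_tagset *set, struct toy_queue *q)
{
	size_t racers = 0;

	assert(set->nr_queues <= TOY_MAX_QUEUES);

	toy_set_siblings_frozen(set, q, true);

	for (size_t i = 0; i < set->nr_queues; i++)
		if (set->queues[i] != q && toy_busy_iter(set->queues[i]))
			racers++;

	q->sched_rqs_valid = false;	/* scheduler requests freed here */

	toy_set_siblings_frozen(set, q, false);
	return racers;
}
```

With two queues in one toy_tagset, toy_elevator_exit() on the first returns 0
and leaves the second unfrozen again afterwards. The model says nothing about
real concurrency or about entries left behind in the shared tag table; it
only shows the freeze, free, unfreeze ordering.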

block/blk.h | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/block/blk.h b/block/blk.h
index 3b53e44b967e..1a948bfd91e4 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -201,10 +201,29 @@ void elv_unregister_queue(struct request_queue *q);
 static inline void elevator_exit(struct request_queue *q,
 		struct elevator_queue *e)
 {
+	struct blk_mq_tag_set *set = q->tag_set;
+	struct request_queue *tmp;
+
 	lockdep_assert_held(&q->sysfs_lock);
 
+	mutex_lock(&set->tag_list_lock);
+	list_for_each_entry(tmp, &set->tag_list, tag_set_list) {
+		if (tmp == q)
+			continue;
+		blk_mq_freeze_queue(tmp);
+		blk_mq_quiesce_queue(tmp);
+	}
+
 	blk_mq_sched_free_requests(q);
 	__elevator_exit(q, e);
+
+	list_for_each_entry(tmp, &set->tag_list, tag_set_list) {
+		if (tmp == q)
+			continue;
+		blk_mq_unquiesce_queue(tmp);
+		blk_mq_unfreeze_queue(tmp);
+	}
+	mutex_unlock(&set->tag_list_lock);
 }
 
 ssize_t part_size_show(struct device *dev, struct device_attribute *attr,
--
2.26.2