[PATCHSET] blk-mq: reimplement timeout handling

From: Tejun Heo
Date: Sat Dec 09 2017 - 14:25:42 EST


Currently, the blk-mq timeout path synchronizes against the usual
issue/completion path using a complex scheme involving atomic
bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
rules. Unfortunately, it contains quite a few holes.

It's pretty easy to make blk_mq_check_expired() terminate a later
instance of a request. If we induce a 5 sec delay before the
time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
2s, and issue back-to-back large IOs, blk-mq starts timing out
requests spuriously pretty quickly. Nothing actually timed out. The
timeout path just made the call on a recycled instance of a request
and then terminated a later instance long after the original instance
had finished. The scenario isn't theoretical either.

This patchset replaces the broken synchronization mechanism with an
RCU and generation number based one. Please read the patch description
of the second patch for more details.
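
To illustrate the idea, here is a minimal single-threaded sketch of
why a generation number closes the hole described above. It is not
the code from the patchset: struct sketch_rq and all the sketch_*
functions are hypothetical names made up for this example, and the
real locking/RCU side is left out entirely. It only shows that a
timeout decision made against one instance of a request must not be
applied to a later instance that recycled the same slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a request slot that gets recycled. */
struct sketch_rq {
	uint64_t gen;		/* bumped each time the slot is (re)issued */
	bool in_flight;		/* an instance currently occupies the slot */
	bool terminated;	/* timeout path killed the current instance */
};

static void sketch_issue(struct sketch_rq *rq)
{
	rq->gen++;
	rq->in_flight = true;
	rq->terminated = false;
}

static void sketch_complete(struct sketch_rq *rq)
{
	assert(rq->in_flight);
	rq->in_flight = false;
}

/* Unchecked: acts on whatever instance occupies the slot right now. */
static bool sketch_expire_unchecked(struct sketch_rq *rq)
{
	if (!rq->in_flight)
		return false;
	rq->terminated = true;
	rq->in_flight = false;
	return true;
}

/* Checked: acts only if the slot still holds the instance we judged. */
static bool sketch_expire_checked(struct sketch_rq *rq, uint64_t seen_gen)
{
	if (!rq->in_flight || rq->gen != seen_gen)
		return false;
	rq->terminated = true;
	rq->in_flight = false;
	return true;
}

/*
 * Replays the interleaving from the description above in a fixed order.
 * Returns true if the second (innocent) instance got terminated.
 */
static bool sketch_spurious_timeout(bool use_gen)
{
	struct sketch_rq rq = { 0 };
	uint64_t seen_gen;

	sketch_issue(&rq);		/* instance 1 */
	seen_gen = rq.gen;		/* timeout path decides "expired" */

	/* the induced delay: instance 1 finishes, the slot is recycled */
	sketch_complete(&rq);
	sketch_issue(&rq);		/* instance 2 */

	if (use_gen)
		sketch_expire_checked(&rq, seen_gen);
	else
		sketch_expire_unchecked(&rq);

	return rq.terminated;
}
```

sketch_spurious_timeout(false) reports that the later instance was
terminated, while sketch_spurious_timeout(true) leaves it alone
because the recorded generation no longer matches.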

Oleg, Peter, I'd really appreciate it if you guys could go over the
reported breakages and the new implementation.

This patchset contains the following six patches.

0001-blk-mq-protect-completion-path-with-RCU.patch
0002-blk-mq-replace-timeout-synchronization-with-a-RCU-an.patch
0003-blk-mq-use-blk_mq_rq_state-instead-of-testing-REQ_AT.patch
0004-blk-mq-make-blk_abort_request-trigger-timeout-path.patch
0005-blk-mq-remove-REQ_ATOM_COMPLETE-usages-from-blk-mq.patch
0006-blk-mq-remove-REQ_ATOM_STARTED.patch

and is available in the following git branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-mq-timeout

diffstat follows. Thanks.

 block/blk-core.c       |    2
 block/blk-mq-debugfs.c |    4
 block/blk-mq.c         |  246 +++++++++++++++++++++++++++----------------------
 block/blk-mq.h         |   48 +++++++++
 block/blk-timeout.c    |    9 -
 block/blk.h            |    7 -
 include/linux/blk-mq.h |    1
 include/linux/blkdev.h |   23 ++++
 8 files changed, 218 insertions(+), 122 deletions(-)

--
tejun