[PATCH BUGFIX/IMPROVEMENT 6/6] block, bfq: do not expire a queue when it is the only busy one

From: Paolo Valente
Date: Fri Jan 22 2021 - 13:32:03 EST


This commits preserves I/O-dispatch plugging for a special symmetric
case that may suddenly turn into asymmetric: the case where only one
bfq_queue, say bfqq, is busy. In this case, not expiring bfqq does not
cause any harm to any other queues in terms of service guarantees. In
contrast, it avoids the following unlucky sequence of events: (1) bfqq
is expired, (2) a new queue with a lower weight than bfqq becomes busy
(or more queues), (3) the new queue is served until a new request
arrives for bfqq, (4) when bfqq is finally served, there are so many
requests of the new queue in the drive that the pending requests for
bfqq take a lot of time to be served. In particular, event (2) may
case even already dispatched requests of bfqq to be delayed, inside
the drive. So, to avoid this series of events, the scenario is
preventively declared as asymmetric also if bfqq is the only busy
queues. By doing so, I/O-dispatch plugging is performed for bfqq.

Tested-by: Jan Kara <jack@xxxxxxx>
Signed-off-by: Paolo Valente <paolo.valente@xxxxxxxxxx>
---
block/bfq-iosched.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 003c96fa01ad..c045613ce927 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3464,20 +3464,38 @@ static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
* order until all the requests already queued in the device have been
* served. The last sub-condition commented above somewhat mitigates
* this problem for weight-raised queues.
+ *
+ * However, as an additional mitigation for this problem, we preserve
+ * plugging for a special symmetric case that may suddenly turn into
+ * asymmetric: the case where only bfqq is busy. In this case, not
+ * expiring bfqq does not cause any harm to any other queues in terms
+ * of service guarantees. In contrast, it avoids the following unlucky
+ * sequence of events: (1) bfqq is expired, (2) a new queue with a
+ * lower weight than bfqq becomes busy (or more queues), (3) the new
+ * queue is served until a new request arrives for bfqq, (4) when bfqq
+ * is finally served, there are so many requests of the new queue in
+ * the drive that the pending requests for bfqq take a lot of time to
+ * be served. In particular, event (2) may case even already
+ * dispatched requests of bfqq to be delayed, inside the drive. So, to
+ * avoid this series of events, the scenario is preventively declared
+ * as asymmetric also if bfqq is the only busy queues
*/
static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
struct bfq_queue *bfqq)
{
+ int tot_busy_queues = bfq_tot_busy_queues(bfqd);
+
/* No point in idling for bfqq if it won't get requests any longer */
if (unlikely(!bfqq_process_refs(bfqq)))
return false;

return (bfqq->wr_coeff > 1 &&
(bfqd->wr_busy_queues <
- bfq_tot_busy_queues(bfqd) ||
+ tot_busy_queues ||
bfqd->rq_in_driver >=
bfqq->dispatched + 4)) ||
- bfq_asymmetric_scenario(bfqd, bfqq);
+ bfq_asymmetric_scenario(bfqd, bfqq) ||
+ tot_busy_queues == 1;
}

static bool __bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq,
--
2.20.1