Re: [PATCH] block: fix io hung by block throttle

From: Junxiao Bi
Date: Mon Apr 19 2021 - 02:11:02 EST



On 4/18/21 5:33 AM, Hillf Danton wrote:
On Sat, 17 Apr 2021 14:37:57 Junxiao Bi wrote:
On 4/17/21 3:10 AM, Hillf Danton wrote:
+ if (acquire_inflight_cb(rqw, private_data))
This function is to increase atomic variable rq_wait->inflight.
You are right.

What's the mutex for?
It cuts the race between we peek at the sleepers on rqw->wait while they are
coming and going, and we cant update rqw->inflight without making sure there
are no sleepers.

Why? I think checking the sleeper in original code is for a fast path.

For wbt, acquire_inflight_cb is wbt_inflight_cb where atomic_inc_below is used to update rqw->inflight. I don't see why a mutex is needed for this atomic operation.


With the mutex in place, in addition to the certainty of !sleepers, we can
avoid the race between us and waker in terms of updating inflight by removing
the invokation of acquire_inflight_cb in the wakeup callback, and the bonus is
we no longer need the wakeup cb and the rq_qos_wait_data because the more
traditional wait_event() can do the job.

Finally we can dump the cleanup_cb_t.

+++ b/block/blk-rq-qos.c
@@ -200,96 +200,24 @@ bool rq_depth_scale_down(struct rq_depth
return true;
}
-struct rq_qos_wait_data {
- struct wait_queue_entry wq;
- struct task_struct *task;
- struct rq_wait *rqw;
- acquire_inflight_cb_t *cb;
- void *private_data;
- bool got_token;
-};
-
-static int rq_qos_wake_function(struct wait_queue_entry *curr,
- unsigned int mode, int wake_flags, void *key)
-{
- struct rq_qos_wait_data *data = container_of(curr,
- struct rq_qos_wait_data,
- wq);
-
- /*
- * If we fail to get a budget, return -1 to interrupt the wake up loop
- * in __wake_up_common.
- */
- if (!data->cb(data->rqw, data->private_data))
- return -1;
-
- data->got_token = true;
- smp_wmb();
- list_del_init(&curr->entry);
- wake_up_process(data->task);
- return 1;
-}
-
/**
* rq_qos_wait - throttle on a rqw if we need to
* @rqw: rqw to throttle on
* @private_data: caller provided specific data
* @acquire_inflight_cb: inc the rqw->inflight counter if we can
- * @cleanup_cb: the callback to cleanup in case we race with a waker
*
* This provides a uniform place for the rq_qos users to do their throttling.
* Since you can end up with a lot of things sleeping at once, this manages the
* waking up based on the resources available. The acquire_inflight_cb should
* inc the rqw->inflight if we have the ability to do so, or return false if not
* and then we will sleep until the room becomes available.
- *
- * cleanup_cb is in case that we race with a waker and need to cleanup the
- * inflight count accordingly.
*/
void rq_qos_wait(struct rq_wait *rqw, void *private_data,
- acquire_inflight_cb_t *acquire_inflight_cb,
- cleanup_cb_t *cleanup_cb)
+ acquire_inflight_cb_t *acquire_inflight_cb)
{
- struct rq_qos_wait_data data = {
- .wq = {
- .func = rq_qos_wake_function,
- .entry = LIST_HEAD_INIT(data.wq.entry),
- },
- .task = current,
- .rqw = rqw,
- .cb = acquire_inflight_cb,
- .private_data = private_data,
- };
- bool has_sleeper;
-
- has_sleeper = wq_has_sleeper(&rqw->wait);
- if (!has_sleeper && acquire_inflight_cb(rqw, private_data))
- return;
-
- prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
- has_sleeper = !wq_has_single_sleeper(&rqw->wait);
- do {
- /* The memory barrier in set_task_state saves us here. */
- if (data.got_token)
- break;
- if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
- finish_wait(&rqw->wait, &data.wq);
-
- /*
- * We raced with wbt_wake_function() getting a token,
- * which means we now have two. Put our local token
- * and wake anyone else potentially waiting for one.
- */
- smp_rmb();
- if (data.got_token)
- cleanup_cb(rqw, private_data);
- break;
- }
- io_schedule();
- has_sleeper = true;
- set_current_state(TASK_UNINTERRUPTIBLE);
- } while (1);
- finish_wait(&rqw->wait, &data.wq);
+ mutex_lock(&rqw->throttle_mutex);
+ wait_event(rqw->wait, acquire_inflight_cb(rqw, private_data));
+ mutex_unlock(&rqw->throttle_mutex);

This will break the throttle? There is a inflight io limitation. With this change, there can be only one io inflight whatever the limit is.

Thanks,

Junxiao.

}
void rq_qos_exit(struct request_queue *q)