Re: [PATCH 3/3] blk-mq: Use llist_head for blk_cpu_done

From: Sagi Grimberg
Date: Wed Nov 04 2020 - 14:15:37 EST



There really aren't any rules for this, and it's perfectly legit to
complete from process context. Maybe you're a kthread driven driver and
that's how you handle completions. The block completion path has always
been hard IRQ safe, but possible to call from anywhere.

I'm not trying to put restrictions and forbidding completions from a
kthread. I'm trying to avoid the pointless softirq dance for no added
value. We could:

to not break that assumption you just mentioned and provide
|static inline void blk_mq_complete_request_local(struct request *rq)
|{
| rq->q->mq_ops->complete(rq);
|}

so that completion issued from from process context (like those from
usb-storage) don't end up waking `ksoftird' (running at SCHED_OTHER)
completing the requests but rather performing it right away. The softirq
dance makes no sense here.

Agreed. But I don't think your above blk_mq_complete_request_local
is all that useful either as ->complete is defined by the caller,
so we could just do a direct call. Basically we should just
return false from blk_mq_complete_request_remote after updating
the state when called from process context.

Agreed.

But given that IIRC
we are not supposed to check what state we are called from
we'll need a helper just for updating the state instead and
ensure the driver uses the right helper. Now of course we might
have process context callers that still want to bounce to the
submitting CPU, but in that case we should go directly to a
workqueue or similar.

This would mean that it may be suboptimal for nvme-tcp to complete
requests in softirq context from the network context (determined by
NIC steering). Because in this case, this would trigger workqueue
schedule on a per-request basis rather than once per .data_ready
call like we do today. Is that correct?

It has been observed that completing commands in softirq context
(network determined cpu) because basically the completion does
IPI + local complete, not IPI + softirq or IPI + workqueue.

Either way doing this properly will probabl involve an audit of all
drivers, but I think that is worth it.

Agree.