Re: [PATCH v1] io_uring: Add support for napi_busy_poll

From: Hao Xu
Date: Tue Mar 01 2022 - 03:23:10 EST



On 3/1/22 05:01, Olivier Langlois wrote:
On Tue, 2022-03-01 at 02:26 +0800, Hao Xu wrote:
On 2/25/22 13:32, Olivier Langlois wrote:
On Mon, 2022-02-21 at 13:23 +0800, Hao Xu wrote:
@@ -5776,6 +5887,7 @@ static int __io_arm_poll_handler(struct
io_kiocb *req,
                 __io_poll_execute(req, mask);
                 return 0;
         }
+       io_add_napi(req->file, req->ctx);
I think this may not be the right place to do it. The process will be:

arm_poll sockfdA --> get invalid napi_id from sk->napi_id --> event triggered --> arm_poll for sockfdA again --> get valid napi_id

So why not do io_add_napi() in the event handler (apoll_task_func/poll_task_func)?
You have a valid concern that the first time a socket is passed to
io_uring, its napi_id might not be assigned yet.

OTOH, getting it after data is available for reading does not help
either, since busy polling must be done before data is received.

For both places, the extracted napi_id will only be leveraged at the
next polling.
Hi Olivier,

I think we have some gap here. AFAIK, it's not 'might not' but
'definitely not': the sk->napi_id won't be valid until the poll
callback.

Some driver's code FYR:
(drivers/net/ethernet/intel/e1000/e1000_main.c)

e1000_receive_skb-->napi_gro_receive-->napi_skb_finish-->gro_normal_one
and in gro_normal_one(), it does:

           if (napi->rx_count >= gro_normal_batch)
                   gro_normal_list(napi);


The gro_normal_list() delivers the packets up to the specific network
protocol, like TCP.

And then sk->napi_id is set; meanwhile, the poll callback is
triggered.

So that's why I call the napi polling technology a 'speculation': it
is entirely for future data. Correct me if I'm wrong, especially about
the poll callback triggering part.

When I said 'might not', I meant that from the io_uring point of view,
it has no idea what the previous socket usage was. If the socket has
been used outside io_uring, the napi_id could be available on the
first call.

If it is really a virgin socket, neither my chosen call site nor your
proposed sites will make napi busy poll possible for the first poll.

I feel there is not much to gain from arguing this point since I
pretty much admitted that your solution is most likely the only call
site that makes MULTIPOLL requests work correctly with napi busy poll,
as those requests may visit __io_arm_poll_handler only once (correct
me if my statement is wrong).

The only remaining issue is that I wasn't sure how locking would work
with your proposed call sites.

I suppose that adding a dedicated spinlock for protecting napi_list
instead of relying on uring_lock could be a solution. Would that work?
spinlock should be fine.