Re: [PATCH] task_work: allow TWA_SIGNAL without a rescheduling IPI

From: Jens Axboe
Date: Mon Apr 25 2022 - 21:52:37 EST


On 4/22/22 8:34 AM, Jens Axboe wrote:
> Some use cases don't always need an IPI when sending a TWA_SIGNAL
> notification. Add TWA_SIGNAL_NO_IPI, which is just like TWA_SIGNAL,
> except it doesn't send an IPI to the target task. It merely sets
> TIF_NOTIFY_SIGNAL and wakes up the task.

Adding Peter and Thomas.

>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> ---
>
> This is a prep patch for an io_uring change where we don't need the IPI,
> and skipping it can reduce rescheduling/IPI rate by tens to hundreds of
> thousands per second.
>
> diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
> index 3c8b34876744..66b689f6cfcb 100644
> --- a/include/linux/sched/signal.h
> +++ b/include/linux/sched/signal.h
> @@ -355,14 +355,23 @@ static inline void clear_notify_signal(void)
> smp_mb__after_atomic();
> }
>
> +/*
> + * Returns 'true' if kick_process() is needed to force a transition from
> + * user -> kernel to guarantee expedient run of TWA_SIGNAL based task_work.
> + */
> +static inline bool __set_notify_signal(struct task_struct *task)
> +{
> + return !test_and_set_tsk_thread_flag(task, TIF_NOTIFY_SIGNAL) &&
> + !wake_up_state(task, TASK_INTERRUPTIBLE);
> +}
> +
> /*
> * Called to break out of interruptible wait loops, and enter the
> * exit_to_user_mode_loop().
> */
> static inline void set_notify_signal(struct task_struct *task)
> {
> - if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_SIGNAL) &&
> - !wake_up_state(task, TASK_INTERRUPTIBLE))
> + if (__set_notify_signal(task))
> kick_process(task);
> }
>
> diff --git a/include/linux/task_work.h b/include/linux/task_work.h
> index 897494b597ba..795ef5a68429 100644
> --- a/include/linux/task_work.h
> +++ b/include/linux/task_work.h
> @@ -17,6 +17,7 @@ enum task_work_notify_mode {
> TWA_NONE,
> TWA_RESUME,
> TWA_SIGNAL,
> + TWA_SIGNAL_NO_IPI,
> };
>
> static inline bool task_work_pending(struct task_struct *task)
> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index c59e1a49bc40..fa8fdd04aa17 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -13,11 +13,13 @@ static struct callback_head work_exited; /* all we need is ->next == NULL */
> *
> * Queue @work for task_work_run() below and notify the @task if @notify
> * is @TWA_RESUME or @TWA_SIGNAL. @TWA_SIGNAL works like signals, in that the
> - * it will interrupt the targeted task and run the task_work. @TWA_RESUME
> - * work is run only when the task exits the kernel and returns to user mode,
> - * or before entering guest mode. Fails if the @task is exiting/exited and thus
> - * it can't process this @work. Otherwise @work->func() will be called when the
> - * @task goes through one of the aforementioned transitions, or exits.
> + * it will interrupt the targeted task and run the task_work. @TWA_SIGNAL_NO_IPI
> + * works like @TWA_SIGNAL, except it doesn't send a reschedule IPI to force the
> + * targeted task to reschedule and run task_work. @TWA_RESUME work is run only
> + * when the task exits the kernel and returns to user mode, or before entering
> + * guest mode. Fails if the @task is exiting/exited and thus it can't process
> + * this @work. Otherwise @work->func() will be called when the @task goes
> + * through one of the aforementioned transitions, or exits.
> *
> * If the targeted task is exiting, then an error is returned and the work item
> * is not queued. It's up to the caller to arrange for an alternative mechanism
> @@ -53,6 +55,9 @@ int task_work_add(struct task_struct *task, struct callback_head *work,
> case TWA_SIGNAL:
> set_notify_signal(task);
> break;
> + case TWA_SIGNAL_NO_IPI:
> + __set_notify_signal(task);
> + break;
> default:
> WARN_ON_ONCE(1);
> break;
>


--
Jens Axboe