Re: [PATCH 0/2] vhost: improve livepatch switching for heavily loaded vhost worker kthreads

From: Josh Poimboeuf
Date: Mon Jan 30 2023 - 13:19:06 EST


On Mon, Jan 30, 2023 at 01:40:18PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 27, 2023 at 02:11:31PM -0800, Josh Poimboeuf wrote:
>
>
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 4df2b3e76b30..fbcd3acca25c 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -36,6 +36,7 @@
> > #include <linux/seqlock.h>
> > #include <linux/kcsan.h>
> > #include <linux/rv.h>
> > +#include <linux/livepatch_sched.h>
> > #include <asm/kmap_size.h>
> >
> > /* task_struct member predeclarations (sorted alphabetically): */
> > @@ -2074,6 +2075,9 @@ DECLARE_STATIC_CALL(cond_resched, __cond_resched);
> >
> > static __always_inline int _cond_resched(void)
> > {
> > + //FIXME this is a bit redundant with preemption disabled
> > + klp_sched_try_switch();
> > +
> > return static_call_mod(cond_resched)();
> > }
>
> Right, I was thinking you'd do something like:
>
> static_call_update(cond_resched, klp_cond_resched);
>
> With:
>
> static int klp_cond_resched(void)
> {
> klp_try_switch_task(current);
> return __cond_resched();
> }
>
> That would force cond_resched() into doing the transition thing,
> irrespective of the preemption mode at hand. And then, when KLP be done,
> re-run sched_dynamic_update() to reset it to whatever it ought to be.

Ok, makes sense.

>
> > @@ -401,8 +421,10 @@ void klp_try_complete_transition(void)
> > */
> > read_lock(&tasklist_lock);
> > for_each_process_thread(g, task)
> > - if (!klp_try_switch_task(task))
> > + if (!klp_try_switch_task(task)) {
> > + set_tsk_need_resched(task);
> > complete = false;
> > + }
>
> Yeah, no, that's broken -- preemption state live in more than just the
> TIF bit.

Oops.

> > read_unlock(&tasklist_lock);
> >
> > /*
> > @@ -413,6 +435,7 @@ void klp_try_complete_transition(void)
> > task = idle_task(cpu);
> > if (cpu_online(cpu)) {
> > if (!klp_try_switch_task(task)) {
> > + set_tsk_need_resched(task);
> > complete = false;
> > /* Make idle task go through the main loop. */
> > wake_up_if_idle(cpu);
>
> Idem.
>
> Also, I don't see the point of this and the __schedule() hook here:

The (poorly executed) idea was to catch kthreads which do

if (need_resched())
schedule();

but I guess we can just replace those usages with cond_resched()?

> > @@ -8500,8 +8502,10 @@ EXPORT_STATIC_CALL_TRAMP(might_resched);
> > static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
> > int __sched dynamic_cond_resched(void)
> > {
> > - if (!static_branch_unlikely(&sk_dynamic_cond_resched))
> > + if (!static_branch_unlikely(&sk_dynamic_cond_resched)) {
> > + klp_sched_try_switch();
> > return 0;
> > + }
> > return __cond_resched();
> > }
> > EXPORT_SYMBOL(dynamic_cond_resched);
>
> I would make the klp_sched_try_switch() not depend on
> sk_dynamic_cond_resched, because __cond_resched() is not a guaranteed
> pass through __schedule().
>
> But you'll probably want to check with Mark here, this all might
> generate crap code on arm64.
>
> Both ways this seems to make KLP 'depend' (or at least work lots better)
> when PREEMPT_DYNAMIC=y. Do we want a PREEMPT_DYNAMIC=n fallback for
> _cond_resched() too?

That was the intent but I obviously failed. Let me go rework it a bit.

--
Josh