Re: [PATCH v0.9.1 3/6] sched/umcg: implement UMCG syscalls

From: Peter Zijlstra
Date: Thu Jan 20 2022 - 06:08:05 EST


On Wed, Jan 19, 2022 at 09:26:41AM -0800, Peter Oskolkov wrote:
> On Mon, Dec 6, 2021 at 3:47 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, Nov 29, 2021 at 09:34:49AM -0800, Peter Oskolkov wrote:
> > > On Mon, Nov 29, 2021 at 8:41 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > > > Also, timeout on sys_umcg_wait() gets you the exact same situation (or
> > > > worse, multiple running workers).
> > >
> > > It should not. Timed out workers should be added to the runnable list
> > > and not become running unless a server chooses so. So sys_umcg_wait()
> > > with a timeout should behave similarly to a normal sleep, in that the
> > > server is woken upon the worker blocking, and upon the worker wakeup
> > > the worker is added to the woken workers list and waits for a server
> > > to run it. The only difference is that in a sleep the worker becomes
> > > BLOCKED, while in sys_umcg_wait() the worker is RUNNABLE the whole
> > > time.
> > >
> > > Why then have sys_umcg_wait() with a timeout at all, instead of
> > > calling nanosleep()? Because the worker in sys_umcg_wait() can be
> > > context-switched into by another worker, or made running by a server;
> > > if the worker is in nanosleep(), it just sleeps.
> >
> > I've been trying to figure out the semantics of that timeout thing, and
> > I can't seem to make sense of it.
> >
> > Consider two workers:
> >
> > S0 running A S1 running B
> >
> > therefore:
> >
> > S0::state == RUNNABLE S1::state == RUNNABLE
> > A::server_tid == S0.tid B::server_tid = S1.tid
> > A::state == RUNNING B::state == RUNNING
> >
> > Doing:
> >
> > self->state = RUNNABLE; self->state = RUNNABLE;
> > sys_umcg_wait(0); sys_umcg_wait(10);
> > umcg_enqueue_runnable() umcg_enqueue_runnable()
>
> sys_umcg_wait() should not enqueue the worker as runnable; workers are
> enqueued to indicate wakeup events.

Oooh... I see.

> So worker timeouts in sys_umcg_wait are treated as wakeup events, with
> the difference that when the worker is eventually scheduled by a
> server, sys_umcg_wait returns with ETIMEDOUT.

Right.. OK, let me go fold and polish what I have now before I go change
things again though.