Re: [PATCH -v2 15/17] sched: Fix migrate_disable() vs rt/dl balancing

From: Steven Rostedt
Date: Tue Oct 06 2020 - 09:44:51 EST


On Tue, 6 Oct 2020 09:59:39 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Mon, Oct 05, 2020 at 04:57:32PM +0200, Peter Zijlstra wrote:
> > +static inline struct task_struct *get_push_task(struct rq *rq)
> > +{
> > + struct task_struct *p = rq->curr;
> > +
> > + lockdep_assert_held(&rq->lock);
> > +
> > + if (rq->push_busy)
> > + return NULL;
> > +
> > + if (p->nr_cpus_allowed == 1)
> > + return NULL;
>
> This; that means that when we're stuck below a per-cpu thread, we're
> toast. There's just nothing much you can do... :/

Well, hopefully, per-CPU threads don't run for long periods of time. I'm
working with folks who run non-stop RT threads that every so often enter
the kernel and kick off per-CPU kernel threads. Those kernel threads then
get starved once the RT tasks go back to user space, and the rest of the
system hangs.

As I've always said, when dealing with real-time systems, you need to be
careful about how you organize your tasks. Ideally, any RT task that is
pinned to a CPU shouldn't be sharing that CPU with anything else that may
be critical.

-- Steve


>
> > +
> > + rq->push_busy = true;
> > + return get_task_struct(p);
> > +}
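
To make the early exits in the quoted get_push_task() easier to follow,
here is a rough user-space model of its control flow. It is only a sketch:
struct task_model, struct rq_model, the usage counter and
get_push_task_model() are hypothetical stand-ins, not kernel APIs, and the
rq->lock handling (the lockdep_assert_held() check) is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's task_struct. */
struct task_model {
	int nr_cpus_allowed;	/* number of CPUs the task may run on */
	int usage;		/* stand-in for the task reference count */
};

/* Hypothetical, simplified stand-in for the kernel's struct rq. */
struct rq_model {
	struct task_model *curr;	/* task currently running on this CPU */
	bool push_busy;			/* a push of curr is already in flight */
};

/*
 * Mirrors the control flow of the quoted get_push_task().
 * Locking is omitted; the real function requires rq->lock to be held.
 */
static struct task_model *get_push_task_model(struct rq_model *rq)
{
	struct task_model *p = rq->curr;

	/* A push is already pending for this runqueue. */
	if (rq->push_busy)
		return NULL;

	/* A task pinned to a single CPU cannot be pushed anywhere else. */
	if (p->nr_cpus_allowed == 1)
		return NULL;

	rq->push_busy = true;
	p->usage++;	/* models get_task_struct() taking a reference */
	return p;
}
```

With a pinned task as rq->curr, the model returns NULL and leaves
push_busy untouched, which is the "stuck below a per-cpu thread" case
discussed above: there is nothing to push away.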