Re: WARNING in enqueue_task_dl

From: Juri Lelli
Date: Mon Nov 19 2018 - 10:32:10 EST


On 19/11/18 14:43, Juri Lelli wrote:
> On 19/11/18 13:52, Peter Zijlstra wrote:
> > On Mon, Nov 19, 2018 at 01:07:18PM +0100, luca abeni wrote:
> >
> > > > On Sun, 18 Nov 2018, syzbot wrote:
> >
> > > > > WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628
> > > > > enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
> > >
> > > Here, it looks like a task is invoking sched_setattr() to become
> > > SCHED_DEADLINE when dl_boosted is set...
> > >
> > > Is this possible / correct?
> >
> > Possible, clearly. Correct, only in so far as that it is not a malformed
> > program, but it is very poor design to actually trigger this (of course
> > the fuzzer doesn't care about that).
> >
> > > If this (sched_setattr() with dl_boosted set) should not be possible,
> > > then we have a bug that we need to investigate...
> > >
> > > Otherwise, I suspect we can just remove the WARN_ON at line 628 of
> > > deadline.c
> >
> > I wonder why we put that WARN in there to begin with... git-blame gives
> > us:
> >
> > 98b0a8578050 ("sched/deadline: Remove useless parameter from setup_new_dl_entity()")
> >
> > So the problem seems to be that if we're boosted, we should maybe not be
> > using our own (newly set) parameters, but those of the donor task.
> >
> > Specifically, our 'suboptimal' deadline inheritance scheme 'requires' us
> > to use the inherited deadline, not our own. So in that respect I think
> > the WARN is valid, although I'm not sure what, apart from actually
> > finishing that PE patch-set we can do about it just now.
>
> Mmm, but, as it was written in the comment that was removed by 295d6d5
> ("sched/deadline: Fix switching to -deadline"), I was still expecting
> that for a boosted task setup_new_dl_entity() shouldn't be called.
> Wonder if this is another manifestation of the problems we have with
> clocks. Need to think more about it.

So, while this looks like nothing more than a stop-gap solution until we
get PE in place, would the following make any sense? It seems I can't
reproduce the warning anymore with it (w/o it usually takes a few secs
to reproduce).

--->8---