Re: [bug-report] possible s64 overflow in max_vruntime()

From: Vincent Guittot
Date: Fri Jan 27 2023 - 11:19:23 EST


On Fri, 27 Jan 2023 at 12:44, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jan 26, 2023 at 07:31:02PM +0100, Roman Kagan wrote:
>
> > > All that only matters for small sleeps anyway.
> > >
> > > Something like:
> > >
> > > sleep_time = U64_MAX;
> > > if (se->avg.last_update_time)
> > >         sleep_time = cfs_rq_clock_pelt(cfs_rq) - se->avg.last_update_time;
> >
> > Interesting, why not rq_clock_task(rq_of(cfs_rq)) - se->exec_start, as
> > others were suggesting? It appears to better match the notion of sleep
> > wall-time, no?
>
> Should also work I suppose. cfs_rq_clock takes throttling into account,
> but that should hopefully also not be *that* long, so either should
> work.

Yes, rq_clock_task(rq_of(cfs_rq)) should be fine too.

Another thing to take into account is the sleeper credit that the
waking task deserves, so the detection should be done once the credit
has been subtracted from vruntime.
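
To illustrate the ordering, here is a minimal user-space sketch, not
kernel code. The names place_waking(), credit_ns and long_sleep_ns are
made up for this example. It applies the sleeper credit first and only
then decides whether the stale vruntime can still be compared with the
usual signed-difference trick:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of placing a waking entity.
 * All parameters are plain numbers; nothing here is a kernel API.
 */
static uint64_t place_waking(uint64_t min_vruntime, uint64_t se_vruntime,
                             uint64_t sleep_ns, uint64_t credit_ns,
                             uint64_t long_sleep_ns)
{
        /* Sleeper credit is subtracted before any comparison. */
        uint64_t target = min_vruntime - credit_ns;

        /* After a very long sleep the old vruntime is not comparable. */
        if (sleep_ns >= long_sleep_ns)
                return target;

        /* Wrap-safe max, valid only while the distance is below 2^63. */
        return (int64_t)(se_vruntime - target) > 0 ? se_vruntime : target;
}

/* Small self-check showing the failure mode and the guarded result. */
static void place_waking_selfcheck(void)
{
        uint64_t min_v = (1ULL << 63) + 1000;   /* rq advanced by > 2^63 */
        uint64_t limit = 1ULL << 54;            /* assumed threshold, in ns */

        /* Short sleep, normal case: the larger vruntime is kept. */
        assert(place_waking(1000, 2000, 10, 100, limit) == 2000);
        /* Guard not triggered: the signed difference wraps, stale 0 wins. */
        assert(place_waking(min_v, 0, 10, 100, limit) == 0);
        /* Guard triggered: the entity is placed at min_vruntime - credit. */
        assert(place_waking(min_v, 0, limit, 100, limit) == min_v - 100);
}
```

The second assertion is the overflow from the subject line: once the
distance exceeds 2^63 the signed difference changes sign and the stale
vruntime is picked.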

Last point: when a nice -20 task runs on a rq, it takes a bit more
than 2 seconds for the vruntime to increase by more than 24ms (the
maximum credit that a waking task can get), so the threshold must be
significantly higher than 2 sec. On the opposite side, the lowest
possible weight of a cfs rq is 2, which means that the problem appears
for a sleep longer than or equal to 2^54 ns = 2^63*2/1024. We should
use this value instead of an arbitrary 200 days.
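
For reference, the two bounds can be checked with a few lines of plain
C. This is only a sketch of the arithmetic: the helper names are made
up, 88761 is the nice -20 weight and 1024 the nice 0 weight, and
vruntime is assumed to advance by delta * 1024 / weight.

```c
#include <assert.h>
#include <stdint.h>

#define NICE_0_LOAD     1024ULL         /* weight of a nice 0 task */
#define NICE_M20_LOAD   88761ULL        /* weight of a nice -20 task */
#define MIN_CFS_WEIGHT  2ULL            /* lowest cfs rq weight, per above */
#define NSEC_PER_MSEC   1000000ULL
#define NSEC_PER_DAY    (86400ULL * 1000000000ULL)

/* Run time (ns) needed for vruntime to advance by delta_vruntime_ns. */
static uint64_t runtime_for_vruntime(uint64_t delta_vruntime_ns,
                                     uint64_t weight)
{
        assert(weight > 0);
        return delta_vruntime_ns * weight / NICE_0_LOAD;
}

/* Run time (ns) after which vruntime has advanced by 2^63. */
static uint64_t overflow_sleep_ns(uint64_t weight)
{
        assert(weight > 0);
        /* Divide first so the intermediate value stays within 64 bits. */
        return (1ULL << 63) / NICE_0_LOAD * weight;
}
```

runtime_for_vruntime(24 * NSEC_PER_MSEC, NICE_M20_LOAD) gives
2080335937 ns, i.e. about 2.08 sec, and
overflow_sleep_ns(MIN_CFS_WEIGHT) gives exactly 2^54 ns, which is
roughly 208.5 days.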