Re: [bug-report] possible s64 overflow in max_vruntime()

From: Roman Kagan
Date: Wed Jan 25 2023 - 14:45:58 EST


Upping the thread as we're hitting this problem too.

On Thu, Dec 22, 2022 at 01:45:48PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 21, 2022 at 11:19:31PM +0800, Zhang Qiao wrote:
> > I found problem about s64 overflow in max_vruntime().
> >
[...]
> > static inline u64 min_vruntime(u64 min_vruntime, u64 vruntime)
> > {
> > /*
> > * vruntime=0x124b17fd59db8d02
> > * min_vruntime=0x918fdb05287da7c3
> > * vruntime - min_vruntime = 9276074894177461567 > s64_max, will s64 overflow
> > */
> > s64 delta = (s64)(vruntime - min_vruntime);
> > if (delta < 0)
> > min_vruntime = vruntime;
> >
> > return min_vruntime;
> > }
> >
> > ----------
> >
> > max_vruntime() will return the kworker's old vruntime, it is incorrect and the correct result
> > shoud be cfs_rq->minvruntime. This incorrect result is greater than cfs_rq->min_vruntime and
> > will cause kworker thread starved.
> >
> > Does anyone have a good suggestion for slove this problem? or bugfix patch.
>
> I don't understand what you tihnk the problem is. Signed overflow is
> perfectly fine and works as designed here.

Disagreed.

The calculation is indeed safe against the overflow of the vruntimes
themselves. However, when the two vruntimes are more than 2^63 apart,
their comparison gets inverted due to that s64 overflow.

And this is what happens here: one scheduling entity has accumulated a
vruntime more than 2^63 ahead of another. Now the comparison is
inverted due to s64 overflow, and the latter can't get to the cpu,
because it appears to have vruntime (much) bigger than that of the
former.

This situation is reproducible e.g. when one scheduling entity is a
multi-cpu hog, and the other is woken up from a long sleep. Normally
when a task is placed on a cfs_rq, its vruntime is pulled to
min_vruntime, to avoid boosting the woken up task. However in this case
the task is so much behind in vruntime that it appears ahead instead,
its vruntime is not adjusted in place_entity(), and then it looses the
cpu to the current scheduling entity.

Thanks,
Roman.



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879