Re: sched: rq->nr_iowait transiently going negative after the recent p->on_cpu optimization

From: Peter Zijlstra
Date: Thu Sep 24 2020 - 10:50:49 EST


On Thu, Sep 24, 2020 at 10:27:51AM -0400, Tejun Heo wrote:
> Hello,
>
> On Thu, Sep 24, 2020 at 01:50:42PM +0200, Peter Zijlstra wrote:
> > Hurmph.. I suppose you're right :/ And this is an actual problem?
>
> Yeah, this got exposed to userspace as a full 64bit number which overflowed
> the u32 conversion in the Rust procfs library, which aborted a program I was
> working on multiple times over several months.
>
> On a more theoretical side, it might also surprise nr_iowait_cpu() users,
> however real a problem that may be.
>
> > I think the below should cure that, but blergh, not nice. If you could
> > confirm, I'll try and think of something nicer.
>
> Rik suggested that it'd be sufficient to return 0 on underflow, especially
> given that 0 is actually the right number to describe the state. So, maybe
> that would be nicer code-wise?

I worry about things where one CPU has a positive value and one or more
(other) CPUs have a temporary negative value.
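
To make that concrete, here is a rough userspace sketch, not kernel code.
The array stands in for the per-CPU rq->nr_iowait values and all three
helper names are made up for illustration. It only shows that "clamp to 0"
gives a different total depending on whether you clamp each CPU or clamp
the sum; it does not model where the pending increment eventually lands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: nr[i] plays the role of one CPU's nr_iowait
 * counter, which may be transiently negative.
 */

/* Unclamped sum reinterpreted as unsigned: a total of -1 becomes 2^64 - 1. */
static uint64_t iowait_raw_unsigned(const int64_t *nr, size_t ncpus)
{
	int64_t sum = 0;

	assert(nr != NULL || ncpus == 0);
	for (size_t i = 0; i < ncpus; i++)
		sum += nr[i];
	return (uint64_t)sum;
}

/* Add up the raw values, then clamp the total at 0. */
static uint64_t iowait_sum_then_clamp(const int64_t *nr, size_t ncpus)
{
	int64_t sum = 0;

	assert(nr != NULL || ncpus == 0);
	for (size_t i = 0; i < ncpus; i++)
		sum += nr[i];
	return sum < 0 ? 0 : (uint64_t)sum;
}

/* Clamp every CPU at 0 first, then add up. */
static uint64_t iowait_clamp_then_sum(const int64_t *nr, size_t ncpus)
{
	uint64_t sum = 0;

	assert(nr != NULL || ncpus == 0);
	for (size_t i = 0; i < ncpus; i++)
		sum += nr[i] < 0 ? 0 : (uint64_t)nr[i];
	return sum;
}
```

With per-CPU values { 2, -1 } the sum-then-clamp variant reports 1 while
the clamp-then-sum variant reports 2. With a lone { -1 } the unclamped
variant reports 2^64 - 1, which is the kind of value that cannot fit in a
u32 on the userspace side.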