Re: [patch] sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression

From: Peter Zijlstra
Date: Wed Aug 10 2016 - 16:08:42 EST


On Wed, Aug 10, 2016 at 01:14:29PM +0200, Mike Galbraith wrote:
> Hi Peter,
>
> While running ltp, the fates decided it was time for me to encounter
> the roughly 1 out of every 10 call failure below. As much as I run
> ltp, I'm a bit surprised that I (or anyone else) haven't met this
> before, but then the fates are known to be a tad fickle.
>
> getrusage04 0 TINFO : Expected timers granularity is 4000 us
> getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
> getrusage04 0 TINFO : utime: 0us; stime: 179us
> getrusage04 0 TINFO : utime: 3751us; stime: 0us
> getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:
>
> When applying the full rtime to either stime or utime, do not overwrite
> the previously tallied value.
>
> Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
> Signed-off-by: Mike Galbraith <umgwanakikbuti@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # 4.3+
> ---
> kernel/sched/cputime.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -608,11 +608,13 @@ static void cputime_adjust(struct task_c
>
>  	if (utime == 0) {
>  		stime = rtime;
> +		utime = prev->utime;
>  		goto update;
>  	}
>
>  	if (stime == 0) {
>  		utime = rtime;
> +		stime = prev->stime;
>  		goto update;
>  	}

This cannot be right; it violates the utime + stime == rtime guarantee.
Let me try to figure out what actually happens.
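
To make the objection concrete, here is a minimal sketch of the two
patched early-exit branches. It is not the kernel's cputime_adjust():
struct prev_cputime_sketch and patched_sum() are hypothetical names, the
scaling path and everything after the update label are left out, and
plain 64-bit integers stand in for the kernel's cputime type.

```c
#include <stdint.h>

/* Hypothetical stand-in for the previously reported values. */
struct prev_cputime_sketch {
	uint64_t utime;
	uint64_t stime;
};

/*
 * Models only the two early-exit branches as modified by the patch and
 * returns utime + stime as they would be handed to the update step.
 * When neither tick count is zero, the real code scales stime against
 * rtime; that path is not modelled here, so the inputs pass through.
 */
uint64_t patched_sum(uint64_t utime, uint64_t stime, uint64_t rtime,
		     const struct prev_cputime_sketch *prev)
{
	if (utime == 0) {
		stime = rtime;
		utime = prev->utime;	/* line added by the patch */
	} else if (stime == 0) {
		utime = rtime;
		stime = prev->stime;	/* line added by the patch */
	}

	return utime + stime;
}
```

Taking prev = { .utime = 0, .stime = 179 } from the first sample in the
log, and an assumed rtime of 3930 (the log does not report rtime), a
call with utime = 3751 and stime = 0 takes the second branch and returns
3930 + 179 = 4109. The sum exceeds rtime by exactly prev->stime, so the
guarantee fails whenever the carried-over value is nonzero.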