Re: [PATCH] tick/nohz: Fix wrong user and system time accounting against vtime sampling

From: Frederic Weisbecker
Date: Mon Apr 10 2017 - 14:01:56 EST


On Mon, Apr 10, 2017 at 05:45:56PM +0200, Thomas Gleixner wrote:
> On Wed, 5 Apr 2017, Wanpeng Li wrote:
> > +	/*
> > +	 * Offset the tick to avert jiffies_lock contention, and all ticks
> > +	 * alignment in order that the vtime sampling does not end up "in
> > +	 * phase" with the jiffies incrementing.
> > +	 */
> > +	if (sched_skew_tick || tick_nohz_full_enabled()) {
> >  		u64 offset = ktime_to_ns(tick_period) >> 1;
> >  		do_div(offset, num_possible_cpus());
> >  		offset *= smp_processor_id();
>
> That's not a fix, that's just papering over the problem.
>
> offset = 1ms / 2 = 500us = 500000ns;
> offset /= 144 = 3472ns
>
> So CPU0 and CPU1 ticks are ~3 microseconds apart. That merely reduces the
> probability of the issue, but does not prevent it.

I worried about it but didn't realize it could be that tight.
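
For reference, the arithmetic above can be reproduced with a small userspace
sketch. This is not kernel code: skew_offset_ns() is a hypothetical helper
that mirrors the three statements in the quoted hunk, with a plain 64-bit
division standing in for do_div(). The 1ms tick period (HZ=1000) and the 144
possible CPUs are the values implied by the numbers quoted above, not
something the patch itself fixes.

```c
#include <stdint.h>

/*
 * Hypothetical userspace helper, not part of the kernel: mirrors the
 * offset computation in the quoted hunk.
 *
 *   tick_period_ns: tick period in nanoseconds (1000000 assumes HZ=1000)
 *   ncpus:          stands in for num_possible_cpus(), must be non-zero
 *   cpu:            stands in for smp_processor_id()
 */
static uint64_t skew_offset_ns(uint64_t tick_period_ns,
			       unsigned int ncpus, unsigned int cpu)
{
	uint64_t offset = tick_period_ns >> 1;	/* half a tick period */

	offset /= ncpus;	/* the kernel uses do_div() here */
	offset *= cpu;		/* each CPU is shifted by one more slot */

	return offset;
}
```

With tick_period_ns = 1000000 and ncpus = 144, CPU1 gets an offset of 3472ns
and CPU143 gets 496496ns, so neighbouring CPUs are only about 3.5us apart and
the whole spread stays within half a tick.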

So the alternative is the solution that uses sched_clock() as the source for
cputime. Wanpeng Li, could you please resubmit your patch that does that?

Thanks.