Re: [tip:sched/urgent] sched: Fix cross-cpu clock sync on remote wakeups

From: Borislav Petkov
Date: Wed Jun 01 2011 - 03:05:59 EST


On Tue, May 31, 2011 at 03:11:56PM +0200, Peter Zijlstra wrote:
> Well, I don't have a modern AMD system to verify on, but the only
> explanation is sched_clock weirdness (different code from the GTOD tsc
> stuff). I could not reproduce on an Intel Westmere machine, but could on
> a Core2.
>
> The sched_clock_cpu stuff basically takes a GTOD timestamp every tick
> and uses sched_clock() (tsc + cyc2ns) to provide delta increments, when
> TSCs are synced every cpu should return the same value and the patch is
> a nop.
>
> If they aren't synced the per-cpu sched_clock_cpu() values can drift up
> to about 2 jiffies (when the ticks drift about 1 and the slower of the
> two has a stuck tsc while the faster of the two does progress at the
> normal rate). In that case doing a clock update cross-cpu will ensure
> time monotonicity between those two cpus.
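
To make sure I read the above correctly, here is a rough user-space model
of the scheme as described, not the kernel code: each CPU stores a GTOD
timestamp at its tick and adds the raw clock delta since then. The names
struct cpu_clock, model_tick() and model_clock() are made up for this
sketch, and the clamping that the real sched_clock_cpu() does is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU state, loosely following the description above. */
struct cpu_clock {
	uint64_t tick_gtod;	/* GTOD timestamp taken at the last tick, in ns */
	uint64_t tick_raw;	/* raw (tsc + cyc2ns style) clock at that tick, in ns */
};

/* Record a tick: remember the GTOD time and the raw clock reading. */
static void model_tick(struct cpu_clock *c, uint64_t gtod_ns, uint64_t raw_ns)
{
	c->tick_gtod = gtod_ns;
	c->tick_raw = raw_ns;
}

/* Model clock value: GTOD at the last tick plus the raw delta since then. */
static uint64_t model_clock(const struct cpu_clock *c, uint64_t raw_now)
{
	/* This sketch assumes the raw clock never goes backwards. */
	assert(raw_now >= c->tick_raw);
	return c->tick_gtod + (raw_now - c->tick_raw);
}
```

If one CPU's raw clock is stuck at its tick value, model_clock() for that
CPU stays at its tick GTOD, while a CPU whose raw clock advances keeps
moving forward. The two readings then differ by the tick offset plus the
time elapsed since the second CPU's tick.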

Hmm, could it be that sched_clock_tick() is seeing different TSC values
due to propagation delays of IPIs and TSCs? Or could it be that some
TSCs don't start at the same moment after powerup but still run
synchronized?

How can we trace this? Do you use trace_printk() in the scheduler? I'm
asking because I remember reading somewhere that tracing the scheduler
is not as trivial as tracing, say, a driver :). I could do that on a
couple of machines I have here and see what happens.

Thanks.

--
Regards/Gruss,
Boris.