Re: [tip:sched/urgent] sched: Fix cross-cpu clock sync on remote wakeups

From: Yong Zhang
Date: Thu Jun 02 2011 - 03:52:16 EST


On Wed, Jun 1, 2011 at 11:50 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Wed, Jun 01, 2011 at 06:36:52AM -0400, Peter Zijlstra wrote:
>> trace_printk() can go pretty much anywhere, you want to start with
>> something like the below and go from there, either up into
>> arch/x86/kernel/tsc.c:native_sched_clock() or down into the scheduler
>> and instrument rq->clock (although you most likely already get most of
>> that through the sched_clock_cpu() trace_printk).
>
> See for a trace excerpt below.
>
>> Also, it might be good to check on the sched_clock_stable logic in
>> general and on your platform in particular, if that's set we forgo all
>> the fancy bits and return sched_clock() directly.
>
> Nah, we don't set sched_clock_stable on AMD.
>
> --
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343360: sched_clock_cpu: clock: 511500051225
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343365: sched_clock_tick: gtod: 511500058415, now: 511343364325
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343367: sched_clock_local: now: 511343365506, gtod: 511500058415, delta: 1181, old_clock: 511500051225, clock: 511500059596
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343369: sched_clock_local: now: 511343367760, gtod: 511500058415, delta: 3435, old_clock: 511500059596, clock: 511500061850
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343370: sched_clock_cpu: clock: 511500061850
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343380: sched_clock_local: now: 511343378644, gtod: 511500058415, delta: 14319, old_clock: 511500061850, clock: 511500072734
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343381: sched_clock_cpu: clock: 511500072734
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.343519: sched_clock_tick: gtod: 511500210072, now: 511343515910
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.343523: sched_clock_local: now: 511343521252, gtod: 511500210072, delta: 5342, old_clock: 511484088885, clock: 511500215414
> Â Â Â Â Â <...>-102 Â [023] Â 511.343527: sched_clock_local: now: 511343524554, gtod: 511500058415, delta: 160229, old_clock: 511500072734, clock: 511500218644

In sched_clock_local(), clock is calculated around ->tick_gtod even if
that ->tick_gtod
is stale for long time because we stays in idle state. You know ->tick_gtod
is only updated in sched_clock_tick();

IOW, when a cpu goes out of idle, sched_clock_tick() is called from
tick_nohz_stop_idle() which is later than interrupt.

So if we have any site which call sched_clock() in interrupt on an
idle cpu, it could
get incorrect clock.

I'm not sure how to teach sched_clock() about this special case, Peter?

Thanks,
Yong

>
> let's take this one for example
>
> now: 511343524554 Â Â Â is the TSC on core 23
> gtod: 511500058415 Â Â Âtick_gtod we just updated with the delta of now and scd->tick_raw. Btw, delta is > 0 so we're incrementing fine.
> old_clock: 511500072734 what we returned on the previous sched_clock_cpu() above at tstamp 511.343381
> clock: 511500218644 Â Â is what we actually return
>
>
> Â Â Â Â Â <...>-102 Â [023] Â 511.343528: sched_clock_cpu: clock: 511500218644
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.343542: sched_clock_local: now: 511343540484, gtod: 511500210072, delta: 24574, old_clock: 511500215414, clock: 511500234646
>
>
> now, on cpu0 we've advanced in the meantime with now=511343540484 (btw,
> the now TSC values are almost the same as the ftrace timestamps, the
> last being taken only a couple of ç later) and so on ...
>
> So, I don't see anything unusual but I could very well be missing
> something. Btw, this trace is without your change to ttwu_queue().
>
> I'll do the same exercise on another F10h system once I get it free to
> see whether we see the same monotonicity there too.
>
> Let me know if you want something else traced or the trace tweaked.
>
>
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.343543: sched_clock_cpu: clock: 511500234646
> Â Â Â Â Â <...>-1549 Â[023] Â 511.343548: sched_clock_local: now: 511343546428, gtod: 511500058415, delta: 182103, old_clock: 511500218644, clock: 511500240518
> Â Â Â Â Â <...>-1549 Â[023] Â 511.343549: sched_clock_cpu: clock: 511500240518
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343560: sched_clock_local: now: 511343558648, gtod: 511500058415, delta: 194323, old_clock: 511500240518, clock: 511500252738
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.343561: sched_clock_cpu: clock: 511500252738
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.347381: sched_clock_tick: gtod: 511504076961, now: 511347377038
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.347386: sched_clock_tick: gtod: 511504078048, now: 511347378123
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.347389: sched_clock_local: now: 511347383317, gtod: 511504076961, delta: 6279, old_clock: 511500252738, clock: 511504083240
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.347392: sched_clock_local: now: 511347387920, gtod: 511504078048, delta: 9797, old_clock: 511500234646, clock: 511504087845
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.347397: sched_clock_local: now: 511347395232, gtod: 511504076961, delta: 18194, old_clock: 511504083240, clock: 511504095155
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.347399: sched_clock_local: now: 511347396131, gtod: 511504078048, delta: 18008, old_clock: 511504087845, clock: 511504096056
> Â Â Â Â Â<idle>-0 Â Â [023] Â 511.347401: sched_clock_cpu: clock: 511504095155
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.347402: sched_clock_cpu: clock: 511504096056
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473120: sched_clock_tick: gtod: 511630004325, now: 511473117198
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473124: sched_clock_local: now: 511473122126, gtod: 511630004325, delta: 4928, old_clock: 511504096056, clock: 511630009253
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473156: sched_clock_local: now: 511473154566, gtod: 511630004325, delta: 37368, old_clock: 511630009253, clock: 511630041693
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473157: sched_clock_cpu: clock: 511630041693
> Â Â Â Â Â <...>-3889 Â[000] Â 511.473232: sched_clock_local: now: 511473229761, gtod: 511630004325, delta: 112563, old_clock: 511630041693, clock: 511630116888
> Â Â Â Â Â <...>-3889 Â[000] Â 511.473233: sched_clock_cpu: clock: 511630116888
> Â Â Â Â Â <...>-1226 Â[000] Â 511.473251: sched_clock_local: now: 511473249602, gtod: 511630004325, delta: 132404, old_clock: 511630116888, clock: 511630136729
> Â Â Â Â Â <...>-1226 Â[000] Â 511.473252: sched_clock_cpu: clock: 511630136729
> Â Â Â Â Â<idle>-0 Â Â [012] Â 511.473260: sched_clock_local: now: 511473255980, gtod: 511448077716, delta: 181793791, old_clock: 511448095543, clock: 511452077966
> Â Â Â Â Â<idle>-0 Â Â [012] Â 511.473263: sched_clock_cpu: clock: 511452077966
> Â Â Â Â Â <...>-3889 Â[000] Â 511.473266: sched_clock_local: now: 511473263744, gtod: 511630004325, delta: 146546, old_clock: 511630136729, clock: 511630150871
> Â Â Â Â Â <...>-3889 Â[000] Â 511.473267: sched_clock_cpu: clock: 511630150871
> Â Â Â Â Â<idle>-0 Â Â [012] Â 511.473271: sched_clock_tick: gtod: 511630157193, now: 511473269741
> Â Â Â Â Â<idle>-0 Â Â [012] Â 511.473273: sched_clock_local: now: 511473271181, gtod: 511630157193, delta: 1440, old_clock: 511452077966, clock: 511630158633
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473280: sched_clock_local: now: 511473277552, gtod: 511630004325, delta: 160354, old_clock: 511630150871, clock: 511630164679
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473282: sched_clock_cpu: clock: 511630164679
> Â Â Â Â Â <...>-3891 Â[012] Â 511.473355: sched_clock_local: now: 511473352521, gtod: 511630157193, delta: 82780, old_clock: 511630158633, clock: 511630239973
> Â Â Â Â Â <...>-3891 Â[012] Â 511.473356: sched_clock_cpu: clock: 511630239973
> Â Â Â Â Â <...>-1201 Â[012] Â 511.473375: sched_clock_local: now: 511473373553, gtod: 511630157193, delta: 103812, old_clock: 511630239973, clock: 511630261005
> Â Â Â Â Â <...>-1201 Â[012] Â 511.473376: sched_clock_cpu: clock: 511630261005
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473380: sched_clock_local: now: 511473375569, gtod: 511630004325, delta: 258371, old_clock: 511630164679, clock: 511630262696
> Â Â Â Â Â<idle>-0 Â Â [000] Â 511.473383: sched_clock_cpu: clock: 511630262696
> Â Â Â Â Â <...>-3891 Â[012] Â 511.473387: sched_clock_local: now: 511473384172, gtod: 511630157193, delta: 114431, old_clock: 511630261005, clock: 511630271624
> Â Â Â Â Â <...>-3891 Â[012] Â 511.473388: sched_clock_cpu: clock: 511630271624
>
>
> --
> Regards/Gruss,
> Boris.
>
> Advanced Micro Devices GmbH
> Einsteinring 24, 85609 Dornach
> General Managers: Alberto Bozzo, Andrew Bowd
> Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
> Registergericht Muenchen, HRB Nr. 43632
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at Âhttp://vger.kernel.org/majordomo-info.html
> Please read the FAQ at Âhttp://www.tux.org/lkml/
>



--
Only stand for myself
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/