Re: [RFC 1/2] x86/tsc: add a timer to make sure tsc_adjust is always checked

From: Feng Tang
Date: Sat Apr 10 2021 - 10:38:21 EST


Hi Thomas,

On Sat, Apr 10, 2021 at 11:27:11AM +0200, Thomas Gleixner wrote:
> On Tue, Mar 30 2021 at 16:25, Feng Tang wrote:
> > Normally the tsc_sync will be checked every time system enters idle state,
> > but there is still caveat that a system won't enter idle, either because
> > it's too busy or configured purposely to not enter idle. Setup a periodic
> > timer to make sure the check is always on.
>
> Bah. I really hate the fact that we don't have a knob to disable writes
> to the TSC/TSC_ADJUST msrs. That would spare this business alltogether.
>
> > +/*
> > + * Normally the tsc_sync will be checked every time system enters idle state,
> > + * but there is still caveat that a system won't enter idle, either because
> > + * it's too busy or configured purposely to not enter idle.
> > + *
> > + * So setup a periodic timer to make sure the check is always on.
> > + */
> > +
> > +#define SYNC_CHECK_INTERVAL (HZ * 600)
> > +static void tsc_sync_check_timer_fn(struct timer_list *unused)
>
> I've surely mentioned this before that glueing a define without an empty
> newline to a function definition is horrible to read.

Got it, will add a newline.

> > +{
> > + int next_cpu;
> > +
> > + tsc_verify_tsc_adjust(false);
> > +
> > + /* Loop to do the check for all onlined CPUs */
>
> I don't see a loop here.

I meant to loop all onlined CPUs, and the comment could be
changed to

/* Run the check for all onlined CPUs in turn */

> > + next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
>
> Why raw_smp_processor_id()? What's wrong with smp_processor_id()?

Will change to smp_processor_id() for this timer function.

> > + if (next_cpu >= nr_cpu_ids)
> > + next_cpu = cpumask_first(cpu_online_mask);
> > +
> > + tsc_sync_check_timer.expires += SYNC_CHECK_INTERVAL;
> > + add_timer_on(&tsc_sync_check_timer, next_cpu);
> > +}
> > +
> > +static int __init start_sync_check_timer(void)
> > +{
> > + if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
> > + return 0;
> > +
> > + timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, 0);
> > + tsc_sync_check_timer.expires = jiffies + SYNC_CHECK_INTERVAL;
> > + add_timer(&tsc_sync_check_timer);
> > +
> > + return 0;
> > +}
> > +late_initcall(start_sync_check_timer);
>
> So right now, if someone adds 'tsc=reliable' on the kernel command line
> then all of the watchdog checking, except for the idle enter TSC_ADJUST
> check is disabled. The NOHZ full people are probably going to be pretty
> unhappy about yet another unconditional timer they have to chase down.
>
> So this needs some more thought.

'tsc=reliable' in cmdline will set 'tsc_clocksource_reliable' to 1, so
we can skip starting this timer if 'tsc_clocksource_reliable==1' ?

Thanks,
Feng

>
> Thanks,
>
> tglx