Re: [RFC PATCH v2 11/14] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector

From: Peter Zijlstra
Date: Tue Apr 09 2019 - 07:00:21 EST


On Tue, Mar 26, 2019 at 09:49:13PM +0100, Thomas Gleixner wrote:
> So way you should handle this is:
>
> cpumask_set_cpu(cpu, hld_data->cpu_monitored_mask);
>
> if (!hld_data->enabled_cpus++) {
> hld_data->handling_cpu = cpu;
> kick_timer();
> enable_timer();
> }
>
> The cpu mask starts off empty and each CPU sets itself when the function is
> invoked on it.
>
> data->enabled_cpus keeps track of the enabled cpus so you avoid
> reconfiguration just because a different cpu comes online. And it's
> required for disable as well.
>
> > +void hardlockup_detector_hpet_disable(void)
> > +{
> > + struct cpumask *allowed = watchdog_get_allowed_cpumask();
> > +
> > + if (!hld_data)
> > + return;
> > +
> > + /* Only disable the timer if there are no more CPUs to monitor. */
> > + if (!cpumask_weight(allowed))
> > + disable_timer(hld_data);
>
> Again this should be:
>
> cpumask_clear_cpu(cpu, hld_data->cpu_monitored_mask);
> hld_data->enabled_cpus--;
>
> if (hld_data->handling_cpu != cpu)
> return;
>
> disable_timer();
> if (hld_data->enabled_cpus)
> return;

if (!hld_data->enabled_cpus)
return;

>
> hld_data->handling_cpu = cpumask_first(hld_data->cpu_monitored_mask);
> enable_timer();

That said; you can do the above without ->enabled_cpus, by using
->handling_cpu == nr_cpu_ids to indicate 'empty'. But I'm not at all
sure that is worth the effort, it results in less obious code.