Re: [PATCH] cpufreq: intel_pstate: Fix for HWP interrupt before driver is ready

From: Rafael J. Wysocki
Date: Tue Sep 07 2021 - 09:41:38 EST


On Mon, Sep 6, 2021 at 9:57 PM Srinivas Pandruvada
<srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2021-09-06 at 20:25 +0200, Rafael J. Wysocki wrote:
> > On Mon, Sep 6, 2021 at 8:14 PM Srinivas Pandruvada
> > <srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, 2021-09-06 at 19:54 +0200, Rafael J. Wysocki wrote:
> > > >
> [...]
>
> > > >
> > > We are handling offline for other thermal interrupt sources from the
> > > same interrupt in therm-throt.c, where we do something similar in the
> > > offline path (by TGLX). If cpufreq offline can cause such an issue of
> > > changing CPU,
> >
> > This is not cpufreq offline, but intel_pstate_update_status() which
> > may be triggered via sysfs. And again, the theoretically problematic
> > thing is dereferencing cpudata (which may be cleared by a remote CPU)
> > from the interrupt handler without protection.
>
> This will be a problem.
>
> >
> > > I can call intel_pstate_disable_hwp_interrupt() via override from
> > > https://elixir.bootlin.com/linux/latest/C/ident/thermal_throttle_offline
> > > after masking APIC interrupt.
> >
> > But why would using RCU be harder than this?
> I think this will require all_cpu_data and cpu_data to be RCU
> protected. This needs to be well tested.
>
> I think it is better to revert the patch for the next release.

Done, thanks!
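
For reference, the concern raised in the thread (an interrupt handler
dereferencing per-CPU data that a remote CPU may clear) can be sketched
roughly as below. This is a userspace C11 analogy only, not the
intel_pstate code and not the reverted patch. All names ending in
_sketch, plus publish_cpudata() and unpublish_cpudata(), are hypothetical
and exist only for illustration. The sketch shows the "load the pointer
once, bail out if it is NULL" half of the pattern; it does not implement
an RCU grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU driver data discussed above. */
struct cpudata_sketch {
	int cpu;
	int hwp_events;
};

#define NR_CPUS_SKETCH 4

/* One slot per CPU; NULL means "driver not ready" or "torn down". */
static _Atomic(struct cpudata_sketch *) all_cpu_data_sketch[NR_CPUS_SKETCH];

/* Publish fully initialised data so the handler may start using it. */
static void publish_cpudata(struct cpudata_sketch *d)
{
	atomic_store_explicit(&all_cpu_data_sketch[d->cpu], d,
			      memory_order_release);
}

/*
 * Clear the slot and hand back the old pointer. Assumption: the caller
 * waits until no handler can still hold the old pointer before freeing
 * it. That wait is what a grace period provides and is NOT shown here.
 */
static struct cpudata_sketch *unpublish_cpudata(int cpu)
{
	return atomic_exchange_explicit(&all_cpu_data_sketch[cpu],
					(struct cpudata_sketch *)NULL,
					memory_order_acq_rel);
}

/*
 * Handler side: read the pointer exactly once and ignore the event if
 * the data is not published. Returns 1 if handled, 0 if ignored.
 */
static int hwp_notify_sketch(int cpu)
{
	struct cpudata_sketch *d =
		atomic_load_explicit(&all_cpu_data_sketch[cpu],
				     memory_order_acquire);

	if (!d)
		return 0;

	d->hwp_events++;
	return 1;
}
```

In the kernel, the corresponding primitives would presumably be
rcu_assign_pointer() on the update side, rcu_dereference() under
rcu_read_lock() on the read side, and synchronize_rcu() before freeing
the old data. Whether that fits all_cpu_data as-is is exactly the open
question above, so treat this as an illustration of the idea rather than
a proposed fix.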