Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

From: Borislav Petkov
Date: Mon Jan 06 2020 - 02:08:51 EST


On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> On the ICL platform, it is easy to hit a bootup failure with a panic
> in the thermal interrupt handler during the early bootup stage.
>
> This issue leaves my platform almost unable to boot with the
> latest kernel code.
>
> The call stack is like:
> kernel BUG at kernel/timer/timer.c:1152!
>
> Call Trace:
> __queue_delayed_work
> queue_delayed_work_on
> therm_throt_process
> intel_thermal_interrupt
> ...
>
> When a CPU comes up, the irq is enabled prior to the CPU UP
> notification, which then initializes therm_worker.

You mean the unmasking of the thermal vector at the end of
intel_init_thermal()?

If so, why don't you move that to the end of the notifier and unmask it
only after all the necessary work, like setting up the workqueues etc.,
is done, and save yourself adding yet another silly bool?
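
To illustrate the ordering point, here is a rough userspace sketch, not
kernel code. Everything in it (struct cpu_model, model_irq(), the two
bringup_*() helpers) is a hypothetical name made up for the example; it
only models "unmask first, init later" versus "init first, unmask last"
and counts how often the handler would see uninitialized work.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one CPU; these are not kernel structures. */
struct cpu_model {
	bool work_ready;   /* models the per-CPU delayed work being initialized */
	bool unmasked;     /* models the thermal vector being unmasked */
	int  bad_hits;     /* interrupts delivered before the work was ready */
};

/* Models an interrupt arriving now; it is only delivered when unmasked. */
static void model_irq(struct cpu_model *cpu)
{
	if (cpu->unmasked && !cpu->work_ready)
		cpu->bad_hits++;   /* the real handler would queue uninitialized work */
}

/* Ordering from the report: unmask first, initialize the work later. */
static int bringup_unmask_first(void)
{
	struct cpu_model cpu = {0};

	cpu.unmasked = true;
	model_irq(&cpu);           /* window: handler runs before the work is set up */
	cpu.work_ready = true;
	model_irq(&cpu);
	return cpu.bad_hits;
}

/* Suggested ordering: finish all setup, unmask as the very last step. */
static int bringup_unmask_last(void)
{
	struct cpu_model cpu = {0};

	cpu.work_ready = true;
	model_irq(&cpu);           /* still masked, nothing is delivered */
	cpu.unmasked = true;
	model_irq(&cpu);
	return cpu.bad_hits;
}
```

With the unmask as the last step there is no window in which the handler
can run against uninitialized work, so no extra "is it initialized yet"
flag is needed in this model.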

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette