Re: [PATCH 22/25] x86/mcheck: Do the init in one place

From: Borislav Petkov
Date: Mon Nov 07 2016 - 13:45:41 EST


On Thu, Nov 03, 2016 at 03:50:18PM +0100, Sebastian Andrzej Siewior wrote:
> Part of the init (memory allocation and so on) is done
> in mcheck_cpu_init(). While moving the allocation to
> mcheck_init_device() (where the hotplug calls are initialized) it
> becomes necessary to move the callback (mcheck_cpu_init()), too.
>
> The callback is now removed from identify_cpu() and registered as a
> hotplug event which is invoked as the very first one which is shortly
> after the original point of invocation (look at smp_store_cpu_info() and
> notify_cpu_starting() in smp_callin()).
> One "visible" difference is that MCE for the boot CPU is not enabled at
> identify_boot_cpu() time but at device_initcall_sync() time. Either way,
> both times we had no userland around.

Uh, hm, I'm not sure about this: the issue I see is that the more we
delay the enabling of MCE reporting - and especially setting CR4[MCE] -
the more we widen the window in which an MCE during early boot causes a
shutdown. (This is what happens if CR4[MCE]=0b.)

Perhaps we should split the init into a very early init which doesn't
need to be part of hotplug and the rest, which can do mce_disable_cpu()
and mce_reenable_cpu().
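
To make the split a bit more concrete, here is a rough user-space model
of the idea, not kernel code. The only hardware fact it relies on is
that CR4.MCE is bit 6; the struct, the *_model() function names and the
"banks_enabled" flag are made up for illustration and merely stand in
for the real CR4 write and the per-bank enable/disable that
mce_reenable_cpu()/mce_disable_cpu() would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR4.MCE is bit 6 on x86; everything else here is a user-space model. */
#define CR4_MCE		(UINT64_C(1) << 6)
#define NR_CPUS_MODEL	4

struct cpu_model {
	uint64_t cr4;		/* stand-in for the real control register */
	bool banks_enabled;	/* stand-in for per-bank error reporting */
};

static struct cpu_model cpus[NR_CPUS_MODEL];

/* Hypothetical early stage: no allocation, it only arms CR4[MCE]. */
static void mce_early_init_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	cpus[cpu].cr4 |= CR4_MCE;
}

/* Hypothetical hotplug "online" step, in the spirit of mce_reenable_cpu(). */
static int mce_online_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	cpus[cpu].banks_enabled = true;
	return 0;
}

/* Hypothetical hotplug "offline" step, in the spirit of mce_disable_cpu(). */
static int mce_offline_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	cpus[cpu].banks_enabled = false;
	return 0;
}

/* With CR4[MCE] clear, a machine check would shut the CPU down. */
static bool mce_would_shutdown_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	return !(cpus[cpu].cr4 & CR4_MCE);
}
```

The point of the model: the hotplug steps only ever touch the reporting
state, so taking a CPU offline and online again never reopens the
window in which CR4[MCE] is clear.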

Tony, how do you see this?

> Cc: Tony Luck <tony.luck@xxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx
> Cc: x86@xxxxxxxxxx
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---

...


> @@ -2584,11 +2580,26 @@ static __init int mcheck_init_device(void)
>  		goto err_out;
>  	}
>
> +	err = __mcheck_cpu_mce_banks_init();
	      ^^^^^^^^

I guess you can merge this one...

> +	if (err)
> +		goto err_out_mem;
> +
>  	mce_init_banks();
	^^^^^^^^

into this one now.

But let's sort out the bigger issue first.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.