Re: [PATCH v3 03/22] x86: Replace ist_enter() with nmi_enter()

From: Borislav Petkov
Date: Thu Feb 20 2020 - 05:54:50 EST


On Wed, Feb 19, 2020 at 03:47:27PM +0100, Peter Zijlstra wrote:
> @@ -1220,7 +1220,7 @@ static void mce_kill_me_maybe(struct cal
> * MCE broadcast. However some CPUs might be broken beyond repair,
> * so be always careful when synchronizing with others.
> */
> -void do_machine_check(struct pt_regs *regs, long error_code)
> +notrace void do_machine_check(struct pt_regs *regs, long error_code)

Is there a convention for where the notrace marker should go in the
function signature? I see all possible combinations while grepping...
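
For illustration only, here is a minimal userspace sketch of two of the
placements that turn up. It assumes notrace expands to the compiler's
no_instrument_function attribute, which is the common case but not the
only one, and the function names are made up. The compiler accepts both
forms, so the choice looks purely stylistic:

```c
/* Assumption: local stand-in for the kernel's notrace macro (common case). */
#define notrace __attribute__((__no_instrument_function__))

/* Placement 1: marker before the return type, as in the hunk above. */
notrace int marker_first(int x)
{
	return x + 1;
}

/* Placement 2: marker between the return type and the function name. */
int notrace marker_middle(int x)
{
	return x + 1;
}
```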

> {
> DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
> DECLARE_BITMAP(toclear, MAX_NR_BANKS);
> @@ -1254,10 +1254,10 @@ void do_machine_check(struct pt_regs *re
> */
> int lmce = 1;
>
> - if (__mc_check_crashing_cpu(cpu))
> - return;
> + nmi_enter();
>
> - ist_enter(regs);
> + if (__mc_check_crashing_cpu(cpu))
> + goto out;
>
> this_cpu_inc(mce_exception_count);
>

Should that __mc_check_crashing_cpu() happen before nmi_enter()? The
function only does a bunch of checks and clears MSRs for bystander
CPUs...
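
A rough userspace sketch of the ordering I am asking about, just to make
the control flow concrete. Everything prefixed with mock_ is a
hypothetical stand-in, not the real kernel helper; the counters only
model the enter/exit pairing:

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of this is the real kernel code. */
static int nmi_depth;		/* models NMI nesting accounting */
static int handled;		/* counts MCEs that reached the main path */

static void mock_nmi_enter(void) { nmi_depth++; }
static void mock_nmi_exit(void)  { nmi_depth--; }

/* Stand-in for __mc_check_crashing_cpu(): true means "nothing more to do". */
static bool mock_check_crashing_cpu(bool crashing)
{
	return crashing;
}

/* Ordering asked about: check first, enter NMI context only if needed. */
static void mock_do_machine_check(bool crashing)
{
	if (mock_check_crashing_cpu(crashing))
		return;		/* early return, no nmi_exit() to pair */

	mock_nmi_enter();
	handled++;		/* stands in for the rest of the handler */
	mock_nmi_exit();
}
```

With this ordering the early path keeps its plain return and needs no
jump to the exit label. Whether the check is fine running before
nmi_enter() is exactly the open question.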

> @@ -1346,7 +1346,7 @@ void do_machine_check(struct pt_regs *re
> sync_core();
>
> if (worst != MCE_AR_SEVERITY && !kill_it)
> - goto out_ist;
> + goto out;
>
> /* Fault was in user mode and we need to take some action */
> if ((m.cs & 3) == 3) {
> @@ -1362,10 +1362,11 @@ void do_machine_check(struct pt_regs *re
> mce_panic("Failed kernel mode recovery", &m, msg);
> }
>
> -out_ist:
> - ist_exit(regs);
> +out:
> + nmi_exit();
> }
> EXPORT_SYMBOL_GPL(do_machine_check);
> +NOKPROBE_SYMBOL(do_machine_check);

Yah, that's a good idea regardless.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette