Re: [PATCH 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

From: Ingo Molnar
Date: Fri Sep 18 2015 - 03:14:13 EST



* Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> This demotes an OOPS and likely panic due to a failed non-"safe" MSR
> access to a WARN_ON_ONCE and a return of poisoned values (in the
> RDMSR case). We still write a pr_info entry unconditionally for
> debugging.
>
> To be clear, this type of failure should *not* happen. This patch
> exists to minimize the chance of nasty undebuggable failures on
> systems that used to work due to a now-fixed CONFIG_PARAVIRT=y bug.

> +	if (opcode == 0x320f) {
> +		/* RDMSR */
> +		pr_info("bad kernel RDMSR from non-existent MSR 0x%x",
> +			(unsigned int)regs->cx);
> +		if (!panic_on_oops) {
> +			WARN_ON_ONCE(true);
> +
> +			/* Patch it up with deterministic poison. */
> +			regs->ax = 0x5aadc0de;
> +			regs->dx = 0x8badf00d;
> +			regs->ip += 2;
> +			return true;

IMHO this should really not poison the result, but use zero as the result.

The poison might randomly indicate a 'present' feature in various registers that
might be accessed in a buggy way. Don't send the code further down into la-la-land
by giving it a 'success'.

And yes, zero can mean success too, but we have to pick a side here ...
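
To illustrate the concern, here is a minimal user-space sketch, not kernel code. The
bit position and both helper names are hypothetical and chosen only for illustration;
they do not describe any real MSR layout. It shows how a caller that tests a single
bit of the RDMSR result would see a feature as 'present' with the poison values from
the patch, and as 'absent' with zero:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit, for illustration only; not a real MSR layout. */
#define FEATURE_ENABLE_BIT 3

/* Combine EDX:EAX into one 64-bit value, the way RDMSR returns its result. */
static uint64_t msr_value(uint32_t eax, uint32_t edx)
{
	return ((uint64_t)edx << 32) | eax;
}

/* What a (buggy) caller might do with the result of a failed RDMSR. */
static bool feature_present(uint64_t val)
{
	return (val >> FEATURE_ENABLE_BIT) & 1;
}
```

With the poison from the patch (ax = 0x5aadc0de, dx = 0x8badf00d) bit 3 happens to be
set, so feature_present() returns true; with ax = dx = 0 it returns false. Whether a
given bit is set in the poison is arbitrary, which is the point.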

The warning will be enough to get these fixed up: people (and in particular distro
testing people) will be watching out for them.

Thanks,

Ingo