Re: [PATCH v2] x86/mce: fix wrong no-return-ip logic in do_machine_check()

From: Aili Yao
Date: Tue Feb 23 2021 - 05:00:39 EST


On Tue, 23 Feb 2021 10:43:00 +0100
Borislav Petkov <bp@xxxxxxxxx> wrote:

> On Tue, Feb 23, 2021 at 10:27:55AM +0800, Aili Yao wrote:
> > When the guest accesses an address with a UE error, it exits guest mode
> > and the host does the recovery job. A SIGBUS is then sent to the VCPU
> > and qemu catches the signal. The signal carries only the address and
> > error level, not RIPV, so qemu assumes RIPV is cleared and injects the
> > error into the guest OS.
>
> Lemme see:
>
> void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>
>     /* If we get an action required MCE, it has been injected by KVM
>      * while the VM was running. An action optional MCE instead should
>      * be coming from the main thread, which qemu_init_sigbus identifies
>      * as the "early kill" thread.
>      */
>     assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
>
>     ...
>
>     kvm_mce_inject(cpu, paddr, code);
>
> in that function:
>
>     if (code == BUS_MCEERR_AR) {
>         status |= MCI_STATUS_AR | 0x134;
>         mcg_status |= MCG_STATUS_EIPV;
>     } else {
>         status |= 0xc0;
>         mcg_status |= MCG_STATUS_RIPV;
>     }
>
> That looks like a valid RIP bit to me. Then cpu_x86_inject_mce() gets
> that mcg_status and injects it into the guest.

What I inject is an AR error, and in the BUS_MCEERR_AR branch above I don't
see the MCG_STATUS_RIPV flag being set; only MCG_STATUS_EIPV is.
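
To spell out the branch I mean, here is a minimal standalone sketch of the
flag selection quoted above. It is not the qemu code itself: the SKETCH_
constants and sketch_mcg_status() are hypothetical names for illustration,
the bit positions follow the IA32_MCG_STATUS layout (RIPV bit 0, EIPV bit 1,
MCIP bit 2), and starting from MCIP is my assumption since the quote does
not show how mcg_status is initialised.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SIGBUS si_code values (Linux uses 4 and 5). */
#define SKETCH_BUS_MCEERR_AR 4
#define SKETCH_BUS_MCEERR_AO 5

/* IA32_MCG_STATUS bits, renamed to avoid clashing with real headers. */
#define SKETCH_MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define SKETCH_MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */
#define SKETCH_MCG_STATUS_MCIP (1ULL << 2) /* machine check in progress */

/* Mirror only the mcg_status half of the quoted if/else. */
uint64_t sketch_mcg_status(int code)
{
    /* Assumption: the initial value is MCIP; the quote does not show it. */
    uint64_t mcg_status = SKETCH_MCG_STATUS_MCIP;

    /* Same precondition as the quoted assert(). */
    assert(code == SKETCH_BUS_MCEERR_AR || code == SKETCH_BUS_MCEERR_AO);

    if (code == SKETCH_BUS_MCEERR_AR) {
        /* Action required: EIPV only, RIPV stays clear. */
        mcg_status |= SKETCH_MCG_STATUS_EIPV;
    } else {
        /* Action optional: RIPV is set. */
        mcg_status |= SKETCH_MCG_STATUS_RIPV;
    }
    return mcg_status;
}
```

So with code == BUS_MCEERR_AR the value handed to cpu_x86_inject_mce() has
RIPV clear, and only the AO path sets it.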

Thanks
Aili Yao