Re: [PATCH v2] x86/mce: fix wrong no-return-ip logic in do_machine_check()

From: Aili Yao
Date: Mon Feb 22 2021 - 07:54:14 EST


On Mon, 22 Feb 2021 19:21:46 +0800
Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:

> On Mon, 22 Feb 2021 11:22:06 +0100
> Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> > On Mon, Feb 22, 2021 at 06:08:19PM +0800, Aili Yao wrote:
> > > So why would Intel provide this MCG_STATUS_RIPV flag? It would be better
> > > to remove it if it will never be set. Is all the related logic for this
> > > flag really needed?
> >
> > Why would it never be set - of course it will be. You don't set it. If
> > you wanna inject errors, then make sure you inject *valid* errors which
> > the hardware *actually* generates, not some random ones.
> >
>
> As far as I know, most RAS-related tests use faked errors, not real ones, and they are still meaningful.
>
> You had better reproduce the issue I tried to fix, or at least read the code in more detail, and you will
> know whether it's random and invalid.
>
I see this in the Intel SDM (order number 325462):

AR (Action Required) flag, bit 55 - Indicates (when set) that MCA error code specific recovery action must be
performed by system software at the time this error was signaled. This recovery action must be completed
successfully before any additional work is scheduled for this processor.
-------------------
When the RIPV flag in the IA32_MCG_STATUS is clear, an alternative execution stream needs to be provided;
------------------
when the MCA error code specific recovery action cannot be successfully completed,
system software must shut down
the system. When the AR flag in the IA32_MCi_STATUS register is clear, system software may still take MCA
error code specific recovery action but this is optional; system software can safely resume program execution
at the instruction pointer saved on the stack from the machine check exception when the RIPV flag in the
IA32_MCG_STATUS register is set.
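
The rules quoted above can be sketched as a small decision function. This is
only an illustration of the SDM text, not the logic in do_machine_check(): the
enum, the function name classify_mce() and the recovery_ok parameter are
hypothetical. The bit positions (RIPV is bit 0 of IA32_MCG_STATUS, AR is bit 55
of IA32_MCi_STATUS) follow the SDM. The quoted passage does not spell out the
case where both AR and RIPV are clear; the sketch assumes an alternative
execution stream is needed there too, since no valid restart IP exists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as documented in the SDM. */
#define MCG_STATUS_RIPV (1ULL << 0)   /* IA32_MCG_STATUS: restart IP valid */
#define MCI_STATUS_AR   (1ULL << 55)  /* IA32_MCi_STATUS: action required  */

/* Hypothetical outcome names, for illustration only. */
enum mce_outcome {
	MCE_RESUME_SAVED_IP,   /* resume at the IP saved on the #MC stack */
	MCE_ALT_STREAM,        /* saved IP unusable, run something else   */
	MCE_SHUTDOWN,          /* mandatory recovery failed               */
};

/*
 * recovery_ok: whether the MCA error code specific recovery action
 * completed successfully (only mandatory when AR is set).
 */
enum mce_outcome classify_mce(uint64_t mcg_status, uint64_t mci_status,
			      bool recovery_ok)
{
	bool ripv = mcg_status & MCG_STATUS_RIPV;
	bool ar = mci_status & MCI_STATUS_AR;

	/* AR set: recovery is required, and failure means shutdown. */
	if (ar && !recovery_ok)
		return MCE_SHUTDOWN;

	/* RIPV clear: an alternative execution stream must be provided.
	 * For the AR-clear case this is an assumption, see above. */
	if (!ripv)
		return MCE_ALT_STREAM;

	/* RIPV set: safe to resume at the saved instruction pointer. */
	return MCE_RESUME_SAVED_IP;
}
```

For example, classify_mce(0, MCI_STATUS_AR, true) gives MCE_ALT_STREAM: the
recovery action succeeded, but with RIPV clear the saved IP cannot be used.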

Best Regards!
Aili Yao