Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

From: Borislav Petkov
Date: Wed Apr 14 2021 - 09:11:16 EST


On Tue, Apr 13, 2021 at 10:47:21PM -0700, Jue Wang wrote:
> This path is when an EPT #PF finds an access to a hwpoisoned page and
> sends SIGBUS to user space (KVM exits into user space) with the same
> semantics as if a regular #PF had found an access to a hwpoisoned page.
>
> The KVM_X86_SET_MCE ioctl actually injects a machine check into the guest.
>
> We are in the process of launching a KVM-based virtualization product
> with MCE recovery capability and plan to expand the scope of its
> application in the near future.

Any pointers to code or is this all non-public? Any text on what that
product does with the MCEs?

> The in-memory database and analytical domain are definitely using it.
> A couple examples:
> SAP HANA - which we've tested and plan to launch as a strategic
> enterprise use case with MCE recovery capability in our product
> SQL server - https://support.microsoft.com/en-us/help/2967651/inf-sql-server-may-display-memory-corruption-and-recovery-errors

Aha, so they register callbacks for the processes to execute on a memory
error. Good to know, thanks for those.
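
For illustration only, here is a minimal sketch of how a Linux process
might register such a callback: a SIGBUS handler installed with
SA_SIGINFO that checks si_code for BUS_MCEERR_AR (action required) or
BUS_MCEERR_AO (action optional). The enum mce_kind and the names
classify_sigbus(), sigbus_handler() and install_sigbus_handler() are
hypothetical helpers made up for this sketch. Nothing here is taken from
SAP HANA or SQL Server, and the actual recovery step is left out.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only by this sketch. */
enum mce_kind { MCE_NONE = 0, MCE_ACTION_REQUIRED, MCE_ACTION_OPTIONAL };

/* Last event seen by the handler, for the main program to inspect. */
static volatile sig_atomic_t last_mce_kind = MCE_NONE;
static void *volatile last_mce_addr;

/* Map a SIGBUS siginfo to the kind of memory error it reports. */
static enum mce_kind classify_sigbus(const siginfo_t *si)
{
	if (si->si_signo != SIGBUS)
		return MCE_NONE;

	switch (si->si_code) {
	case BUS_MCEERR_AR:	/* poison was consumed by this thread */
		return MCE_ACTION_REQUIRED;
	case BUS_MCEERR_AO:	/* poison was detected asynchronously */
		return MCE_ACTION_OPTIONAL;
	default:		/* ordinary SIGBUS, not a memory error */
		return MCE_NONE;
	}
}

/*
 * Assumption: the handler only records the event. A real application
 * must not simply return after BUS_MCEERR_AR, because the faulting
 * access would be retried; it would have to discard or rebuild the
 * affected data and resume elsewhere (not shown).
 */
static void sigbus_handler(int signo, siginfo_t *si, void *ctx)
{
	(void)signo;
	(void)ctx;
	last_mce_kind = classify_sigbus(si);
	last_mce_addr = si->si_addr;	/* address of the poisoned access */
}

/* Install the handler; returns 0 on success, -1 on error. */
static int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A program would call install_sigbus_handler() once at startup and then
use last_mce_kind and last_mce_addr to decide which data to throw away
or reload.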

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette