Re: [PATCH RESEND v2] x86/mce: Set PG_hwpoison page flag to avoid the capture kernel panic

From: Borislav Petkov
Date: Tue Oct 10 2023 - 04:29:09 EST


On Thu, Sep 14, 2023 at 11:05:39AM +0800, Zhiquan Li wrote:
> Kdump can exclude the HWPoison page to avoid touching the error page
> again; the prerequisite is that the PG_hwpoison page flag is set.
> However, in some fatal MCE cases there is no opportunity to queue
> a task for calling memory_failure(). As a result, the capture
> kernel touches the error page again and panics.
>
> Add function mce_set_page_hwpoison_now() which marks a page as
> HWPoison before kernel panic() for MCE error, so that the dump
> program can check and skip the error page and prevent the capture
> kernel panic.

This commit message should explain the full scenario, like you did in
your other reply.

Also explain how the poison flag is consumed by the kdump kernel and put
that in the comment below.

> [Tony: Changed TestSetPageHWPoison() to SetPageHWPoison()]
>
> Co-developed-by: Youquan Song <youquan.song@xxxxxxxxx>
> Signed-off-by: Youquan Song <youquan.song@xxxxxxxxx>
> Signed-off-by: Zhiquan Li <zhiquan1.li@xxxxxxxxx>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>

What does Tony's SOB mean here?

If I read it correctly, it says that he is the one sending this patch
now. But you're sending it, so you folks need to read up on SOB chains.

> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>
> ---
> V2 RESEND notes:
> - No changes on this, just rebasing as v6.6-rc1 is out.
> - Added the tag from Naoya.
> Link: https://lore.kernel.org/all/20230719211625.298785-1-tony.luck@xxxxxxxxx/#t
>
> Changes since V1:
> - Revised the commit message as per Naoya's suggestion.
> - Replaced "TODO" comment in code with comments based on mailing list
>   discussion on the lack of value in covering other page types.
> Link: https://lore.kernel.org/all/20230127015030.30074-1-tony.luck@xxxxxxxxx/
> ---
> arch/x86/kernel/cpu/mce/core.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 6f35f724cc14..2725698268f3 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -156,6 +156,22 @@ void mce_unregister_decode_chain(struct notifier_block *nb)
> }
> EXPORT_SYMBOL_GPL(mce_unregister_decode_chain);
>
> +/*
> + * Kdump can exclude the HWPoison page to avoid touching the error page again;
> + * the prerequisite is that the PG_hwpoison page flag is set. However, for some
> + * MCE fatal error cases, there is no opportunity to queue a task
> + * for calling memory_failure(). As a result, the capture kernel panics.
> + * This function marks the page as HWPoison before kernel panic() for MCE.
> + */
> +static void mce_set_page_hwpoison_now(unsigned long pfn)
> +{
> +	struct page *p;
> +
> +	p = pfn_to_online_page(pfn);
> +	if (p)
> +		SetPageHWPoison(p);
> +}

there's no need for that function - just put everything...

> +
> static void __print_mce(struct mce *m)
> {
> 	pr_emerg(HW_ERR "CPU %d: Machine Check%s: %Lx Bank %d: %016Lx\n",
> @@ -286,6 +302,8 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
> 	if (!fake_panic) {
> 		if (panic_timeout == 0)
> 			panic_timeout = mca_cfg.panic_timeout;
> +		if (final && (final->status & MCI_STATUS_ADDRV))
> +			mce_set_page_hwpoison_now(final->addr >> PAGE_SHIFT);

... here, along with the comment.
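
IOW, roughly like the sketch below. It is untested against the real tree
and written as a userspace mock so that it compiles standalone: struct
page, struct mce, the pfn_to_online_page() and SetPageHWPoison()
stand-ins, mock_mem_map, NR_MOCK_PAGES and the mce_panic_mark_poison()
name are all made up for illustration and are not the kernel
definitions, and a PAGE_SHIFT of 12 is assumed. Only the if-block is
what would end up in mce_panic().

```c
/*
 * Userspace mock: the poisoning logic sits directly in the panic path
 * instead of in a helper. Everything except the if-block below is a
 * simplified stand-in, NOT the kernel definition.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT        12            /* assumption: 4K pages */
#define MCI_STATUS_ADDRV  (1ULL << 58)  /* MCi_STATUS: address valid */
#define NR_MOCK_PAGES     16            /* hypothetical mock memory size */

struct page { bool hwpoison; };         /* stand-in for PG_hwpoison */
struct mce  { uint64_t status; uint64_t addr; };

static struct page mock_mem_map[NR_MOCK_PAGES];

/* Stand-in for pfn_to_online_page(): NULL if the pfn has no online page. */
static struct page *pfn_to_online_page(unsigned long pfn)
{
	return pfn < NR_MOCK_PAGES ? &mock_mem_map[pfn] : NULL;
}

/* Stand-in for SetPageHWPoison(). */
static void SetPageHWPoison(struct page *p)
{
	p->hwpoison = true;
}

/* Models only the part of mce_panic() that runs right before panic(). */
static void mce_panic_mark_poison(const struct mce *final)
{
	/*
	 * <the comment explaining the scenario and how the kdump
	 * kernel consumes the flag goes here>
	 */
	if (final && (final->status & MCI_STATUS_ADDRV)) {
		struct page *p = pfn_to_online_page(final->addr >> PAGE_SHIFT);

		if (p)
			SetPageHWPoison(p);
	}
	/* panic(msg) would follow here in the real function. */
}
```

The checks stay the same as in your patch: no final record, no valid
address or no online page for that pfn means nothing gets marked.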

> 		panic(msg);
> 	} else
> 		pr_emerg(HW_ERR "Fake kernel panic: %s\n", msg);
> --

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette