Re: [PATCH] x86/mce: Set PG_hwpoison page flag to avoid the capture kernel panic

From: HORIGUCHI NAOYA(堀口 直也)
Date: Sun Jan 29 2023 - 21:32:36 EST


On Thu, Jan 26, 2023 at 05:50:30PM -0800, Tony Luck wrote:
> From: Zhiquan Li <zhiquan1.li@xxxxxxxxx>
>
> Kdump can exclude the HWPosion page to avoid touch the error page
> again, the prerequisite is the PG_hwpoison page flag is set.
> However, for some MCE fatal error cases, there are no opportunity
> to queue a task for calling memory_failure(), as a result,
> the capture kernel touches the error page again and panics.
>
> Add function mce_set_page_hwpoison_now() which mark a page as
> HWPoison before kernel panic() for MCE error, so that the dump
> program can check and skip the error page and prevent the capture
> kernel panic.
>
> [Tony: Changed TestSetPageHWPoison() to SetPageHWPoison()]
>
> Co-developedd-by: Youquan Song <youquan.song@xxxxxxxxx>
> Signed-off-by: Youquan Song <youquan.song@xxxxxxxxx>
> Signed-off-by: Zhiquan Li <zhiquan1.li@xxxxxxxxx>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>

Hi,
Thank you for the patch.

> ---
> arch/x86/kernel/cpu/mce/core.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 2c8ec5c71712..0630999c6311 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -162,6 +162,24 @@ void mce_unregister_decode_chain(struct notifier_block *nb)
> }
> EXPORT_SYMBOL_GPL(mce_unregister_decode_chain);
>
> +/*
> + * Kdump can exclude the HWPosion page to avoid touch the error page again,
> + * the prerequisite is the PG_hwpoison page flag is set. However, for some
> + * MCE fatal error cases, there are no opportunity to queue a task
> + * for calling memory_failure(), as a result, the capture kernel panic.

s/panic/panics/ ?

> + * This function mark the page as HWPoison before kernel panic() for MCE.

s/mark/marks/

> + */
> +static void mce_set_page_hwpoison_now(unsigned long pfn)
> +{
> + struct page *p;
> +
> + /* TODO: need to handle other sort of page, like SGX, PMEM and
> + * HugeTLB pages*/

Although I'm not sure that SGX memory or PMEM pages are expected to be
included in kdump, but simply setting PageHWPoison does not work for them?
(Maybe that depends on how kdump handles these types of memory.)

As for HugeTLB, kdump utility should parse the struct page and be aware of
HugeTLB pages, so maybe setting PageHWPoison on the head page could work.

> + p = pfn_to_online_page(pfn);
> + if (p)
> + SetPageHWPoison(p);
> +}
> +
> static void __print_mce(struct mce *m)
> {
> pr_emerg(HW_ERR "CPU %d: Machine Check%s: %Lx Bank %d: %016Lx\n",
> @@ -292,6 +310,8 @@ static noinstr void mce_panic(const char *msg, struct mce *final, char *exp)
> if (!fake_panic) {
> if (panic_timeout == 0)
> panic_timeout = mca_cfg.panic_timeout;
> + if (final && (final->status & MCI_STATUS_ADDRV))
> + mce_set_page_hwpoison_now(final->addr >> PAGE_SHIFT);
> panic(msg);

I think that setting PageHWPoison outside hwpoison subsystem is OK here,
because this is called just before calling panic() so it's expected to not
conflict with other hwpoison-related code.

Thanks,
Naoya Horiguchi