Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

From: Luck, Tony
Date: Wed Mar 03 2021 - 06:41:01 EST

Next message: Chengming Zhou: "[PATCH v2 1/4] psi: Add PSI_CPU_FULL state"
Previous message: Florian Fainelli: "Re: [PATCH net-next v2 3/3] net: phy: broadcom: Allow BCM54210E to configure APD"
In reply to: Aili Yao: "Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned"
Next in thread: Aili Yao: "Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned"

On Fri, Feb 26, 2021 at 10:59:15AM +0800, Aili Yao wrote:
> Hi naoya, tony:
> > >
> > > Idea for what we should do next ... Now that x86 is calling memory_failure()
> > > from user context ... maybe parallel calls for the same page should
> > > be blocked until the first caller completes so we can:
> > > a) know that pages are unmapped (if that happens)
> > > b) all get the same success/fail status
> >
> > One memory_failure() call changes the target page's status and
> > affects all mappings to all affected processes, so I think that
> > (ideally) we don't have to block other threads (letting them
> > early return seems fine). Sometimes memory_failure() fails,
> > but even in such case, PG_hwpoison is set on the page and other
> > threads properly get SIGBUSs with this patch, so I think that
> > we can avoid the worst scenario (like system stall by MCE loop).
> >
> I agree with naoya's point. If we block for this issue, does that change the result
> that the process should be killed? Or is there something else that still needs to be considered?

Ok ... no blocking ... I think someone in this thread suggested
scanning the page tables to make sure the poisoned page had been
unmapped.

There's a walk_page_range() function that does all the work for that.
Just need to supply some callback routines that check whether a
mapping includes the bad PFN and clear the PRESENT bit.
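
To make the callback idea concrete, here is a rough userspace sketch of
the shape of it. This is a toy model, not kernel code and not the RFC
patch: the flat "page table" array, the toy_* names, the PTE layout and
the PRESENT bit position are all made up for illustration. In the kernel
the callback would be hooked up through struct mm_walk_ops (for example
its pte_entry hook) and passed to walk_page_range(), and a real version
would presumably also have to deal with huge mappings and TLB flushing,
which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy PTE layout (assumption): bit 0 = PRESENT, PFN in bits 12 and up. */
#define TOY_PAGE_SHIFT  12
#define TOY_PTE_PRESENT ((uint64_t)1 << 0)

typedef uint64_t toy_pte_t;

static uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte >> TOY_PAGE_SHIFT;
}

static toy_pte_t toy_mk_pte(uint64_t pfn, uint64_t flags)
{
	return (pfn << TOY_PAGE_SHIFT) | flags;
}

/* Private state handed to the callback, like the "private" walk pointer. */
struct toy_walk {
	uint64_t bad_pfn;	/* the poisoned page frame we are looking for */
	size_t cleared;		/* how many mappings we unmapped */
};

typedef int (*toy_pte_entry_fn)(toy_pte_t *pte, size_t index, void *priv);

/* Stand-in for the walker: visit entries [start, end), stop on error. */
static int toy_walk_range(toy_pte_t *table, size_t start, size_t end,
			  toy_pte_entry_fn fn, void *priv)
{
	assert(start <= end);
	for (size_t i = start; i < end; i++) {
		int ret = fn(&table[i], i, priv);

		if (ret)
			return ret;
	}
	return 0;
}

/* Callback: if this entry maps the bad PFN, clear the PRESENT bit. */
static int toy_clear_poisoned_pte(toy_pte_t *pte, size_t index, void *priv)
{
	struct toy_walk *w = priv;

	(void)index;
	if ((*pte & TOY_PTE_PRESENT) && toy_pte_pfn(*pte) == w->bad_pfn) {
		*pte &= ~TOY_PTE_PRESENT;
		w->cleared++;
	}
	return 0;	/* keep walking; the page may be mapped more than once */
}

/* Convenience wrapper: returns the number of mappings that were cleared. */
static size_t toy_unmap_bad_pfn(toy_pte_t *table, size_t n, uint64_t bad_pfn)
{
	struct toy_walk w = { .bad_pfn = bad_pfn, .cleared = 0 };

	toy_walk_range(table, 0, n, toy_clear_poisoned_pte, &w);
	return w.cleared;
}
```

The callback keeps the PFN in the entry and only drops the PRESENT bit,
so a later access faults instead of touching the poisoned page again.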

RFC patch below against v5.12-rc1

-Tony
