RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

From: Luck, Tony
Date: Wed Mar 03 2021 - 13:25:13 EST


> For error address with sigbus, i think this is not an issue resulted by the patch i post, before my patch, the issue is already there.
> I don't find a realizable way to get the correct address for same reason --- we don't know whether the page mapping is there or not when
> we got to kill_me_maybe(), in some case, we may get it, but there are a lot of parallel issue need to consider, and if failed we have to fallback
> to the error brach again, remaining current code may be an easy option;

My RFC patch from yesterday removes the uncertainty about whether the page is there or not. After it walks the page
tables we know that the poison page isn't mapped (note that patch is RFC for a reason ... I'm 90% sure that it should
do a bit more that just clear the PRESENT bit).

So perhaps memory_failure() has queued a SIGBUS for this task, if so, we take it when we return from kill_me_maybe()

If not, we will return to user mode and re-execute the failing instruction ... but because the page is unmapped we will take a #PF

The x86 page fault handler will see that the page for this physical address is marked HWPOISON, and it will send the SIGBUS
(just like it does if the page had been removed by an earlier UCNA/SRAO error).

At least that's the theory.

-Tony