Re: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address

From: Jue Wang
Date: Tue Apr 20 2021 - 12:31:18 EST


On Tue, Apr 20, 2021 at 8:48 AM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>
> On Mon, Apr 19, 2021 at 07:03:01PM -0700, Jue Wang wrote:
> > On Tue, 13 Apr 2021 07:43:20 +0900, Naoya Horiguchi wrote:
> >
> > > This patch suggests to do page table walk to find the error virtual
> > > address. If we find multiple virtual addresses in walking, we now can't
> > > determine which one is correct, so we fall back to sending SIGBUS in
> > > kill_me_maybe() without error info as we do now. This corner case needs
> > > to be solved in the future.
> >
> > Instead of walking the page tables, I wonder what about the following idea:
> >
> > When failing to get vaddr, memory_failure just ensures the mapping is removed
> > and an hwpoisoned swap pte is put in place; or the original page is flagged with
> > PG_HWPOISONED and kept in the radix tree (e.g., for SHMEM THP).
>
> To remove the mapping, you need to know the virtual address :-)
I meant that in this case (threads racing to access the same poisoned page),
the page mapping should already have been removed and the hwpoison swap pte
installed by the winner thread?

Other racing threads can rely on the subsequent #PFs to get the
correct SIGBUS with accurate vaddr semantics? Or is the goal to "give
back correct SIGBUS with accurate vaddr on _the first MCE on ANY
thread_"? I wonder if that goal is absolutely necessary, or whether it
can be relaxed a little to take subsequent #PFs into account.
>
> Well, I did try a patch that removed *all* user mappings (switched CR3 to
> swapper_pgdir) and returned to user. Then have the resulting page fault
> report the address. But that didn't work very well.
I'm curious what didn't work well in this case? :-)

>
> > NOTE: no SIGBUS is sent to user space.
> >
> > Then do_machine_check just returns to user space to resume execution, the
> > re-execution will result in a #PF and should land to the exact page fault
> > handling code that generates a SIGBUS with the precise vaddr info:
>
> That's how SRAO (and other races) are supposed to work.
Hmm, I wonder why it doesn't apply to this race.
>
> -Tony