Re: [PATCH v9 07/10] mm: Device exclusive memory access

From: Andrew Morton
Date: Mon May 24 2021 - 18:12:09 EST


On Mon, 24 May 2021 23:27:22 +1000 Alistair Popple <apopple@xxxxxxxxxx> wrote:

> Some devices require exclusive write access to shared virtual
> memory (SVM) ranges to perform atomic operations on that memory. This
> requires CPU page tables to be updated to deny access whilst atomic
> operations are occurring.
>
> In order to do this, introduce a new swap entry type
> (SWP_DEVICE_EXCLUSIVE). When an SVM range needs to be marked for
> exclusive access by a device, all page table mappings for the particular
> range are replaced with device exclusive swap entries. This causes any
> CPU access to the page to result in a fault.
>
> Faults are resolved by replacing the faulting entry with the original
> mapping. This results in MMU notifiers being called which a driver uses
> to update access permissions such as revoking atomic access. After
> notifiers have been called the device will no longer have exclusive
> access to the region.
>
> Walking of the page tables to find the target pages is handled by
> get_user_pages() rather than a direct page table walk. A direct page
> table walk similar to what migrate_vma_collect()/unmap() does could also
> have been utilised. However this resulted in more code similar in
> functionality to what get_user_pages() provides as page faulting is
> required to make the PTEs present and to break COW.
>
> ...
>
>  Documentation/vm/hmm.rst     |  17 ++++
>  include/linux/mmu_notifier.h |   6 ++
>  include/linux/rmap.h         |   4 +
>  include/linux/swap.h         |   7 +-
>  include/linux/swapops.h      |  44 ++++++++-
>  mm/hmm.c                     |   5 +
>  mm/memory.c                  | 128 +++++++++++++++++++++++-
>  mm/mprotect.c                |   8 ++
>  mm/page_vma_mapped.c         |   9 +-
>  mm/rmap.c                    | 186 +++++++++++++++++++++++++++++++++++
>  10 files changed, 405 insertions(+), 9 deletions(-)
>

This is quite a lot of code added to core MM for a single driver.

Is there any expectation that other drivers will use this code?

Is there a way of reducing the impact (code size, at least) for systems
which don't need this code?

How beneficial is this code to nouveau users? I see that it permits a
part of OpenCL to be implemented, but how useful/important is this in
the real world?

Thanks.