Re: [PATCH 1/7] mm: Add write-protect and clean utilities for address space ranges

From: Thomas HellstrÃm (VMware)
Date: Wed Sep 18 2019 - 13:44:19 EST


On 9/18/19 4:41 PM, Kirill A. Shutemov wrote:
On Wed, Sep 18, 2019 at 02:59:08PM +0200, Thomas HellstrÃm (VMware) wrote:
From: Thomas Hellstrom <thellstrom@xxxxxxxxxx>

Add two utilities to a) write-protect and b) clean all ptes pointing into
a range of an address space.
The utilities are intended to aid in tracking dirty pages (either
driver-allocated system memory or pci device memory).
The write-protect utility should be used in conjunction with
page_mkwrite() and pfn_mkwrite() to trigger write page-faults on page
accesses. Typically one would want to use this on sparse accesses into
large memory regions. The clean utility should be used to utilize
hardware dirtying functionality and avoid the overhead of page-faults,
typically on large accesses into small memory regions.

The added file "as_dirty_helpers.c" is initially listed as maintained by
VMware under our DRM driver. If somebody would like it elsewhere,
that's of course no problem.
After quick glance, it looks a lot as rmap code duplication. Why not
extend rmap_walk() interface instead to cover range of pages?

There appears to exist quite a few pagetable walks in the mm code. "Take 1" of this patch series modified the "apply_to_page_range" interface and used that. But the interface modification was actually what eventually caused Linus to reject the code. While it is entirely possible to do a proper modification following Linus' and Christoph's guidelines, that code doesn't allow for huge pages and populates all page table levels. We will soon probably want to support huge pages and do not want to populate. The number of altered code-paths itself IMO motivates yet another pagetable walk implementation.

The walk code currently resembling the present patch the most is the unmap_mapping_range() implementation.

The rmap_walk() is not very well suited since it operates on a struct page and the code of this patch has no notion of struct pages.

So my thoughts on this is that the interface should in time move towards the code in mm/pagewalk.c. If we eventually have more users of an address-space pagewalk or want to re-implement unmap_mapping_range() using a generic pagewalk, we should move the walk to pagewalk.c and reuse its structures, but implement separate code for the walk since we can't split huge pages and we can't take the mmap_sem. Meanwhile we should keep the code separate in as_dirty_helpers.c


Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Huang Ying <ying.huang@xxxxxxxxx>
Cc: Souptick Joarder <jrdr.linux@xxxxxxxxx>
Cc: "JÃrÃme Glisse" <jglisse@xxxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx

Signed-off-by: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
Reviewed-by: Ralph Campbell <rcampbell@xxxxxxxxxx> #v1
---
MAINTAINERS | 1 +
include/linux/mm.h | 13 +-
mm/Kconfig | 3 +
mm/Makefile | 1 +
mm/as_dirty_helpers.c | 392 ++++++++++++++++++++++++++++++++++++++++++
5 files changed, 409 insertions(+), 1 deletion(-)
create mode 100644 mm/as_dirty_helpers.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c2d975da561f..b596c7cf4a85 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5287,6 +5287,7 @@ T: git git://people.freedesktop.org/~thomash/linux
S: Supported
F: drivers/gpu/drm/vmwgfx/
F: include/uapi/drm/vmwgfx_drm.h
+F: mm/as_dirty_helpers.c
Emm.. No. Core MM functinality cannot belong to random driver.

OK. I'll put it under core MM.

/Thomas