Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

From: Jan Kara
Date: Wed Dec 05 2018 - 06:16:55 EST


On Tue 04-12-18 13:56:36, John Hubbard wrote:
> On 12/4/18 12:28 PM, Dan Williams wrote:
> > On Mon, Dec 3, 2018 at 4:17 PM <john.hubbard@xxxxxxxxx> wrote:
> >>
> >> From: John Hubbard <jhubbard@xxxxxxxxxx>
> >>
> >> Introduces put_user_page(), which simply calls put_page().
> >> This provides a way to update all get_user_pages*() callers,
> >> so that they call put_user_page(), instead of put_page().
> >>
> >> Also introduces put_user_pages(), and a few dirty/locked variations,
> >> as a replacement for release_pages(), and also as a replacement
> >> for open-coded loops that release multiple pages.
> >> These may be used for subsequent performance improvements,
> >> via batching of pages to be released.
> >>
> >> This is the first step of fixing the problem described in [1]. The steps
> >> are:
> >>
> >> 1) (This patch): provide put_user_page*() routines, intended to be used
> >> for releasing pages that were pinned via get_user_pages*().
> >>
> >> 2) Convert all of the call sites for get_user_pages*(), to
> >> invoke put_user_page*(), instead of put_page(). This involves dozens of
> >> call sites, and will take some time.
> >>
> >> 3) After (2) is complete, use get_user_pages*() and put_user_page*() to
> >> implement tracking of these pages. This tracking will be separate from
> >> the existing struct page refcounting.
> >>
> >> 4) Use the tracking and identification of these pages, to implement
> >> special handling (especially in writeback paths) when the pages are
> >> backed by a filesystem. Again, [1] provides details as to why that is
> >> desirable.
> >
> > I thought at Plumbers we talked about using a page bit to tag pages
> > that have had their reference count elevated by get_user_pages()? That
> > way there is no need to distinguish put_page() from put_user_page() it
> > just happens internally to put_page(). At the conference Matthew was
> > offering to free up a page bit for this purpose.
> >
>
> ...but then, upon further discussion in that same session, we realized that
> that doesn't help. You need a reference count. Otherwise a random put_page
> could affect your dma-pinned pages, etc, etc.

Exactly.

> I was not able to actually find any place where a single additional page
> bit would help our situation, which is why this still uses LRU fields for
> both the two bits required (the RFC [1] still applies), and the dma_pinned_count.

So single page bit could help you with performance. In 99% of cases there's
just one reference from GUP. So if you could store that info in page flags,
you could safe yourself a relatively expensive removal from LRU and putting
it back to make space in struct page for proper refcount. But since you
report that the performance isn't that horrible, I'd leave this idea on a
backburner. We can always implement it later in case we find in future we
need to improve the performance.

Honza

--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR