Re: [PATCH v7 2/8] iov_iter: Add a function to extract a page list from an iterator

From: David Hildenbrand
Date: Thu Jan 26 2023 - 18:43:23 EST


On 26.01.23 23:15, Al Viro wrote:
On Mon, Jan 23, 2023 at 02:24:13PM +0100, David Hildenbrand wrote:
On 23.01.23 14:19, David Howells wrote:
David Hildenbrand <david@xxxxxxxxxx> wrote:

Switching from FOLL_GET to FOLL_PIN was in the works by John H. Not sure what
the status is. Interestingly, Documentation/core-api/pin_user_pages.rst
already documents that "CASE 1: Direct IO (DIO)" uses FOLL_PIN ... which does,
unfortunately, not reflect reality yet.

Yeah - I just came across that.

Should iov_iter.c then switch entirely to using pin_user_pages(), rather than
get_user_pages()? In which case my patches only need to keep track of
pinned/not-pinned and never "got".

That would be the ideal case: whenever intending to access page content, use
FOLL_PIN instead of FOLL_GET.

The issue that John was trying to sort out was that there are plenty of
callsites that do a simple put_page() instead of calling unpin_user_page().
IIRC, handling that correctly in existing code -- what was pinned must be
released via unpin_user_page() -- was the biggest workitem.

Not sure how that relates to your work here (that's why I was asking): if
you could avoid FOLL_GET, that would be great :)

Take a good look at iter_to_pipe(). It does *not* need to pin anything
(we have an ITER_SOURCE there); with this approach it will. And it
will stuff those pinned references into a pipe, where they can sit
indefinitely.

IOW, I don't believe it's a usable approach.


Not sure what makes you believe that FOLL_GET is any better for this long-term pinning; I'd like to learn about that.

As I already raised elsewhere in this discussion, the right way to take such a long-term pin as vmsplice() does is to use FOLL_PIN|FOLL_LONGTERM. As also raised, that will fix the last remaining vmsplice()+hugetlb COW issue, as tested by the cow.c vm selftest, and make sure to migrate that memory off of MIGRATE_MOVABLE/CMA memory, where we cannot tolerate having long-term unmovable memory sitting around.

--
Thanks,

David / dhildenb