Re: [PATCH v7 2/8] iov_iter: Add a function to extract a page list from an iterator

From: David Hildenbrand
Date: Mon Jan 23 2023 - 09:21:47 EST


On 23.01.23 14:38, David Howells wrote:
David Hildenbrand <david@xxxxxxxxxx> wrote:

That would be the ideal case: whenever intending to access page content, use
FOLL_PIN instead of FOLL_GET.

The issue that John was trying to sort out was that there are plenty of
callsites that do a simple put_page() instead of calling
unpin_user_page(). IIRC, handling that correctly in existing code -- what was
pinned must be released via unpin_user_page() -- was the biggest workitem.

Not sure how that relates to your work here (that's why I was asking): if you
could avoid FOLL_GET, that would be great :)

Well, it simplifies things a bit.

I can make the new iov_iter_extract_pages() just do "pin" or "don't pin" and
do no ref-getting at all. Things can be converted over to "unpin the pages or
do nothing" as they're converted over to using iov_iter_extract_pages()
from iov_iter_get_pages*().

The block bio code then only needs a single bit of state: pinned or not
pinned.
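
A minimal userspace sketch of the "single bit of state" idea described above, to
make the cleanup rule concrete. Everything here is a stand-in: struct page,
struct io_pages, unpin_page_stub() and io_pages_release() are hypothetical
names invented for illustration, not kernel APIs, and the real block-layer
structures are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only counts outstanding pins. */
struct page {
	int pincount;
};

/* Stand-in for unpin_user_page(): drops one pin. */
static void unpin_page_stub(struct page *page)
{
	page->pincount--;
}

/* Hypothetical container for pages extracted from an iterator. */
struct io_pages {
	struct page **pages;
	size_t nr;
	bool pinned;	/* the single bit: were these pages pinned? */
};

/*
 * Cleanup is either "unpin every page" or "do nothing".
 * No page refs are taken in this model, so put_page() never appears.
 */
static void io_pages_release(struct io_pages *io)
{
	size_t i;

	if (!io->pinned)
		return;

	for (i = 0; i < io->nr; i++)
		unpin_page_stub(io->pages[i]);
}
```

The point of the sketch is only that the release path needs no per-page
bookkeeping once ref-getting is gone: one flag set at extraction time decides
the whole cleanup.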

Unfortunately, I'll have to let BIO experts comment on that :) I only know the MM side of things here.


For cifs RDMA, do I need to make it pass in FOLL_LONGTERM? And does that need
a special cleanup?

Anything that holds pins "possibly forever" should do that. vmsplice() is another example that should use it, once it properly uses FOLL_PIN. [FOLL_GET | FOLL_LONGTERM is not really used or defined with clear semantics]


sk_buff fragment handling could still be tricky. I'm thinking that in that
code I'll need to store FOLL_GET/PIN in the bottom two bits of the frag page
pointer. Sometimes it allocates a new page and attaches it (have ref);
sometimes it does zerocopy to/from a page (have pin) and sometimes it may be
pointing to a kernel buffer (don't pin or ref).
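
The "bottom two bits of the frag page pointer" idea could look roughly like
the following userspace sketch. It assumes page pointers are at least 4-byte
aligned, so the low two bits are free. The names frag_cleanup, frag_encode(),
frag_page() and frag_mode() are hypothetical and are not taken from the
sk_buff code.

```c
#include <stdint.h>

struct page;	/* opaque here; only the pointer value matters */

/* How a fragment's page must be released (hypothetical encoding). */
enum frag_cleanup {
	FRAG_NONE = 0,	/* kernel buffer: neither ref nor pin held */
	FRAG_REF  = 1,	/* newly allocated page: ref held, put_page() */
	FRAG_PIN  = 2,	/* zerocopy page: pin held, unpin_user_page() */
};

#define FRAG_MODE_MASK ((uintptr_t)3)

/* Pack the cleanup mode into the low two bits of the page pointer. */
static inline uintptr_t frag_encode(struct page *page, enum frag_cleanup mode)
{
	/* Assumption: the pointer's low two bits are zero (>= 4-byte aligned). */
	return (uintptr_t)page | (uintptr_t)mode;
}

/* Recover the original page pointer by masking off the mode bits. */
static inline struct page *frag_page(uintptr_t v)
{
	return (struct page *)(v & ~FRAG_MODE_MASK);
}

/* Recover the cleanup mode from the low two bits. */
static inline enum frag_cleanup frag_mode(uintptr_t v)
{
	return (enum frag_cleanup)(v & FRAG_MODE_MASK);
}
```

With an encoding like this, the release path would switch on frag_mode() to
choose between doing nothing, dropping a ref, or dropping a pin, and every
other user of the pointer would have to go through frag_page() first.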

David


--
Thanks,

David / dhildenb