Re: [PATCH v10 3/6] fs/proc/task_mmu: Implement IOCTL to get and/or the clear info about PTEs

From: Nadav Amit
Date: Thu Feb 23 2023 - 12:11:20 EST



> On Feb 22, 2023, at 11:10 PM, Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> wrote:
>
> Hi Nadav, Mike, Michał,
>
> Can you please share your thoughts at [A] below?

I promised I won't talk about the API, but was persuaded to reconsider. I have a
general question regarding the suitablity of currently proposed high-level API.
To explore some alternatives, I'd like to suggest an alternative that may have
some advantages. If these have already been considered and dismissed, feel free
to ignore.

I believe we have two distinct usage scenarios: (1) vectored reads from pagemap,
and (2) atomic UFFD WP-read/protect. It's possible that these require separate
interfaces

Regarding vectored reads, I believe the simplest solution is to maintain the
current pagemap entry format for output and extend it if necessary. The input
can be a vector of ranges. I'm uncertain about the purpose of fields such
as 'anyof_mask' in 'pagemap_scan_arg', so I can't confirm their necessity and
whether the input need to be made. more complicated. There is a possibility
that fields such as 'anyof_mask' might expose internal APIs, so I hope they’re
not required.

For the atomic operation of 'PAGE_IS_WRITTEN' + 'PAGEMAP_WP_ENGAGE', a different
mechanism might be necessary. This function appears to be UFFD-specific.
Instead of the proposed IOCTL, an alternative option is to
use 'UFFD_FEATURE_WP_ASYNC' to log the pages that were written, similar to
page-modification logging on Intel. Since this feature appears to be specific
to UFFD, I believe it would be more appropriate to include the log as part of
the UFFD mechanism rather than the pagemap.

From my experience with UFFD, proper ordering of events is crucial, although it
is not always done well. Therefore, we should aim for improvement, not
regression. I believe that utilizing the pagemap-based mechanism for WP'ing
might be a step in the wrong direction. I think that it would have been better
to emit a 'UFFD_FEATURE_WP_ASYNC' WP-log (and ordered) with UFFD #PF and
events. The 'UFFD_FEATURE_WP_ASYNC'-log may not need to wake waiters on the
file descriptor unless the log is full.

I am sorry that I chime in that late, but I think the complications that the
proposed mechanism might raise are not negligible. And anyhow this patch-set
still requires quite a bit of work before it can be merged.