Re: [PATCH v4 4/4] mm: vmalloc: convert vread() to vread_iter()

From: David Hildenbrand
Date: Wed Mar 22 2023 - 07:19:02 EST


On 21.03.23 21:54, Lorenzo Stoakes wrote:
Having previously laid the foundation for converting vread() to an iterator
function, pull the trigger and do so.

This patch attempts to provide minimal refactoring and to reflect the
existing logic as best we can, for example we continue to zero portions of
memory not read, as before.

Overall, there should be no functional difference other than a performance
improvement in /proc/kcore access to vmalloc regions.

Now we have eliminated the need for a bounce buffer in read_kcore_iter(),
we dispense with it. We need to ensure userland pages are faulted in before
proceeding, as we take spin locks.

Additionally, we must account for the fact that at any point a copy may
fail if this happens, we exit indicating fewer bytes retrieved than
expected.

Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
---
fs/proc/kcore.c | 26 ++---
include/linux/vmalloc.h | 3 +-
mm/nommu.c | 10 +-
mm/vmalloc.c | 234 +++++++++++++++++++++++++---------------
4 files changed, 160 insertions(+), 113 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 25e0eeb8d498..221e16f75ba5 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -307,13 +307,9 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
*i = ALIGN(*i + descsz, 4);
}
-static ssize_t
-read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
+static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
{
- struct file *file = iocb->ki_filp;
- char *buf = file->private_data;
loff_t *ppos = &iocb->ki_pos;
-
size_t phdrs_offset, notes_offset, data_offset;
size_t page_offline_frozen = 1;
size_t phdrs_len, notes_len;
@@ -507,9 +503,12 @@ read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
switch (m->type) {
case KCORE_VMALLOC:
- vread(buf, (char *)start, tsz);
- /* we have to zero-fill user buffer even if no read */
- if (copy_to_iter(buf, tsz, iter) != tsz) {
+ /*
+ * Make sure user pages are faulted in as we acquire
+ * spinlocks in vread_iter().
+ */
+ if (fault_in_iov_iter_writeable(iter, tsz) ||
+ vread_iter(iter, (char *)start, tsz) != tsz) {
ret = -EFAULT;
goto out;
}

What if we race with swapout after faulting the pages in? Or some other mechanism to write-protect the user space pages?

Also, "This is primarily useful when we already know that some or all of the pages in @i aren't in memory". This order of events might slow down things quite a bit if I am not wrong.


Wouldn't you want to have something like:

while (vread_iter(iter, (char *)start, tsz) != tsz) {
if (fault_in_iov_iter_writeable(iter, tsz)) {
ret = -EFAULT;
goto out;
}
}

Or am I missing something?

--
Thanks,

David / dhildenb