Re: [PATCH v2 2/4] mm: vmalloc: use rwsem, mutex for vmap_area_lock and vmap_block->lock

From: Lorenzo Stoakes
Date: Sun Mar 19 2023 - 17:18:37 EST


On Sun, Mar 19, 2023 at 08:47:14PM +0000, Matthew Wilcox wrote:
> On Sun, Mar 19, 2023 at 08:29:16PM +0000, Lorenzo Stoakes wrote:
> > The basis for saying asynchronous was based on Documentation/filesystems/vfs.rst
> > describing read_iter() as 'possibly asynchronous read with iov_iter as
> > destination', and read_iter() is what is (now) invoked when accessing
> > /proc/kcore.
> >
> > However I agree this is vague and it is clearer to refer to the fact that we are
> > now directly writing to user memory and thus wish to avoid spinlocks as we may
> > need to fault in user memory in doing so.
> >
> > Would it be ok for you to go ahead and replace that final paragraph with the
> > below?:-
> >
> > The reason for making this change is to build a basis for vread() to write
> > to user memory directly via an iterator; as a result we may cause page
> > faults during which we must not hold a spinlock. Doing this eliminates the
> > need for a bounce buffer in read_kcore() and thus permits that to be
> > converted to also use an iterator, as a read_iter() handler.
>
> I'd say the purpose of the iterator is to abstract whether we're
> accessing user memory, kernel memory or a pipe, so I'd suggest:
>
> The reason for making this change is to build a basis for vread() to
> write to memory via an iterator; as a result we may cause page faults
> during which we must not hold a spinlock. Doing this eliminates the
> need for a bounce buffer in read_kcore() and thus permits that to be
> converted to also use an iterator, as a read_iter() handler.
>

Thanks, sorry I missed the detail about iterators abstacting the three
different targets there, that is definitely better!

> I'm still undecided whether this change is really a good thing. I
> think we have line-of-sight to making vmalloc (and thus kvmalloc)
> usable from interrupt context, and this destroys that possibility.
>
> I wonder if we can't do something like prefaulting the page before
> taking the spinlock, then use copy_page_to_iter_atomic()

There are a number of aspects of vmalloc that are not atomic-safe,
e.g. alloc_vmap_area() and vmap_range_noflush() are designated
might_sleep(), equally vfree().

So I feel that making it safe for atomic context requires a bit more of a
general rework. Given we would be able to revisit lock types at the point
we do that (something that would fit very solidly into the context of any
such change), and given that this patch series establishes that we use an
iterator, I think it is useful to keep this as-is as defer that change
until later.