Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages

From: Andy Lutomirski
Date: Thu Dec 06 2018 - 14:19:54 EST


On Thu, Dec 6, 2018 at 11:01 AM Tycho Andersen <tycho@xxxxxxxx> wrote:
>
> On Thu, Dec 06, 2018 at 10:53:50AM -0800, Andy Lutomirski wrote:
> > > If we are going to unmap the linear alias, why not do it at vmalloc()
> > > time rather than vfree() time?
> >
> > Thatâs not totally nuts. Do we ever have code that expects __va() to
> > work on module data? Perhaps crypto code trying to encrypt static
> > data because our APIs donât understand virtual addresses. I guess if
> > highmem is ever used for modules, then we should be fine.
> >
> > RO instead of not present might be safer. But I do like the idea of
> > renaming Rick's flag to something like VM_XPFO or VM_NO_DIRECT_MAP and
> > making it do all of this.
>
> Yeah, doing it for everything automatically seemed like it was/is
> going to be a lot of work to debug all the corner cases where things
> expect memory to be mapped but don't explicitly say it. And in
> particular, the XPFO series only does it for user memory, whereas an
> additional flag like this would work for extra paranoid allocations
> of kernel memory too.
>

I just read the code, and I looks like vmalloc() is already using
highmem (__GFP_HIGH) if available, so, on big x86_32 systems, for
example, we already don't have modules in the direct map.

So I say we go for it. This should be quite simple to implement --
the pageattr code already has almost all the needed logic on x86. The
only arch support we should need is a pair of functions to remove a
vmalloc address range from the address map (if it was present in the
first place) and a function to put it back. On x86, this should only
be a few lines of code.

What do you all think? This should solve most of the problems we have.

If we really wanted to optimize this, we'd make it so that
module_alloc() allocates memory the normal way, then, later on, we
call some function that, all at once, removes the memory from the
direct map and applies the right permissions to the vmalloc alias (or
just makes the vmalloc alias not-present so we can add permissions
later without flushing), and flushes the TLB. And we arrange for
vunmap to zap the vmalloc range, then put the memory back into the
direct map, then free the pages back to the page allocator, with the
flush in the appropriate place.

I don't see why the page allocator needs to know about any of this.
It's already okay with the permissions being changed out from under it
on x86, and it seems fine. Rick, do you want to give some variant of
this a try?