Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU

From: Catalin Marinas
Date: Wed Mar 11 2020 - 13:26:39 EST


On Wed, Mar 11, 2020 at 05:59:53PM +0100, Arnd Bergmann wrote:
> On Wed, Mar 11, 2020 at 3:29 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> > > - Flip TTBR0 on kernel entry/exit, and again during user access.
> > >
> > > This is probably more work to implement than your idea, but
> > > I would hope this has a lower overhead on most microarchitectures
> > > as it doesn't require pinning the pages. Depending on the
> > > microarchitecture, I'd hope the overhead would be comparable
> > > to that of ARM64_SW_TTBR0_PAN.
> >
> > This still doesn't solve the copy_{from,to}_user() case where both
> > address spaces need to be available during copy. So you either pin the
> > user pages in memory and access them via the kernel mapping or you
> > temporarily map (kmap?) the destination/source kernel address. The
> > overhead I'd expect to be significantly greater than ARM64_SW_TTBR0_PAN
> > for the uaccess routines. For user entry/exit, your suggestion is
> > probably comparable with SW PAN.
>
> Good point, that is indeed a larger overhead. The simplest implementation
> I had in mind would use the code from arch/arm/lib/copy_from_user.S and
> flip ttbr0 between each ldm and stm (up to 32 bytes), but I have no idea
> of the cost of storing to ttbr0, so this might be even more expensive. Do you
> have an estimate of how long writing to TTBR0_64 takes on Cortex-A7
> and A15, respectively?

I don't have numbers, but it's usually not cheap since you need an ISB to
synchronise the context after a TTBR0 update (basically flushing the
pipeline).

> Another way might be to use a temporary buffer that is already
> mapped, and add a memcpy() through L1-cache to reduce the number
> of ttbr0 changes. The buffer would probably have to be on the stack,
> which limits the size, but for large copies get_user_pages()+memcpy()
> may end up being faster anyway.

IIRC, the x86 attempt from Ingo some years ago used get_user_pages()
for uaccess. Depending on the size of the buffer, this may be faster
than copying twice.
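
To put rough numbers on the trade-off, here is a userspace model of the
bounce-buffer idea, not kernel code. It only counts how many TTBR0 switches
a copy would need for a given chunk size. The names bounce_copy() and
BOUNCE_MAX are made up for illustration, and the sketch assumes the bounce
buffer is visible under both mappings, as in Arnd's description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound for an on-stack bounce buffer. */
#define BOUNCE_MAX 256

/*
 * Model of a copy through a bounce buffer.
 * Each chunk needs two address-space switches: one to reach the
 * user mapping and one to return to the kernel mapping.
 * Returns the number of switches the copy would have required.
 */
static unsigned long bounce_copy(void *dst, const void *src,
                                 size_t len, size_t chunk)
{
    unsigned char bounce[BOUNCE_MAX];
    unsigned char *d = dst;
    const unsigned char *s = src;
    unsigned long switches = 0;

    assert(dst != NULL && src != NULL);

    /* Clamp the chunk size to what the bounce buffer can hold. */
    if (chunk == 0 || chunk > BOUNCE_MAX)
        chunk = BOUNCE_MAX;

    while (len > 0) {
        size_t n = len < chunk ? len : chunk;

        switches++;             /* switch to the user mapping */
        memcpy(bounce, s, n);   /* user memory -> bounce buffer */
        switches++;             /* switch back to the kernel mapping */
        memcpy(d, bounce, n);   /* bounce buffer -> kernel memory */

        s += n;
        d += n;
        len -= n;
    }
    return switches;
}
```

In this model a 4096-byte copy needs 256 switches with 32-byte chunks (one
ldm/stm pair each) and 32 switches with a 256-byte buffer. It says nothing
about the real cost of each switch or of the extra memcpy().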

--
Catalin