Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()

From: Jan Kara
Date: Wed Jun 27 2018 - 07:54:05 EST


On Wed 27-06-18 13:32:21, Michal Hocko wrote:
> On Tue 26-06-18 18:48:25, Jan Kara wrote:
> > On Tue 26-06-18 15:47:57, Michal Hocko wrote:
> > > On Mon 18-06-18 12:21:46, Dan Williams wrote:
> > > [...]
> > > > I do think we should explore a page flag for pages that are "long
> > > > term" pinned. Michal asked for something along these lines at LSF / MM
> > > > so that the core-mm can give up on pages that the kernel has lost
> > > > lifetime control. Michal, did I capture your ask correctly?
> > >
> > > I am sorry to be late. I didn't ask for a page flag exactly. I've asked
> > > for a way to query whether the pin is temporary or permanent. How that is
> > > achieved is another question. Maybe we have some more spare room after
> > > recent struct page reorganization but I dunno, to be honest. Maybe we
> > > can have an _count offset for these longterm pins. It is not like we are
> > > using the whole ref count space, right?
> >
> > Matthew had an interesting idea to pull pinned pages completely out from
> > any LRU and reuse that space in struct page for pinned refcounts. From some
> > initial investigation (read on elsewhere in this thread) it looks doable. I
> > was considering offsetting in refcount as well but on 32-bit architectures
> > there's not that many bits that I'd be really comfortable with that
> > solution...
>
> I am really slow at following up this discussion. The problem I would
> see with off-lru pages is that this can quickly turn into a weird
> reclaim behavior. Especially when we are talking about a lot of memory.
> It is true that such pages wouldn't be reclaimable directly but we could
> poke them in some way if we see too many of them while scanning LRU.
>
> Not that this is a fundamental block stopper but this is the first thing
> that popped out when thinking about such a solution. Maybe it is a good
> start though.
>
> Apart from that, do we really care about 32b here? Big DIO, IB users
> seem to be 64b only AFAIU.

IMO it is a bad habit to leave an unprivileged-user-triggerable oops in the
kernel even for uncommon platforms...
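
To make the refcount-offset concern concrete, here is a rough userspace
sketch, not kernel code. It assumes a 32-bit signed counter and a made-up
bias of 1024 per pin; struct fake_page, PIN_BIAS and all helper names are
hypothetical and exist only for illustration. It shows how quickly the
counter space runs out once every pin costs a full bias, and why an
unchecked add would be something a user could push into overflow.

```c
#include <stdint.h>

/* Hypothetical bias added to the counter for every pin (assumption). */
#define PIN_BIAS 1024

/* Stand-in for struct page; only the counter matters for this sketch. */
struct fake_page {
	int32_t refcount;
};

/* Take a pin; refuse instead of overflowing the 32-bit counter. */
static int page_try_pin(struct fake_page *p)
{
	if (p->refcount > INT32_MAX - PIN_BIAS)
		return 0;
	p->refcount += PIN_BIAS;
	return 1;
}

/* Drop a pin previously taken with page_try_pin(). */
static void page_unpin(struct fake_page *p)
{
	p->refcount -= PIN_BIAS;
}

/*
 * Heuristic only: PIN_BIAS ordinary references are indistinguishable
 * from a single pin, so this can report false positives.
 */
static int page_maybe_pinned(const struct fake_page *p)
{
	return p->refcount >= PIN_BIAS;
}

/* Upper bound on simultaneous pins of one page before the counter is full. */
static int32_t max_pins(void)
{
	return INT32_MAX / PIN_BIAS;
}
```

With these assumed numbers a single page can take only about two million
pins before the counter is exhausted, so the overflow check in
page_try_pin() is what separates a failed pin from a corrupted refcount.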

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR