Re: [PATCH] mm: Revert pinned_vm braindamage

From: Peter Zijlstra
Date: Mon Jun 17 2013 - 05:45:45 EST


On Thu, Jun 13, 2013 at 02:06:32PM -0700, Andrew Morton wrote:
> Let's try to get this wrapped up?
>
> On Thu, 6 Jun 2013 14:43:51 +0200 Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> >
> > Patch bc3e53f682 ("mm: distinguish between mlocked and pinned pages")
> > broke RLIMIT_MEMLOCK.
>
> I rather like what bc3e53f682 did, actually. RLIMIT_MEMLOCK limits the
> amount of memory you can mlock(). Nice and simple.
>
> This pinning thing which infiniband/perf are doing is conceptually
> different and if we care at all, perhaps we should be looking at adding
> RLIMIT_PINNED.

We could do that; but I really don't like doing it for the reasons I
outlined previously. It gives the user another knob to twiddle which is
pretty much the same as one he already has just slightly different.

Like said, I see RLIMIT_MEMLOCK to mean the amount of pages the user can
exempt from paging; since that is what the VM cares about most.

> > Before that patch: mm_struct::locked_vm < RLIMIT_MEMLOCK; after that
> > patch we have: mm_struct::locked_vm < RLIMIT_MEMLOCK &&
> > mm_struct::pinned_vm < RLIMIT_MEMLOCK.
>
> But this is a policy decision which was implemented in perf_mmap() and
> perf can alter that decision. How bad would it be if perf just ignored
> RLIMIT_MEMLOCK?

Then it could pin all memory -- seems like something bad.

> drivers/infiniband/hw/qib/qib_user_pages.c has issues, btw. It
> compares the amount-to-be-pinned with rlimit(RLIMIT_MEMLOCK), but
> forgets to also look at current->mm->pinned_vm. Duh.
>
> It also does the pinned accounting in __qib_get_user_pages() but in
> __qib_release_user_pages(), the caller is supposed to do it, which is
> rather awkward.
>
>
> Longer-term I don't think that inifinband or perf should be dinking
> around with rlimit(RLIMIT_MEMLOCK) or ->pinned_vm. Those policy
> decisions should be hoisted into a core mm helper where we can do it
> uniformly (and more correctly than infiniband's attempt!).

Agreed, hence my VM_PINNED proposal that would lift most of that to the
core VM.

I just got really lost in the IB code :/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/