Re: Remap_file_pages, RSS limits, security implications (was: Re: [uml-devel] Re: [RFC] [patch 0/18] remap_file_pages protection support (for UML), try 3)

From: Blaisorblade
Date: Wed Sep 21 2005 - 11:18:20 EST


On Wednesday 21 September 2005 17:16, Hugh Dickins wrote:
> On Tue, 20 Sep 2005, Blaisorblade wrote:
> > On Wednesday 07 September 2005 14:00, Hugh Dickins wrote:

> > Ahh, ok... VM_MAYSHARE is the recorded MAP_SHARED, while VM_SHARED says
> > whether the pages are actually shared and writable.

> That's it (let's say "potentially writable" - VM_SHARED says they're
> shared and already writable or shared and could be mprotected writable).

> It is confusing, but it's been that way for years, and once you grasp it,
> you like to keep it that way, so as to feel superior to everyone else ;)

> > > Though thinking through that again now, the user of the nonlinear vma
> > > is penalized,

> > Where? Not in the page fault path.... it's as penalized as the rest of
> > the system. Or will direct reclaim have a preference for pages of the
> > calling process?

> Not in the page fault path, no; and no, direct reclaim doesn't have that
> preference (sounds quite a good idea, but would need a different way of
> choosing pages to free, and would conflict with the swap token idea?).

> I meant in reclaim: that since try_to_unmap_cluster makes such poor
> choices of what to reclaim, and tries to reclaim more than asked because
> it's unable to target the page in question, I'd expect the user of a
> nonlinear vma to suffer more thrashing under memory pressure.

Other pages in the VMA may be unmapped, yes, but not freed. In fact, they're
kept in by the pagecache reference; try_to_unmap() (or better its caller,
shrink_list) will only actually free the page it asked for.

The only real "problem" is that we do ptep_clear_flush_young without
activating the page. And yes, *this* may penalize who holds a nonlinear VMA.
But this is probably fair, given that we're going to have trouble in freeing
those pages.

> > So, it would really be better to actually enforce the RSS rlimit when
> > mapping in pages in *nonlinear* areas (and fallback on setting file PTE's
> > like on NONBLOCK & page not in cache), rather than the "current" Rik's
> > idea of marking pages as inactive on memory-hog processes.

> I'd go mad if I delved into these issues, luckily we have suckers like Rik
> and Nick and Con who are prepared to give their lives to such endless
> tuning.

> > But oh, right in mm/trash.c, the code which should do part of this is
> > fully commented out - and it was in the very first version of the code
> > (looking through bkcvs-git repository).

> mm/trash.c? I got quite excited,
What would that have meant?
> but now it looks like you just mean
> mm/thrash.c.
Yes.
> Yawn.

--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade





___________________________________
Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB
http://mail.yahoo.it
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/