Re: [patch 2/3] MAP_NOZERO - implement sys_brk2()

From: Nicholas Miell
Date: Wed Jun 27 2007 - 18:12:22 EST


On Wed, 2007-06-27 at 11:45 -0700, Davide Libenzi wrote:
> On Wed, 27 Jun 2007, Hugh Dickins wrote:
>
> > On Wed, 27 Jun 2007, Davide Libenzi wrote:
> > > On Wed, 27 Jun 2007, Hugh Dickins wrote:
> > >
> > > > In honesty, I should add that I dislike and distrust Davide's
> > > > MAP_NOZERO very much indeed! Would much rather leave my cpus
> > > > spending a little time in clear_page(). A uid in struct page
> > > > (though I'm sure we could find somewhere to tuck it away) -
> > > > the horror, the horror! But I've so far failed to find a killer
> > > > argument against it, and am hoping for someone else to do so.
> > >
> > > Little time? Please, do not trust me. Start oprofile and run a kernel
> > > build. Look, I'm not even talking about som micro benchmark explicitly
> > > built to exploit the thing. A kernel build.
> > > You will find clear_page to be the *1st* kernel entry after cc1 and as.
> > > That is bad for two reasons. The time it spends in there, and the cache it
> > > blows.
> >
> > I don't doubt that it shows real benefits; but dangerously cutting
> > corners usually shows benefits too. Relying on a uid at this level
> > feels very wrong to me - but as I said, I've not found a killer
> > argument against it.
>
> The reason why I posted is exactly so other ppl can look at it and find
> possible flaws in the way pages and retired. If an effective UID was able
> to see (or it generated) the data on that page, it should be able to get
> that page back uncleared (when VM_NOZERO is set).
> >From a performance POV, a 2-3% boost on a non-micro-bench test like a
> kernel build is not exaclty peanuts. And for more heavy malloc/anon-mmap
> appliations, the boost goes up to 10-15%. That is not exactly what I
> call "little time" ;)
>
> - Davide
>

I don't think the security issues with this will ever make it
worthwhile.

Consider:

1) euid is not sufficient, you need to store away arbitrary LSM
information and call LSM hooks to decide security equivalence. The same
applies to VServer or whatever other container system you use.

2) Two processes, A and B, are in separate VFS namespaces but have
equivalent security identity according to LSM. Process A reads data from
file F which is not visible in process's B's namespace. You have to
prevent process B from ever getting a page that once contained data from
file F.

3) mlock() is often used by programs like GPG to prevent decrypted
secret keys from ever getting swapped out. You need to zero all
once-mlocked pages before they get reused to prevent that page from
getting swapped to disk or application bugs from leaking the key.

--
Nicholas Miell <nmiell@xxxxxxxxxxx>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/