Re: [PATCH v2] proc: "mount -o lookup=" support

From: Alexey Dobriyan
Date: Thu Jan 20 2022 - 07:32:35 EST


On Wed, Jan 19, 2022 at 05:24:23PM +0100, Christian Brauner wrote:
> On Wed, Jan 19, 2022 at 06:48:03PM +0300, Alexey Dobriyan wrote:
> > From 61376c85daab50afb343ce50b5a97e562bc1c8d3 Mon Sep 17 00:00:00 2001
> > From: Alexey Dobriyan <adobriyan@xxxxxxxxx>
> > Date: Mon, 22 Nov 2021 20:41:06 +0300
> > Subject: [PATCH 1/1] proc: "mount -o lookup=..." support
> >
> > Docker implements MaskedPaths configuration option
> >
> > https://github.com/estesp/docker/blob/9c15e82f19b0ad3c5fe8617a8ec2dddc6639f40a/oci/defaults.go#L97
> >
> > to disable certain /proc files. It overmounts them with /dev/null.
> >
> > Implement proper mount option which selectively disables lookup/readdir
> > in the top level /proc directory so that MaskedPaths doesn't need
> > to be updated as time goes on.
>
> I might've missed this when this was sent the last time so maybe it was
> clearly explained in an earlier thread: What's the reason this needs to
> live in the kernel?

The reasons are:

- MaskedPaths and its equivalents are blacklists, so they are not
  future-proof.

- MaskedPaths is applied once, at container creation. lookup= is also
  applied once, at mount time, but the names it filters aren't required
  to exist yet. (Read: some silly ISV module gets loaded and creates
  /proc entries. With MaskedPaths, containers get them with all their
  security holes.)
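
To illustrate the difference, here is a toy model in Python. It is not
kernel code: the entry names, the "isv_debug" file and both helper
functions are made up, and it assumes lookup= acts as a list of names
that stay visible.

```python
# Toy model only. All names and helpers below are hypothetical.

MASKED = {"kcore", "keys", "timer_list"}   # denylist, fixed at container creation
ALLOWED = {"cpuinfo", "uptime"}            # allowlist, fixed at mount time


def visible_with_denylist(entries, masked):
    """Everything is visible unless it was named in advance."""
    return sorted(name for name in entries if name not in masked)


def visible_with_allowlist(entries, allowed):
    """Nothing is visible unless it was named in advance."""
    return sorted(name for name in entries if name in allowed)


# Top-level /proc entries when the container starts.
proc = {"cpuinfo", "uptime", "kcore", "keys", "timer_list"}

# Later, a module registers a new top-level entry.
proc_after = proc | {"isv_debug"}

after_deny = visible_with_denylist(proc_after, MASKED)
after_allow = visible_with_allowlist(proc_after, ALLOWED)

print(after_deny)    # the new entry leaks through the denylist
print(after_allow)   # the allowlist never exposes it
```

The denylist has to be updated every time a new name shows up; the
allowlist does not.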

> The MaskedPaths entry is optional so runtimes aren't required to block
> anything by default and this mostly makes sense for workloads that run
> privileged.
>
> In addition MaskedPaths is a generic option which allows to hide any
> existing path, not just proc. Even in the very docker-specific defaults
> /sys/firmware is covered.

Sure, the patch is for /proc only. But MaskedPaths can't overmount a
file with /dev/null if that file doesn't exist yet.

> I do see clear value in the subset= and hidepid= options. They are
> generally useful independent of opinionated container workloads. I don't
> see the same for lookup=.
>
> An alternative I find more sensible is to add a new value for subset=
> that hides anything(?) that only global root should have read/write
> access to.