Re: [RFC PATCH 00/27] Containers and using authenticated filesystems

From: Eric W. Biederman
Date: Tue Feb 19 2019 - 11:35:45 EST



So you missed the main mailing lists for discussion of this kind of
thing, and the maintainer. So I have reservations about the quality of
your due diligence already.

Looking at your description you are introducing a container id.
You don't describe which namespace your container id lives in.
Without the container id living in a namespace this breaks
nested containers and process migration aka CRIU.
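
To make the concern concrete, here is a minimal sketch of what a namespaced
id could look like. It is not code from the patch set; it only borrows the
per-level numbering idea that pid namespaces use. Every name in it
(cid_ns, container_id, cid_alloc, cid_nr_ns) is hypothetical.

```c
#include <stddef.h>

#define MAX_NS_DEPTH 4

/* Hypothetical namespace that hands out container ids; nests like a pid ns. */
struct cid_ns {
	struct cid_ns *parent;	/* NULL for the initial namespace */
	int level;		/* 0 for the initial namespace */
	int next_id;		/* next id to hand out in this namespace */
};

/* Hypothetical container id: one number per enclosing namespace level. */
struct container_id {
	int level;				/* level of the creating namespace */
	int nr[MAX_NS_DEPTH];			/* nr[i]: id as seen from level i */
	const struct cid_ns *ns[MAX_NS_DEPTH];	/* ns[i]: namespace at level i */
};

void cid_ns_init(struct cid_ns *ns, struct cid_ns *parent)
{
	ns->parent = parent;
	ns->level = parent ? parent->level + 1 : 0;
	ns->next_id = 1;
}

/* Allocate an id in ns and in each of its ancestors; -1 if nested too deep. */
int cid_alloc(struct cid_ns *ns, struct container_id *cid)
{
	if (ns->level >= MAX_NS_DEPTH)
		return -1;
	cid->level = ns->level;
	for (struct cid_ns *p = ns; p; p = p->parent) {
		cid->nr[p->level] = p->next_id++;
		cid->ns[p->level] = p;
	}
	return 0;
}

/* Id as seen from viewer, or 0 if the container is not visible from there. */
int cid_nr_ns(const struct container_id *cid, const struct cid_ns *viewer)
{
	if (viewer->level > cid->level || cid->ns[viewer->level] != viewer)
		return 0;
	return cid->nr[viewer->level];
}
```

With ids scoped like this, a nested manager sees only the numbers from its
own namespace, and a restored container can be handed the same number
inside a fresh namespace without colliding with ids already in use on the
host.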

So, based on your description:

Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>



David Howells <dhowells@xxxxxxxxxx> writes:

> Here's a collection of patches that containerises the kernel keys and makes
> it possible to separate keys by namespace. This can be extended to any
> filesystem that uses request_key() to obtain the pertinent authentication
> token on entry to VFS or socket methods.
>
> I have this working with AFS and AF_RXRPC so far, but it could be extended
> to other filesystems, such as NFS and CIFS.
>
> The following changes are made:
>
> (1) Add optional namespace tags to a key's index_key. This allows the
> following:
>
> (a) Automatic invalidation of all keys with that tag when the
> namespace is removed.
>
> (b) Mixing of keys with the same description, but different areas of
> operation within a keyring.
>
> (c) Sharing of cache keyrings, such as the DNS lookup cache.
>
> (d) Diversion of upcalls based on namespace criteria.
>
> (2) Provide each network namespace with a tag that can be used with (1).
> This is used by the DNS query, rxrpc and nfs idmapper keys.
>
> [!] Note that it might still be better to move these keyrings into the
> network namespace.
>
> (3) Provide key ACLs. These allow:
>
> (a) Permissions to be split more finely, in particular separating
> out Invalidate and Join.
>
> (b) Permits to be granted to non-standard subjects. So, for instance,
> Search permission could be granted to a container object, allowing
> a search of the container keyring by a denizen of the container to
> find a key that they can't otherwise see.
>
> (4) Provide a kernel container object. Currently, this is created with a
> system call and passed flags that indicate the namespaces to be
> inherited or replaced. It might be better to actually use something
> like fsconfig() to configure the container by setting key=val type
> options.
>
> The kernel container object provides the following facilities:
>
> (a) request_key upcall interception. The manager of a container can
> intercept requests made inside the container and, using a series
> of filters, can cause the authkeys to be placed into keyrings that
> serve as queues for one or more upcall processing programs. These
> upcall programs use key notifications to monitor those keyrings.
>
> (b) Per-container keyring. A keyring can be attached to the container
> such that this is searched by a request_key() performed by a
> denizen of the container after searching the thread, process and
> session keyrings. The keyring and the keys contained therein must
> be granted Search for that container.
>
> This allows:
>
> (i) Authenticated filesystems to be used transparently inside of
> the container without any cooperation from the occupant
> thereof. All the key maintenance can be done by the manager.
>
> (ii) Keys to be made available to the denizens of a container (by
> granting extra permissions to the container subject).
>
> (c) Per-container ID that can be used in audit messages.
>
> (d) Container object creation gives the manager a file descriptor that
> can:
>
> (i) Be passed to a dirfd parameter to a VFS syscall, such as
> mkdirat(), allowing an operation to be done inside the
> container.
>
> (ii) Be passed to fsopen()/fsconfig() to indicate that the target
> filesystem is going to be created inside a container, in that
> container's namespaces.
>
> (iii) Be passed to the move_mount() syscall as a destination for
> setting the root filesystem inside a new mount namespace made
> upon container creation.
>
> (e) The ability to configure the container with namespaces or
> whatever, and then fork a process into that container to 'boot'
> it.
>
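
A rough model of point (1) in the list above, for illustration only: once
the namespace tag is part of the index key, two keys with the same type and
description but different tags are distinct entries, and removing a
namespace can invalidate every key carrying its tag. The struct layout and
the function names are assumptions made for this sketch, not the layout
used by the patches.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a namespace tag; identity is by address. */
struct ns_tag {
	bool removed;		/* set when the owning namespace goes away */
};

/* Simplified index key: type + description + optional namespace tag. */
struct index_key {
	const char *type;
	const char *description;
	const struct ns_tag *tag;	/* NULL means untagged */
};

/* Two index keys match only if the tag matches as well. */
bool index_key_equal(const struct index_key *a, const struct index_key *b)
{
	return a->tag == b->tag &&
	       strcmp(a->type, b->type) == 0 &&
	       strcmp(a->description, b->description) == 0;
}

/* A tagged key is treated as invalidated once its namespace is removed. */
bool index_key_is_dead(const struct index_key *k)
{
	return k->tag && k->tag->removed;
}
```

In this model a shared cache keyring can hold one "dns_resolver" entry per
network namespace for the same hostname, which is the mixing described in
(1)(b) and (1)(c).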
>
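
The search order described in (4)(b) can be sketched as follows. This is a
simplified userspace model, not the kernel code: keyrings are flat arrays,
ordinary permission checks are left out, and the only check modelled is the
Search grant to the container on both the keyring and the key. All type and
function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct key {
	const char *description;
	bool container_may_search;	/* Search granted to the container */
};

struct keyring {
	const struct key *keys;
	size_t nr_keys;
	bool container_may_search;	/* Search granted to the container */
};

/* Keyrings consulted on behalf of a denizen; any of them may be NULL. */
struct task_keyrings {
	const struct keyring *thread, *process, *session, *container;
};

static const struct key *keyring_find(const struct keyring *ring,
				      const char *desc, bool via_container)
{
	if (!ring)
		return NULL;
	/* The container keyring itself must grant Search to the container. */
	if (via_container && !ring->container_may_search)
		return NULL;
	for (size_t i = 0; i < ring->nr_keys; i++) {
		const struct key *k = &ring->keys[i];

		/* So must each key reached through it. */
		if (via_container && !k->container_may_search)
			continue;
		if (strcmp(k->description, desc) == 0)
			return k;
	}
	return NULL;
}

/* Thread, process and session keyrings first, container keyring last. */
const struct key *search_for_key(const struct task_keyrings *t,
				 const char *desc)
{
	const struct keyring *own[] = { t->thread, t->process, t->session };

	for (size_t i = 0; i < sizeof(own) / sizeof(own[0]); i++) {
		const struct key *k = keyring_find(own[i], desc, false);

		if (k)
			return k;
	}
	return keyring_find(t->container, desc, true);
}
```

Because the container keyring is consulted last, a key that a denizen
places in its own session keyring takes precedence over one supplied by the
manager.
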
> Three sample programs are provided:
>
> (1) test-container. This:
>
> - Creates a kernel container with a blank mount ns.
> - Creates its root mount and moves it to the container root.
> - Mounts /proc therein.
> - Creates a keyring called "_container"
> - Sets that as the container keyring.
> - Grants Search permission to the container on that keyring.
> - Removes owner permission on that keyring.
> - Creates a sample user key "foobar" in the container keyring.
> - Grants various permissions to the container on that key.
> - Creates a keyring called "upcall"
> - Intercepts "user" key upcalls from the container to there.
> - Forks a process into the container
> - Prints the container keyring ID if it can
> - Exec's bash.
>
> This program expects to be given the device name for a partition it
> can mount as the root and expects it to contain things like /etc,
> /bin, /sbin, /lib, /usr containing programs that can be run and /proc
> to mount procfs upon. E.g.:
>
> ./test-container /dev/sda3
>
> (2) test-upcall. This is a service program that monitors the "upcall"
> keyring created by test-container for authkeys appearing, which it
> then hands off to /sbin/request-key. This:
>
> - Opens /dev/watch_queue.
> - Sets the size to 1 page.
> - Sets a filter to watch for "Link creation" key events.
> - Sets a watch on the upcall keyring.
> - Polls the watch queue for events
> - When an event comes in:
> - Gets the authkey ID from the event buffer.
> - Queries the authkey.
> - Forks off a handler which:
> - Moves the authkey to its thread keyring
> - Sets up a new session keyring with the authkey in it.
> - Execs /sbin/request-key.
>
> This can be run in a shell that shares the session keyring with
> test-container, from which it will find the upcall keyring.
> Alternatively, the keyring ID can be provided on the command line:
>
> ./test-upcall [<upcall-keyring>]
>
> It can be triggered from inside of the container with something like:
>
> keyctl request2 user debug:e a @s
>
> and something like:
>
> ptrs h=4 t=2 m=2000003
> NOTIFY[00000004-00000002] ty=0003 sy=0002 i=01000010
> KEY 78543393 change=2 aux=141053003
> Authentication key 141053003
> - create 779280685
> - uid=0 gid=0
> - rings=0,0,798528519
> - callout='a'
> RQDebug keyid: 779280685
> RQDebug desc: debug:e
> RQDebug callout: a
> RQDebug session keyring: 798528519
>
> will appear on stdout/stderr from it and /sbin/request-key.
>
> (3) test-cont-grant. This is a program to make the nominated key
> available to a container's denizens. It:
>
> - Grants search permission to the nominated key.
> - Links the nominated key into the container keyring.
>
> It can be run from outside of the container like so:
>
> ./test-cont-grant <key> [<container-keyring>]
>
> If the keyring isn't given, it will look for one called "_container"
> in the session keyring where test-container is expected to have placed
> it.
>
> With kAFS, it can be used as follows:
>
> kinit dhowells@xxxxxxxxxx
> kafs-aklog redhat.com
>
> which would log into kerberos and then get a key for accessing an AFS
> cell called "redhat.com". This can be seen in the session keyring by
> calling "keyctl show":
>
> 120378984 --alswrv 0 0 keyring: _ses
> 474754113 ---lswrv 0 65534 \_ keyring: _uid.0
> 64049961 --alswrv 0 0 \_ rxrpc: afs@xxxxxxxxxx
> 78543393 --alswrv 0 0 \_ keyring: upcall
> 661655334 --alswrv 0 0 \_ keyring: _container
> 639103010 --alswrv 0 0 \_ user: foobar
>
> Then doing:
>
> ./test-cont-grant 64049961
>
> will result in:
>
> 120378984 --alswrv 0 0 keyring: _ses
> 474754113 ---lswrv 0 65534 \_ keyring: _uid.0
> 64049961 --alswrv 0 0 \_ rxrpc: afs@xxxxxxxxxxxxxx
> 78543393 --alswrv 0 0 \_ keyring: upcall
> 661655334 --alswrv 0 0 \_ keyring: _container
> 639103010 --alswrv 0 0 \_ user: foobar
> 64049961 --alswrv 0 0 \_ rxrpc: afs@xxxxxxxxxxxxxx
>
> Inside the container, the cell could be mounted:
>
> mount -t afs "%redhat.com:root.cell" /mnt
>
> and then operations in /mnt will be done using the token that has been
> made available. However, this can be overridden locally inside the
> container by doing kinit and kafs-aklog there with a different user.
>
> More to the point, the container manager could mount the container's
> rootfs, say, over authenticated AFS, attach the token to the container
> and then mount the rootfs into the container. The container's
> inhabitant then need not have any means to gain a kerberos login.
>
> [?] I do wonder if the possibility to use container key searches for
> direct mounts should be controlled by a mount option, say:
>
> fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd);
>
> where you have to have the container handle available.
>
> [!] Note that test-cont-grant picks the container by name and does not
> require the container handle when setting the key ACL - but the
> name must come from the set of children of the current container.
>
>
> The patches can be found here also:
>
> http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=container
>
> Note that this is dependent on the mount-api-viro, fsinfo, notifications
> and keys-namespace branches.
>
> David
> ---
> David Howells (27):
> containers: Rename linux/container.h to linux/container_dev.h
> containers: Implement containers as kernel objects
> containers: Provide /proc/containers
> containers: Allow a process to be forked into a container
> containers: Open a socket inside a container
> containers, vfs: Allow syscall dirfd arguments to take a container fd
> containers: Make fsopen() able to create a superblock in a container
> containers, vfs: Honour CONTAINER_NEW_EMPTY_FS_NS
> vfs: Allow mounting to other namespaces
> containers: Provide fs_context op for container setting
> containers: Sample program for driving container objects
> containers: Allow a daemon to intercept request_key upcalls in a container
> keys: Provide a keyctl to query a request_key authentication key
> keys: Break bits out of key_unlink()
> keys: Make __key_link_begin() handle lockdep nesting
> keys: Grant Link permission to possessers of request_key auth keys
> keys: Add a keyctl to move a key between keyrings
> keys: Find the least-recently used unseen key in a keyring.
> containers: Sample: request_key upcall handling
> container, keys: Add a container keyring
> keys: Fix request_key() lack of Link perm check on found key
> KEYS: Replace uid/gid/perm permissions checking with an ACL
> KEYS: Provide KEYCTL_GRANT_PERMISSION
> keys: Allow a container to be specified as a subject in a key's ACL
> keys: Provide a way to ask for the container keyring
> keys: Allow containers to be included in key ACLs by name
> containers: Sample to grant access to a key in a container
>
>
> arch/x86/entry/syscalls/syscall_32.tbl | 3
> arch/x86/entry/syscalls/syscall_64.tbl | 3
> arch/x86/ia32/sys_ia32.c | 2
> certs/blacklist.c | 7
> certs/system_keyring.c | 12
> drivers/acpi/container.c | 2
> drivers/base/container.c | 2
> drivers/md/dm-crypt.c | 2
> drivers/nvdimm/security.c | 2
> fs/afs/security.c | 2
> fs/afs/super.c | 18 +
> fs/cifs/cifs_spnego.c | 25 +
> fs/cifs/cifsacl.c | 28 +
> fs/cifs/connect.c | 4
> fs/crypto/keyinfo.c | 2
> fs/ecryptfs/ecryptfs_kernel.h | 2
> fs/ecryptfs/keystore.c | 2
> fs/fs_context.c | 39 +
> fs/fscache/object-list.c | 2
> fs/fsopen.c | 54 ++
> fs/namei.c | 45 +-
> fs/namespace.c | 129 ++++-
> fs/nfs/nfs4idmap.c | 29 +
> fs/proc/root.c | 20 +
> fs/ubifs/auth.c | 2
> include/linux/container.h | 100 +++-
> include/linux/container_dev.h | 25 +
> include/linux/cred.h | 3
> include/linux/fs_context.h | 5
> include/linux/init_task.h | 1
> include/linux/key-type.h | 2
> include/linux/key.h | 122 +++--
> include/linux/lsm_hooks.h | 20 +
> include/linux/nsproxy.h | 7
> include/linux/pid.h | 5
> include/linux/proc_ns.h | 6
> include/linux/sched.h | 3
> include/linux/sched/task.h | 3
> include/linux/security.h | 15 +
> include/linux/socket.h | 3
> include/linux/syscalls.h | 6
> include/uapi/linux/container.h | 28 +
> include/uapi/linux/keyctl.h | 85 +++
> include/uapi/linux/mount.h | 4
> init/Kconfig | 7
> init/init_task.c | 3
> ipc/mqueue.c | 10
> kernel/Makefile | 2
> kernel/container.c | 532 ++++++++++++++++++++
> kernel/cred.c | 45 ++
> kernel/exit.c | 1
> kernel/fork.c | 111 ++++
> kernel/namespaces.h | 15 +
> kernel/nsproxy.c | 32 +
> kernel/pid.c | 4
> kernel/sys_ni.c | 5
> lib/digsig.c | 2
> net/ceph/ceph_common.c | 2
> net/compat.c | 2
> net/dns_resolver/dns_key.c | 12
> net/dns_resolver/dns_query.c | 15 -
> net/rxrpc/key.c | 16 -
> net/socket.c | 34 +
> samples/vfs/Makefile | 12
> samples/vfs/test-cont-grant.c | 84 +++
> samples/vfs/test-container.c | 382 ++++++++++++++
> samples/vfs/test-upcall.c | 243 +++++++++
> security/integrity/digsig.c | 31 -
> security/integrity/digsig_asymmetric.c | 2
> security/integrity/evm/evm_crypto.c | 2
> security/integrity/ima/ima_mok.c | 13
> security/integrity/integrity.h | 4
> .../integrity/platform_certs/platform_keyring.c | 13
> security/keys/Makefile | 2
> security/keys/compat.c | 20 +
> security/keys/container.c | 419 ++++++++++++++++
> security/keys/encrypted-keys/encrypted.c | 2
> security/keys/encrypted-keys/masterkey_trusted.c | 2
> security/keys/gc.c | 2
> security/keys/internal.h | 34 +
> security/keys/key.c | 35 -
> security/keys/keyctl.c | 176 +++++--
> security/keys/keyring.c | 198 ++++++-
> security/keys/permission.c | 446 +++++++++++++++--
> security/keys/persistent.c | 27 +
> security/keys/proc.c | 17 -
> security/keys/process_keys.c | 102 +++-
> security/keys/request_key.c | 70 ++-
> security/keys/request_key_auth.c | 21 +
> security/security.c | 12
> security/selinux/hooks.c | 16 +
> security/smack/smack_lsm.c | 3
> 92 files changed, 3696 insertions(+), 425 deletions(-)
> create mode 100644 include/linux/container_dev.h
> create mode 100644 include/uapi/linux/container.h
> create mode 100644 kernel/container.c
> create mode 100644 kernel/namespaces.h
> create mode 100644 samples/vfs/test-cont-grant.c
> create mode 100644 samples/vfs/test-container.c
> create mode 100644 samples/vfs/test-upcall.c
> create mode 100644 security/keys/container.c