Re: [PATCH net-next v2 2/3] net: core: add getsockopt SO_PEERPIDFD

From: Christian Brauner
Date: Wed Mar 22 2023 - 11:35:59 EST


On Tue, Mar 21, 2023 at 07:33:41PM +0100, Alexander Mikhalitsyn wrote:
> Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
> This thing is direct analog of SO_PEERCRED which allows to get plain PID.
>
> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> Cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> Cc: Leon Romanovsky <leon@xxxxxxxxxx>
> Cc: David Ahern <dsahern@xxxxxxxxxx>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Christian Brauner <brauner@xxxxxxxxxx>
> Cc: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> Cc: Lennart Poettering <mzxreary@xxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: netdev@xxxxxxxxxxxxxxx
> Cc: linux-arch@xxxxxxxxxxxxxxx
> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@xxxxxxxxxxxxx>
> ---
> v2:
> According to review comments from Kuniyuki Iwashima and Christian Brauner:
> - use pidfd_create(..) retval as a result
> - whitespace change
> ---
> arch/alpha/include/uapi/asm/socket.h | 1 +
> arch/mips/include/uapi/asm/socket.h | 1 +
> arch/parisc/include/uapi/asm/socket.h | 1 +
> arch/sparc/include/uapi/asm/socket.h | 1 +
> include/uapi/asm-generic/socket.h | 1 +
> net/core/sock.c | 21 +++++++++++++++++++++
> tools/include/uapi/asm-generic/socket.h | 1 +
> 7 files changed, 27 insertions(+)
>
> diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
> index ff310613ae64..e94f621903fe 100644
> --- a/arch/alpha/include/uapi/asm/socket.h
> +++ b/arch/alpha/include/uapi/asm/socket.h
> @@ -138,6 +138,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
> index 762dcb80e4ec..60ebaed28a4c 100644
> --- a/arch/mips/include/uapi/asm/socket.h
> +++ b/arch/mips/include/uapi/asm/socket.h
> @@ -149,6 +149,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
> index df16a3e16d64..be264c2b1a11 100644
> --- a/arch/parisc/include/uapi/asm/socket.h
> +++ b/arch/parisc/include/uapi/asm/socket.h
> @@ -130,6 +130,7 @@
> #define SO_RCVMARK 0x4049
>
> #define SO_PASSPIDFD 0x404A
> +#define SO_PEERPIDFD 0x404B
>
> #if !defined(__KERNEL__)
>
> diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
> index 6e2847804fea..682da3714686 100644
> --- a/arch/sparc/include/uapi/asm/socket.h
> +++ b/arch/sparc/include/uapi/asm/socket.h
> @@ -131,6 +131,7 @@
> #define SO_RCVMARK 0x0054
>
> #define SO_PASSPIDFD 0x0055
> +#define SO_PEERPIDFD 0x0056
>
> #if !defined(__KERNEL__)
>
> diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
> index b76169fdb80b..8ce8a39a1e5f 100644
> --- a/include/uapi/asm-generic/socket.h
> +++ b/include/uapi/asm-generic/socket.h
> @@ -133,6 +133,7 @@
> #define SO_RCVMARK 75
>
> #define SO_PASSPIDFD 76
> +#define SO_PEERPIDFD 77
>
> #if !defined(__KERNEL__)
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 3f974246ba3e..85c269ca9d8a 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1763,6 +1763,27 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
>  		goto lenout;
>  	}
>  
> +	case SO_PEERPIDFD:
> +	{
> +		struct pid *peer_pid;
> +		int pidfd;
> +
> +		if (len > sizeof(pidfd))
> +			len = sizeof(pidfd);
> +
> +		spin_lock(&sk->sk_peer_lock);
> +		peer_pid = get_pid(sk->sk_peer_pid);
> +		spin_unlock(&sk->sk_peer_lock);
> +
> +		pidfd = pidfd_create(peer_pid, 0);
> +
> +		put_pid(peer_pid);
> +
> +		if (copy_to_sockptr(optval, &pidfd, len))
> +			return -EFAULT;

This leaks the pidfd. We could do:

	if (copy_to_sockptr(optval, &pidfd, len)) {
		close_fd(pidfd);
		return -EFAULT;
	}

but it's a nasty anti-pattern to install the fd in the caller's fdtable
and then close it again. So let's avoid it if we can. Since you can only
get one socket option per getsockopt() syscall, we should be able to
reserve an fd and a pidfd_file, do the stuff that might fail, and only
then call fd_install(). So that would roughly be:

	peer_pid = get_pid(sk->sk_peer_pid);
	pidfd_file = pidfd_file_create(peer_pid, 0, &pidfd);
	if (copy_to_sockptr(optval, &pidfd, len))
		return -EFAULT;
	goto lenout;

.
.
.

lenout:
	if (copy_to_sockptr(optlen, &len, sizeof(int)))
		return -EFAULT;

	// Made it safely, install pidfd now.
	fd_install(pidfd, pidfd_file);

(See below for the associated API, which I'm going to publish
independently of this since kernel/fork.c and fanotify could both use it.)
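
To make the ordering concrete, here is a toy userspace model of "reserve,
do the fallible work, then install". It is only a sketch: the toy_* names
and the toy_fdtable struct are made up for illustration and are not kernel
API. They merely mimic the roles of get_unused_fd_flags(), put_unused_fd()
and fd_install(), and the copy_fails flag stands in for copy_to_sockptr()
returning an error.

```c
#include <stdbool.h>

#define TOY_MAX_FDS 8

/* Toy stand-in for a per-process fd table; NOT the kernel's fdtable. */
struct toy_fdtable {
	bool reserved[TOY_MAX_FDS];		/* fd number is taken */
	const char *installed[TOY_MAX_FDS];	/* non-NULL once visible to the caller */
};

/* Reserve an fd number without publishing anything (role of get_unused_fd_flags()). */
static int toy_reserve_fd(struct toy_fdtable *t)
{
	for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
		if (!t->reserved[fd]) {
			t->reserved[fd] = true;
			return fd;
		}
	}
	return -1;
}

/* Give back a reserved but never installed fd (role of put_unused_fd()). */
static void toy_put_unused_fd(struct toy_fdtable *t, int fd)
{
	t->reserved[fd] = false;
}

/* Publish the object under the reserved number; cannot fail (role of fd_install()). */
static void toy_fd_install(struct toy_fdtable *t, int fd, const char *obj)
{
	t->installed[fd] = obj;
}

/* Number of entries the "caller" can actually see. */
static int toy_count_installed(const struct toy_fdtable *t)
{
	int n = 0;

	for (int fd = 0; fd < TOY_MAX_FDS; fd++)
		if (t->installed[fd])
			n++;
	return n;
}

/*
 * Reserve first, run the step that may fail, install only on success.
 * On failure the reservation is dropped and nothing ever became visible.
 */
static int toy_getsockopt_pidfd(struct toy_fdtable *t, bool copy_fails, int *out)
{
	int fd = toy_reserve_fd(t);

	if (fd < 0)
		return -1;

	if (copy_fails) {		/* simulated copy_to_sockptr() failure */
		toy_put_unused_fd(t, fd);
		return -1;
	}

	*out = fd;
	toy_fd_install(t, fd, "pidfd");	/* point of no return */
	return 0;
}
```

The point of the model is that the failure path never has to undo an
install: until toy_fd_install() runs, the table has nothing in it that the
caller could have observed.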

But looking at net/socket.c, there's another wrinkle. Say you have
successfully installed the pidfd; it seems you can still fail later:

	if (level == SOL_SOCKET)
		err = sock_getsockopt(sock, level, optname, optval, optlen);
	else if (unlikely(!sock->ops->getsockopt))
		err = -EOPNOTSUPP;
	else
		err = sock->ops->getsockopt(sock, level, optname, optval,
					    optlen);

	if (!in_compat_syscall())
		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
						     optval, optlen, max_optlen,
						     err);

out_put:
	fput_light(sock->file, fput_needed);
	return err;

If the bpf hook returns an error, the syscall fails even though we've
already installed an fd in the caller's fdtable and placed it in their
sockopt buffer without their knowledge.
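
For reference, this is roughly what a conventional userspace caller would
look like, which is why an fd installed before a late failure is
effectively lost. It is a sketch under assumptions: get_peer_pidfd() is a
hypothetical helper name, and the fallback value 77 is the asm-generic
number from this patch (parisc and sparc use different values).

```c
#include <sys/socket.h>

/*
 * Fallback for headers that predate the option. 77 is the asm-generic
 * value proposed in this patch; this is an assumption that does not hold
 * on parisc or sparc, which use their own numbering.
 */
#ifndef SO_PEERPIDFD
#define SO_PEERPIDFD 77
#endif

/*
 * Hypothetical helper: return a pidfd for the peer of @sock, or -1 with
 * errno set by getsockopt().
 */
static int get_peer_pidfd(int sock)
{
	int pidfd = -1;
	socklen_t len = sizeof(pidfd);

	/*
	 * On error the buffer is never looked at. If the kernel had already
	 * installed an fd at this point, nobody would ever close it.
	 */
	if (getsockopt(sock, SOL_SOCKET, SO_PEERPIDFD, &pidfd, &len) < 0)
		return -1;

	return pidfd;
}
```

A caller would use this on a connected AF_UNIX socket and close() the
result when done. Kernels without the option are expected to fail the
call, in which case the helper returns -1.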