Re: [PATCH v1 2/2] signal: add procfd_signal() syscall

From: Aleksa Sarai
Date: Mon Nov 19 2018 - 16:24:01 EST


On 2018-11-20, Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> On 2018-11-19, Christian Brauner <christian@xxxxxxxxxx> wrote:
> > On Tue, Nov 20, 2018 at 07:28:57AM +1100, Aleksa Sarai wrote:
> > > On 2018-11-19, Christian Brauner <christian@xxxxxxxxxx> wrote:
> > > > + if (info) {
> > > > + ret = __copy_siginfo_from_user(sig, &kinfo, info);
> > > > + if (unlikely(ret))
> > > > + goto err;
> > > > + /*
> > > > + * Not even root can pretend to send signals from the kernel.
> > > > + * Nor can they impersonate a kill()/tgkill(), which adds
> > > > + * source info.
> > > > + */
> > > > + ret = -EPERM;
> > > > + if ((kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL) &&
> > > > + (task_pid(current) != pid))
> > > > + goto err;
> > > > + } else {
> > > > + prepare_kill_siginfo(sig, &kinfo);
> > > > + }
> > >
> > > I wonder whether we should also have a pidns restriction here, since
> > > currently it isn't possible for a container process using a pidns to
> > > signal processes outside its pidns. AFAICS, this isn't done through an
> > > explicit check -- it's a side-effect of processes in a pidns not being
> > > able to address non-descendant-pidns processes.
> > >
> > > But maybe it's reasonable to allow sending a procfd to a different pidns
> > > and the same operations working on it? If we extend the procfd API to
> >
> > No, I don't think so. I really don't want any fancy semantics in here.
> > Fancy doesn't get merged and fancy is hard to maintain. So we should do
> > something like:
> >
> > if (proc_pid_ns() != current_pid_ns)
> > return EINVAL
>
> This isn't quite sufficient. The key thing is that you have to be in an
> *ancestor* (or same) pidns, not the *same* pidns. Ideally you can re-use
> the check already in pidns_get_parent, and expose it. It would be
> something as trivial as:
>
> bool pidns_is_descendant(struct pid_namespace *ns,
> struct pid_namespace *ancestor)
> {
> for (;;) {
> if (!ns)
> return false;
> if (ns == ancestor)
> break;
> ns = ns->parent;
> }
> return true;
> }
>
> And you can rewrite pidns_get_parent to use it. So you would instead be
> doing:
>
> if (pidns_is_descendant(proc_pid_ns, task_active_pid_ns(current)))
> return -EPERM;

Scratch the last bit, -EPERM is wrong here. I would argue that -EINVAL
is *somewhat* wrong because arguable the more semantically consistent
error (with kill(2)) would be -ESRCH -- but then you're mixing the "pid
is dead" and "pid is not visible to you" cases. I'm not sure what the
right errno would be here (I'm sure some of the LKML greybeards will
have a better clue.) :P

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description: PGP signature