Re: [PATCH] proc: allow killing processes via file descriptors

From: Andy Lutomirski
Date: Sun Nov 18 2018 - 20:43:58 EST


On Sun, Nov 18, 2018 at 12:32 PM Daniel Colascione <dancol@xxxxxxxxxx> wrote:
>
> On Sun, Nov 18, 2018 at 12:28 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >> That is, I'm proposing an API that looks like this:
> >>
> >> int process_kill(int procfs_dfd, int signo, const union sigval value)
> >>
> >> If, later, process_kill were to *also* accept process-capability FDs,
> >> nothing would break.
> >
> > Except that this makes it ambiguous to the caller as to whether their current creds are considered. So it would need to be a different syscall or at least a flag. Otherwise a lot of those nice theoretical properties go away.
>
> Sure. A flag might make for better ergonomics.
>
> >> Yes, that's what I have in mind. A siginfo_t is small enough that we
> >> could just store it as a blob allocated off the procfs inode or
> >> something like that without bothering with a shmfs file. You'd be able
> >> to read(2) the exit status as many times as you wanted.
> >
> > I think that, if the syscall in question is read(2), then it should work *once* per struct file. Otherwise running cat on the file would behave very oddly.
>
> Why? The file pointer would work normally.

Can you explain the exact semantics? If I have an fd where read(2)
returns the same 4-byte value every time read(2) is called, then cat
will just return an infinite sequence of the same value. This is not
a complete disaster, but it's not really a good thing.

>
> > Read and poll have the same problem as write: we canât check caps in read or poll either.
>
> Why not? Reading /proc/pid/stat does an access check today and
> conditionally replaces the exit status with zero.

And that's probably a bug. It's at least a giant kludge that we shouldn't copy.

Here is the general rule: the basic operations that are expected to
treat file descriptors as capabilities *must* treat file descriptors
as capabilities, at least for new APIs. This includes read(2),
write(2), and poll(2). We should have an exceedingly good reason to
check current's creds, mm, or anything else about current in those
syscalls.

There is a good reason for this: consider what happens if you type:

sudo >/proc/PID/whatever

or

sudo </proc/PID/whatever