Re: [PATCH 5/7] seccomp_filter: Document what seccomp_filter is and how it works.

From: Frederic Weisbecker
Date: Wed May 04 2011 - 13:04:09 EST


On Wed, May 04, 2011 at 12:22:40PM -0400, Eric Paris wrote:
> On Wed, 2011-05-04 at 12:06 -0400, Steven Rostedt wrote:
> > On Wed, 2011-05-04 at 11:54 -0400, Eric Paris wrote:
> >
> > > As this is a deny-by-default interface which only allows you to further
> > > restrict, you couldn't add more than 1 syscall if you didn't have an
> > > explicit 'apply' action.
> > >
> > > SECCOMP_FILTER_SET, __NR_foo, "a=0"
> > > SECCOMP_FILTER_SET, __NR_read, "1" == EPERM
> > >
> > > Maybe apply on set is fine after the first apply, but we definitely need
> > > some way to do more than 1 set before the rules are applied....
> >
> > So we could have SET be 'or' and APPLY be 'and'.
> >
> > SECCOMP_FILTER_SET, __NR_foo, "a=0"
> > SECCOMP_FILTER_SET, __NR_read, "1" == EPERM
>
> When I said "== EPERM" I meant that the given prctl call would return
> EPERM. I'm going to pretend that you didn't type it.
>
> > SECCOMP_FILTER_APPLY
> >
> > SECCOMP_FILTER_SET, __NR_foo, "b=0"
> > SECCOMP_FILTER_APPLY
> >
> > Will end up being:
> >
> > (foo: a == 0 || read: "1") && (foo: b == 0)
> >
> > The second set/apply now removes the read option, and foo only works if
> > a is 0 and b is 0.
> >
> > This would also work for children, as they can only restrict (with
> > 'and') and can not add more control.
>
> I think we pretty much agree, although I'm pretty sure that we will have 1
> filter per syscall. So the rules would really be (in your syntax)
>
> Rule1: (foo: a == 0 && b == 0)
> OR
> Rule2: (read: "1")
>
> Although logically the same, it's not just one huge rule. I don't see
> any need for any operation other than an &&. Before the first "set" you
> can add new syscalls. After the first set you can only && onto existing
> syscalls. So the following set of operations:
>
> SECCOMP_FILTER_SET, __NR_foo, "a=0"
> SECCOMP_FILTER_SET, __NR_read, "1"
> SECCOMP_FILTER_APPLY
>
> SECCOMP_FILTER_SET, __NR_foo, "b=0"
> SECCOMP_FILTER_APPLY
>
> SECCOMP_FILTER_SET, __NR_write, "1"
> SECCOMP_FILTER_APPLY
>
> Would return EPERM for the __NR_write entry since it was a new syscall
> after a set. I think we agree on all this.

No, why?

The default filter for a syscall, if none have been given for it, is "0".

Thus, if you write "1" later, the entire filter is going to be:

"0 && 1"

Which is fine, we are not overriding already applied permissions there.

So where is the need to return -EPERM in such a specific case? Is it
worth the corner case to check in the kernel, and to handle in userspace?
And for what reason?
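
To make the argument concrete, here is a toy userspace model of the
semantics I have in mind. It is only a sketch, not the patch's code: the
class SeccompModel, its methods, and the use of Python callables in place
of filter strings are all made up for illustration. It assumes that a
syscall with no filter gets an implicit "0" once the first apply has
happened, and that every later set is and'ed onto what is already there.

```python
# Toy model only: names and structure are hypothetical, not the kernel API.
ALLOW = lambda args: True    # stands in for the filter string "1"
DENY = lambda args: False    # stands in for the default filter "0"


class SeccompModel:
    def __init__(self):
        self.pending = {}    # syscall -> predicates set since the last apply
        self.applied = None  # None until the first apply

    def set(self, syscall, predicate):
        # Never fails: the request is only queued until apply().
        self.pending.setdefault(syscall, []).append(predicate)

    def apply(self):
        if self.applied is None:
            # First apply: the queued filters become the rule set.
            self.applied = {nr: list(p) for nr, p in self.pending.items()}
        else:
            # Later applies: AND onto the existing filter, or onto the
            # implicit "0" when the syscall had no filter at all.
            for nr, preds in self.pending.items():
                self.applied.setdefault(nr, [DENY]).extend(preds)
        self.pending = {}

    def allowed(self, syscall, args):
        if self.applied is None:
            return True      # nothing applied yet, nothing restricted
        preds = self.applied.get(syscall, [DENY])
        return all(p(args) for p in preds)


def replay_thread_example():
    # The sequence of set/apply operations quoted above.
    m = SeccompModel()
    m.set("foo", lambda a: a["a"] == 0)
    m.set("read", ALLOW)
    m.apply()
    m.set("foo", lambda a: a["b"] == 0)
    m.apply()
    m.set("write", ALLOW)    # becomes "0 && 1": accepted, still denied
    m.apply()
    return m
```

In this model the late set on write is accepted, yet write stays denied,
so no special -EPERM check is needed to keep the process from gaining
permissions.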