Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

From: Will Drewry
Date: Thu May 26 2011 - 14:08:16 EST


On Thu, May 26, 2011 at 12:38 PM, <Valdis.Kletnieks@xxxxxx> wrote:
> On Thu, 26 May 2011 12:02:45 CDT, Will Drewry said:
>
>> Absolutely - that was what I meant :/  The patches do not currently
>> check creds at creation or again at use, which would lead to
>> unprivileged filters being used in a privileged context.  Right now,
>> though, if setuid() is not allowed by the seccomp-filter, the process
>> will be immediately killed with do_exit(SIGKILL) on call -- thus
>> avoiding a silent failure.
>
> How do you know you have the bounding set correct?
>
> This has been a long-standing issue for SELinux policy writing - it's usually
> easy to get 95% of the bounding box right (you need these rules for shared
> libraries, you need these rules to access the user's home directory, you need
> these other rules to talk TCP to the net, etc).  There's a nice tool that
> converts any remaining rejection messages into rules you can add to the policy.
>
> The problem is twofold: (a) that way you can never be sure you got *all* the
> rules right and (b) the missing rules are almost always in squirrelly little
> error-handling code that gets invoked once in a blue moon.  So in this case,
> you end up with trying to debug the SIGKILL that happened when the process was
> already in trouble for some other reason...
>
> "Wow. Who would have guessed that program only called gettimeofday() in
> the error handler for when it was formatting its crash message?"
>
> Exactly.

Depending on the need, there is work involved, and there are many ways
to determine your bounding box. It can be very tight -- where you
analyze normal workloads (perf,strace,objdump) and accept the fact
that pathological workloads may result in process death -- or it can
be quite loose and enable most system calls, just not newer ones,
let's say. In practice, you might get bit a few times if you're
overly zealous (I know I have), but it's the difference between
failing open and failing closed. There are some scenarios where you
never, ever want to fail-open even at the cost of process death and
lack of solid insight into a valid failure path.

Hope that makes sense and isn't too general,
will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/