Re: [PATCH 3/7] seccomp_filter: Enable ftrace-based system call filtering

From: Will Drewry
Date: Thu Apr 28 2011 - 11:15:13 EST


On Thu, Apr 28, 2011 at 9:29 AM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> On Wed, Apr 27, 2011 at 10:08:47PM -0500, Will Drewry wrote:
>> This change adds a new seccomp mode based on the work by
>> agl@xxxxxxxxxxxxx This mode comes with a bitmask of NR_syscalls size and
>> an optional linked list of seccomp_filter objects. When in mode 2, all
>
> Since you now use the filters. Why not using them to filter syscalls
> entirely rather than using a bitmap of allowed syscalls?

The current approach just uses a linked list of filters. A more
efficient data structure could be used, but the bitmask provides a quick
binary decision and optimizes for the relatively common case where
there won't be many non-binary filters to evaluate. That way we only
walk the list for the more complex predicates, not for the larger
number of plain yes/no decisions. Though that may be a short-sighted
view! I'm happy to change it up.
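
To make the trade-off concrete, here is a rough user-space sketch of
the bitmask-then-list check described above. It is not the code from
the patch: every name in it (sketch_state, sketch_filter, sketch_check,
SKETCH_NR_SYSCALLS, the predicate signature) is made up for
illustration, and it assumes a filter is only consulted once the bit
for that syscall is already set.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for NR_syscalls; the real value is per-arch. */
#define SKETCH_NR_SYSCALLS 512
#define SKETCH_WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_WORDS \
	((SKETCH_NR_SYSCALLS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Hypothetical filter node: one predicate tied to one syscall number. */
struct sketch_filter {
	int nr;
	bool (*match)(const long *args);
	struct sketch_filter *next;
};

struct sketch_state {
	unsigned long allowed[SKETCH_MASK_WORDS]; /* one bit per syscall */
	struct sketch_filter *filters;            /* optional predicates */
};

/* Example predicate (an assumption): allow only when arg 0 is zero. */
static bool sketch_arg0_is_zero(const long *args)
{
	return args[0] == 0;
}

/* Mark syscall nr as allowed; out-of-range numbers are ignored. */
static void sketch_allow(struct sketch_state *s, int nr)
{
	if (nr < 0 || nr >= SKETCH_NR_SYSCALLS)
		return;
	s->allowed[nr / SKETCH_WORD_BITS] |= 1UL << (nr % SKETCH_WORD_BITS);
}

static bool sketch_check(const struct sketch_state *s, int nr,
			 const long *args)
{
	const struct sketch_filter *f;

	if (nr < 0 || nr >= SKETCH_NR_SYSCALLS)
		return false;

	/* Fast path: a clear bit denies without touching the list. */
	if (!(s->allowed[nr / SKETCH_WORD_BITS] &
	      (1UL << (nr % SKETCH_WORD_BITS))))
		return false;

	/* Slow path: every filter registered for nr must match. */
	for (f = s->filters; f; f = f->next)
		if (f->nr == nr && !f->match(args))
			return false;

	return true;
}
```

In this sketch the list walk is still linear in the total number of
filters, which is the cost the bitmask is meant to avoid for denied
calls; indexing filters by syscall nr would be the obvious next step.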

> You have the "nr" field in syscall tracepoints.

I'm not sure I follow. Do you mean moving entirely to the actual
tracepoint infrastructure instead of using the seccomp hooks, or just
looking up the proper filter by syscall nr? If there's a sane and
better way to do the latter, I'm all ears :)

As far as using the tracepoints themselves, I looked at how the
perf/ftrace interactions work. While I could've registered with the
syscall tracepoints for enter and exit, that would mean evaluating the
system call interception later, possibly out of order with respect to
other registered event sinks. There is also complexity in just killing
current from within the notifier-like list of registered syscall
events (as Eric Paris ran into when expanding filtering into perf
itself). To get around that, the tracepoint handler would have to pump
the data somewhere else (like it does for perf), and it just seemed
messy. I think it's doable, but I don't know that the pure syscall
tracepoint infrastructure should be burdened with the added
requirements that come with seccomp filtering. If I didn't properly
understand the code, though, please set me on the right path.

thanks!
will