Re: [PATCH net-next 1/2] bpf: enable non-root eBPF programs

From: Ingo Molnar
Date: Tue Oct 06 2015 - 04:21:00 EST



* Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:

> On 10/06/2015 09:13 AM, Ingo Molnar wrote:
> >
> >* Alexei Starovoitov <ast@xxxxxxxxxxxx> wrote:
> >
> >>On 10/5/15 3:14 PM, Daniel Borkmann wrote:
> >>>One scenario that comes to mind ... what happens when there are kernel
> >>>pointers stored in skb->cb[] (either from the current layer or an old
> >>>one from a different layer that the skb went through previously, but
> >>>which did not get overwritten)?
> >>>
> >>>Socket filters could read a portion of skb->cb[] also when unprived and
> >>>leak that out through maps. I think the verifier doesn't catch that,
> >>>right?
> >>
> >>grrr. indeed. previous layer before sk_filter() can leave junk in there.
> >
> >Could this be solved by activating zeroing/sanitizing of this data if there's an
> >active BPF function around that can access that socket?
>
> I think this check could only be done in sk_filter() for testing these
> conditions (unprivileged user + access to cb area), so it would need to
> happen from outside a native eBPF program. :/

Yes, the kernel (with code running outside of any eBPF program) would guarantee
that those data fields are zeroed/sanitized, if there's an eBPF program that is
attached to that socket.

> [...] Also classic BPF would then need to test for it, since a socket filter
> doesn't really know whether native eBPF is loaded there or a classic-to-eBPF
> transformed one, and classic never makes use of this. Anyway, it could be done
> by adding a bit flag cb_access:1 to the bpf_prog, set it during eBPF
> verification phase, and test it inside sk_filter() if I see it correctly.

That could also be done in an unlikely() branch, to keep the cost to the non-eBPF
case near zero.

> The reason is that this sanitizing must only be done in the 'top-level' program
> that is run from sk_filter() _directly_, because a user at any time could decide
> to put an already loaded eBPF fd into a tail call map. And cb[] is then used to
> pass args/state around between two programs, thus it cannot be unconditionally
> cleared from within the program. The association to a socket filter
> (SO_ATTACH_BPF) happens at a later time after a native eBPF program has already
> been loaded via bpf(2).

So zeroing tends to be very cheap and it could also be beneficial to performance
in terms of bringing the cacheline into the CPU cache. But I really don't know the
filter code so I'm just handwaving.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/