Re: [PATCH bpf-next v4 3/8] bpf: lsm: provide attachment points for BPF LSM programs

From: KP Singh
Date: Tue Feb 25 2020 - 14:29:21 EST


On 24-Feb 13:41, Kees Cook wrote:
> On Mon, Feb 24, 2020 at 10:45:27AM -0800, Casey Schaufler wrote:
> > On 2/24/2020 9:13 AM, KP Singh wrote:
> > > On 24-Feb 08:32, Casey Schaufler wrote:
> > >> On 2/23/2020 2:08 PM, Alexei Starovoitov wrote:
> > >>> On Fri, Feb 21, 2020 at 08:22:59PM -0800, Kees Cook wrote:
> > >>>> If I'm understanding this correctly, there are two issues:
> > >>>>
> > >>>> 1- BPF needs to be run last due to fexit trampolines (?)
> > >>> no.
> > >>> The placement of nop call can be anywhere.
> > >>> BPF trampoline is automagically converting nop call into a sequence
> > >>> of directly invoked BPF programs.
> > >>> No link list traversals and no indirect calls in run-time.
> > >> Then why the insistence that it be last?
> > > I think this came out of the discussion about not being able to
> > > override the other LSMs and introduce a new attack vector with some
> > > arguments discussed at:
> > >
> > > https://lore.kernel.org/bpf/20200109194302.GA85350@xxxxxxxxxx/
> > >
> > > Let's say we have SELinux + BPF runnng on the system. BPF should still
> > > respect any decisions made by SELinux. This hasn't got anything to
> > > do with the usage of fexit trampolines.
> >
> > The discussion sited is more about GPL than anything else.
> >
> > The LSM rule is that any security module must be able to
> > accept the decisions of others. SELinux has to accept decisions
> > made ahead of it. It always has, as LSM checks occur after
> > "traditional" checks, which may fail. The only reason that you
> > need to be last in this implementation appears to be that you
> > refuse to use the general mechanisms. You can't blame SELinux
> > for that.
>
> Okay, this is why I wanted to try to state things plainly. The "in last
> position" appears to be the result of a couple design choices:
>
> -the idea of "not wanting to get in the way of other LSMs", while
> admirable, needs to actually be a non-goal to be "just" a stacked LSM
> (as you're saying here Casey). This position _was_ required for the
> non-privileged LSM case to avoid security implications, but that goal
> not longer exists here either.
>
> -optimally using the zero-cost call-outs (static key + fexit trampolines)
> meant it didn't interact well with the existing stacking mechanism.
>
> So, fine, these appear to be design choices, not *specifically*
> requirements. Let's move on, I think there is more to unpack here...
>
> > >>>> 2- BPF hooks don't know what may be attached at any given time, so
> > >>>> ALL LSM hooks need to be universally hooked. THIS turns out to create
> > >>>> a measurable performance problem in that the cost of the indirect call
> > >>>> on the (mostly/usually) empty BPF policy is too high.
> > >>> also no.
>
> AIUI, there was some confusion on Alexei's reply here. I, perhaps,
> was not as clear as I needed to be. I think the later discussion on
> performance overheads gets more into the point, and gets us closer to
> the objections Alexei had. More below...
>
> > > This approach still had the issues of an indirect call and an
> > > extra check when not used. So this was not truly zero overhead even
> > > after special casing BPF.
> >
> > The LSM mechanism is not zero overhead. It never has been. That's why
> > you can compile it out. You get added value at a price. You get the
> > ability to use SELinux and KRSI together at a price. If that's unacceptable
> > you can go the route of seccomp, which doesn't use LSM for many of the
> > same reasons you're on about.
> > [...]
> > >>>> So, trying to avoid the indirect calls is, as you say, an optimization,
> > >>>> but it might be a needed one due to the other limitations.
> > >>> I'm convinced that avoiding the cost of retpoline in critical path is a
> > >>> requirement for any new infrastructure in the kernel.
> > >> Sorry, I haven't gotten that memo.
>
> I agree with Casey here -- it's a nice goal, but those cost evaluations have
> not yet(?[1]) hit the LSM world. I think it's a desirable goal, to be
> sure, but this does appear to be an early optimization.
>
> > [...]
> > It can do that wholly within KRSI hooks. You don't need to
> > put KRSI specific code into security.c.
>
> This observation is where I keep coming back to.
>
> Yes, the resulting code is not as fast as it could be. The fact that BPF
> triggers the worst-case performance of LSM hooking is the "new" part
> here, from what I can see.
>
> I suspect the problem is that folks in the BPF subsystem don't want to
> be seen as slowing anything down, even other subsystems, so they don't
> want to see this done in the traditional LSM hooking way (which contains
> indirect calls).
>
> But the LSM subsystem doesn't want special cases (Casey has worked very
> hard to generalize everything there for stacking). It is really hard to
> accept adding a new special case when there are still special cases yet
> to be worked out even in the LSM code itself[2].
>
> > >>> Networking stack converted all such places to conditional calls.
> > >>> In BPF land we converted indirect calls to direct jumps and direct calls.
> > >>> It took two years to do so. Adding new indirect calls is not an option.
> > >>> I'm eagerly waiting for Peter's static_call patches to land to convert
> > >>> a lot more indirect calls. May be existing LSMs will take advantage
> > >>> of static_call patches too, but static_call is not an option for BPF.
> > >>> That's why we introduced BPF trampoline in the last kernel release.
> > >> Sorry, but I don't see how BPF is so overwhelmingly special.
> > > My analogy here is that if every tracepoint in the kernel were of the
> > > type:
> > >
> > > if (trace_foo_enabled) // <-- Overhead here, solved with static key
> > > trace_foo(a); // <-- retpoline overhead, solved with fexit trampolines
>
> This is a helpful distillation; thanks.
>
> static keys (perhaps better described as static branches) make sense to
> me; I'm familiar with them being used all over the place[3]. The resulting
> "zero performance" branch mechanism is extremely handy.
>
> I had been thinking about the fexit stuff only as a way for BPF to call
> into kernel functions directly, and I missed the place where this got
> used for calling from the kernel into BPF directly. KP walked me through
> the fexit stuff off list. I missed where there NOP stub ("noinline int
> bpf_lsm_##NAME(__VA_ARGS__) { return 0; }") was being patched by BPF in
> https://lore.kernel.org/lkml/20200220175250.10795-6-kpsingh@xxxxxxxxxxxx/
> The key bit being "bpf_trampoline_update(prog)"
>
> > > It would be very hard to justify enabling them on a production system,
> > > and the same can be said for BPF and KRSI.
> >
> > The same can be and has been said about the LSM infrastructure.
> > If BPF and KRSI are that performance critical you shouldn't be
> > tying them to LSM, which is known to have overhead. If you can't
> > accept the LSM overhead, get out of the LSM. Or, help us fix the
> > LSM infrastructure to make its overhead closer to zero. Whether
> > you believe it or not, a lot of work has gone into keeping the LSM
> > overhead as small as possible while remaining sufficiently general
> > to perform its function.
> >
> > No. If you're too special to play by LSM rules then you're special
> > enough to get into the kernel using more direct means.
>
> So, I see the primary conflict here being about the performance
> optimizations. AIUI:
>
> - BPF subsystem maintainers do not want any new slowdown associated
> with BPF
> - LSM subsystem maintainers do not want any new special cases in
> LSM stacking
>
> So, unless James is going to take this over Casey's objections, the path
> forward I see here is:
>
> - land a "slow" KRSI (i.e. one that hooks every hook with a stub).
> - optimize calling for all LSMs

I will work on v5 which resgisters the nops as standard LSM hooks and
we can follow-up on performance.

- KP

>
> Does this seem right to everyone?
>
> -Kees
>
>
> [1] There is a "known cost to LSM", but as Casey mentions, it's been
> generally deemed "acceptable". There have been some recent attempts to
> quantify it, but it's not been very clear:
> https://lore.kernel.org/linux-security-module/c98000ea-df0e-1ab7-a0e2-b47d913f50c8@xxxxxxxxxxxxx/ (lore is missing half this conversation for some reason)
>
> [2] Casey's work to generalize the LSM interfaces continues and it quite
> complex:
> https://lore.kernel.org/linux-security-module/20200214234203.7086-1-casey@xxxxxxxxxxxxxxxx/
>
> [3] E.g. HARDENED_USERCOPY uses it:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/usercopy.c?h=v5.5#n258
> and so does the heap memory auto-initialization:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/slab.h?h=v5.5#n676
>
> --
> Kees Cook