Re: [patch 2.1.97] more capabilities support

Alexander Kjeldaas (astor@guardian.no)
Wed, 29 Apr 1998 21:01:01 +0200


On Wed, Apr 29, 1998 at 01:09:52PM +0200, Andrej Presern wrote:
>
> If we talk about pure capabilities, we talk about expressivity. A pure
> capability system will perform an action for you if you are able to
> express the request. In contrast, a list based system will let you
> express the request and then verify if the request should be performed
> for you (ie if you are authorized).
>
> In a drawing, the difference would be something like this:
>
> Lists:
> +----+               +-------+
> |obj1| ---request----|-->obj2|
> +----+               +-------+
>
> After the border of authority has already been passed, obj2 checks if
> obj1 is authorized for the action, and if not, the action is denied and
> the execution returns to obj1.
>
>
> Capabilities:
> +----+               +----+
> |obj1|---request---->|obj2|
> +----+               +----+
>
> The border of authority is not passed, for if it were, obj2 will simply
> perform the action without any checking. If obj1 is not authorized to
> request that an action be performed for it by obj2, obj1 cannot even
> pass the border of authority. This means that obj2 doesn't have to
> execute even a single instruction if obj1 doesn't have the
> authorization to invoke a capability on it.
>

Yes, this is a nice idea. You confine authority checking to a single
place, or if possible use hardware mechanisms to catch a fault. We use
this when copying information to/from user-land. Other than the
user-land/kernel border, there are a lot of "almost" borders in the
kernel. We have a common permission() routine in the VFS layer and we
have common implementations of all struct file_operations routines so
you don't have to duplicate unnecessary authority checks. When
possible, it is of course in everybody's interest to simplify the
logic needed to check for authority.

But whether you call it a "border of authority" or not, somewhere you
have to check for authority, and in the Linux kernel you have to deal
with an existing design. POSIX capabilities are designed to deal with
the typical UNIX design - all you [basically] do is change suser()
calls to capable() calls [the backwards compatibility stuff is needed
no matter how this is implemented]. System call capabilities must be
only a small part of your overall design, because you can't use them
to express some of the important privileged operations the kernel
provides. I want to know how you will support expressing those
operations - which means at least representing the POSIX
capabilities. Some important capabilities:

CAP_DAC_READ_SEARCH - capability needed to take backup
CAP_SYS_TIME - capability needed for xntpd.
CAP_KILL - kill any process
CAP_NET_BIND_SERVICE - any well-known service
CAP_NET_RAW - tcpdump
CAP_SYS_RAWIO - X server
CAP_SYS_RESOURCE - important daemons

Of these, CAP_SYS_RAWIO can be caught by the system call filter. If
you comment on nothing else in this mail, _please_ tell me how you
will implement the other capabilities above.
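
A minimal user-space sketch of the suser() to capable() change discussed
above, under stated assumptions: the struct, the sketch_* names and the
bit numbers are hypothetical and chosen only for illustration; they are
not the kernel's definitions. It shows an all-or-nothing root test next
to a per-capability bit test.

```c
#include <stdint.h>

/* Hypothetical bit numbers, for illustration only. The real values are
 * defined by the kernel and are not reproduced here. */
enum sketch_cap {
    SKETCH_CAP_KILL = 0,
    SKETCH_CAP_NET_BIND_SERVICE = 1,
    SKETCH_CAP_SYS_TIME = 2
};

/* Hypothetical stand-in for the per-process credentials. */
struct sketch_task {
    unsigned int euid;       /* 0 means root in the old model */
    uint32_t cap_effective;  /* one bit per capability */
};

/* Old style: a single all-or-nothing superuser test. */
int sketch_suser(const struct sketch_task *t)
{
    return t->euid == 0;
}

/* New style: test one bit of the effective set. */
int sketch_capable(const struct sketch_task *t, enum sketch_cap cap)
{
    return (t->cap_effective & (UINT32_C(1) << cap)) != 0;
}

/* A privileged operation guarded by one specific capability.
 * Returns 0 if the caller may proceed, -1 if the request is denied. */
int sketch_set_time(const struct sketch_task *t)
{
    if (!sketch_capable(t, SKETCH_CAP_SYS_TIME))
        return -1;
    return 0;
}
```

With this layout a time daemon running as a non-root user with only the
SYS_TIME bit set passes sketch_set_time() but fails sketch_suser(), which
is the kind of precision the capability list above is after.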

> Because a capability can have an arbitrary amount of authority, it can
> have an arbitrary amount of expressiveness. But as you can see, the
> issue is not about how much you can express in a request but rather _if_
> you can express the request, because the latter is what makes the
> difference between lists and capabilities.

This isn't useful information to me. The expressiveness depends on the
implementation. You don't have an "arbitrary amount of expressiveness".

> You stated in your mail (as quoted) that you didn't want to do
> CAP_NETWORK yet because it would introduce more complexity to the
> system. The always increasing complexity is exactly one of the
> disadvantages of a lists design! With every additional authority you
> need an additional check/branch in the program that will handle it, for
> if you don't you have a security hole. This means that an additional
> amount of code is needed to accommodate the check, which then reflects in
> increased complexity, increased storage use and worse performance.
>
> With pure capabilities on the other hand, time and storage required to
> perform a check is always constant: 0 (zero), because the check is done
> implicitly at the object border, so the object does not have to waste
> any resources to perform it explicitly.
>

As I have stated before, I don't have anything against pure
capabilities, but I haven't seen them yet. Not with the expressiveness
that I want, and certainly not with arbitrary expressiveness.

>
> You only emulate what you already have: a lists design. That is also
> exactly the reason why the proposed extension may seem as not very
> efficient: it is indeed an emulation of a lists design, which is needed
> however to support backward compatibility.
>

But then we're back to where we started! We both try to be backwards
compatible and we both come up with a lists design! It's just that the
expressiveness differs. Your design can express things that the POSIX
one cannot do without additional checks (CAP_EXEC, CAP_NETWORK
etc). Your design cannot implement things that the POSIX one can. Both
are nice, but it is my belief that being able to precisely express
privileged operations is more important since abuse of privileged
operations is what causes most trouble on our systems.

[explanation of why sys_call_table is faster]

I'm aware of how this would work. It might be faster, it might be
slower, you'll have to test to find out. Both techniques are pretty
fast. A capable() check requires two instructions and a branch; yours
requires more work on context switch and a somewhat higher chance of a
cache miss. I'm happy as long as you can get roughly similar speed,
which I think you will.

> > 3) they require 60 times the memory, and
> > 4) are harder to administer, and
> > 5) are probably slower
>
> As you can see, the above mechanism can hardly be slower than what you
> propose, and I'm sure you will also agree that it's also easier to
> administer. As for memory, the reason for increased memory is because

No, I don't agree that they are easier to administer. Please explain.

> we're emulating a lists design, where the whole list needs to be
> initialized if it should work. If the syscall mechanism wasn't
> fundamentally broken in the first place, the emulation would not be so
> inefficient memory-wise.

A simple change to the system call invocation path could check the
bitmap I proposed instead of having to deal with an actual pointer
array, and would save memory. But these are details.
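
A rough sketch of the bitmap variant mentioned above, assuming a single
shared handler table plus one bit per system call for each process. The
table size, the types and every sketch_* name are hypothetical; this is
not the actual system call entry path.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_NR_SYSCALLS 256  /* assumed table size, illustration only */
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
    ((SKETCH_NR_SYSCALLS + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* Per-process filter: one bit per system call instead of one pointer. */
struct sketch_syscall_mask {
    unsigned long bits[SKETCH_WORDS];
};

typedef long (*sketch_handler)(long arg);

/* Trivial handler used only to exercise the dispatcher. */
long sketch_echo(long arg)
{
    return arg;
}

/* Mark system call number nr as permitted. */
void sketch_allow(struct sketch_syscall_mask *m, unsigned int nr)
{
    if (nr < SKETCH_NR_SYSCALLS)
        m->bits[nr / SKETCH_BITS_PER_WORD] |=
            1UL << (nr % SKETCH_BITS_PER_WORD);
}

/* Return 1 if nr is permitted by the mask, 0 otherwise. */
int sketch_allowed(const struct sketch_syscall_mask *m, unsigned int nr)
{
    if (nr >= SKETCH_NR_SYSCALLS)
        return 0;
    return (int)((m->bits[nr / SKETCH_BITS_PER_WORD]
                  >> (nr % SKETCH_BITS_PER_WORD)) & 1UL);
}

/* Consult the per-process bitmap first, then use the shared table.
 * Returns -1 when the call is filtered out or has no handler. */
long sketch_dispatch(const struct sketch_syscall_mask *m,
                     const sketch_handler *table,
                     unsigned int nr, long arg)
{
    if (!sketch_allowed(m, nr) || table[nr] == NULL)
        return -1;
    return table[nr](arg);
}
```

The per-process cost here is one bit per system call rather than one
pointer per system call, at the price of an extra test on every
invocation.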

>
> > [...]
> > > The other way to abuse the program is that the attacker installs the
> > > syscall capability that contains execve() by itself before calling
> > > execve(). But this means that the attacker must have such a capability.
> > > And because it can't just produce one by itself, and because it can't
> > > get one from the system (it can only get a copy of the currently
> > > installed one) the only way to get it is by stealing it from the
> > > attacked process (and to do that it must know exactly where the attacked
> > > process holds it), which complicates things even more.
> >
> > If you want it to be difficult to steal the capability from the
> > running process, you will have to _design_ it to be difficult. Until I
> > see some evidence suggesting it is difficult, I'll assume it is as
> > easy as stealing POSIX capabilities.
>
> Can you please explain to me how you can steal a file descriptor in
> Linux?
>

We must be misunderstanding each other here. Using a file descriptor
in a buffer overflow exploit is done by either knowing the file
descriptor number [since the assignment rules are *easily* guessable],
or fetching one from the stack or some variable. A similar technique
could be used with capabilities. For an attacker, stealing a
capability is not inherently harder than stealing any other data the
process might hold. I will assume so until some remarkable design
proves otherwise.
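
To illustrate why descriptor numbers are easy to guess: POSIX specifies
that open() returns the lowest-numbered descriptor not currently open in
the process. The sketch below assumes a single-threaded process and that
/dev/null exists; the function name is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the descriptor number released by close() is handed out
 * again by the next open(), 0 if not, and -1 on any error. */
int sketch_fd_is_predictable(void)
{
    int first, second, same;

    first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    if (close(first) != 0)
        return -1;

    /* The lowest free slot is now the one just released. */
    second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    same = (second == first);
    close(second);
    return same;
}
```

An attacker who knows roughly which files a program has opened can
therefore predict the descriptor number without reading it from memory.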

astor

-- 
 Alexander Kjeldaas, Guardian Networks AS, Trondheim, Norway
 http://www.guardian.no/
