RE: Bug in how capability inheritance is handled in "fs/exec.c", 2.3.99

From: Linda Walsh (law@sgi.com)
Date: Sun May 28 2000 - 20:28:34 EST


> Check section 3.1.2.2 of Draft 17 (on page 18, lines 36-28):
>
> It explicitly specifies what happens to the inheritable, effective, and
> permitted flags of the new process image after an exec call. As near as
> I can tell the Linux implementation follows the requirements Draft 17.
>
> I_1 = I_0
> P_1 = (P_F && X) | (I_F && I_0)
> E_1 = E_F && P_1
>
> Where I_1, E_1, and P_1 represent the inheritable, effective, and
> permitted flags of the new process image, I_0 is the inheritable flag of
> the current process image, and I_F, E_F, and P_F, are the inheritable,
> effective, and permitted flags of the executable, and X represents
> possible additional implementation-defined restrictions.

---
	Oh...now isn't that just special.

So let me pose an example, using *only* the constraints listed above. Let's suppose I have a system that has Inheritable bits set "permissive" -- meaning I want to allow parents to pass any bits to children and programs to allow passing any bits in (this is an administrative level policy).

Initially I have something like 'UID=backup' that starts with only CAP_DAC_READ_SEARCH in its permitted set, so let's assume that is bit 0 and we use an 8-bit mask. That gives me a *default* permission vector of PIE=(0000 0001,1111 1111,0000 0000). If 'UID=backup' wants to do a backup, they must specifically ask for that capability, say via login (or su) w/E=0000 0001. That yields PIE=(0000 0001,1111 1111,0000 0001). So far this appears legal.

Now I execute a backup program. This program can be used by anyone -- a user can back up their own files -- but only a user running w/CAP_DAC_READ_SEARCH can actually back up the system. However, it also allows accessing /dev/rawtape -- a RAW_IO (let's say cap=2, bit=1) device, so it has a capset of PIE=(0000 0010,1111 1111,0000 0010). Using the above cap formulae, I get:

1) I1 = I0(1111 1111)
2) P1 = Pf(0000 0010) | (If(1111 1111) & I1(1111 1111)) = 1111 1111
3) E1 = Ef(1111 1111) & P1(1111 1111)

E1 (new effective) = 1111 1111.

This is *obviously* bad and certainly not the intent of the POSIX spec.

Using the formula I proposed:

1) pI' = pI(1111 1111) & fI(1111 1111)
2) pP' = fP(0000 0010) | (pP(0000 0001) & pI'(1111 1111))
3) pE' = fE(1111 1111) & pP'(0000 0011)

Final effective = (0000 0011) -- the desired outcome.
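The two calculations above can be checked mechanically. What follows is only a sketch: the struct and function names are made up for illustration, X is assumed to be all-ones (no extra implementation restrictions), and the file's effective set is taken as 1111 1111, which is the value used in the worked steps.

```c
#include <stdint.h>

/* 8-bit capability sets as in the example:
 * bit 0 = CAP_DAC_READ_SEARCH, bit 1 = raw I/O.
 * Names here are illustrative, not kernel identifiers. */
struct capset {
	uint8_t p;	/* permitted */
	uint8_t i;	/* inheritable */
	uint8_t e;	/* effective */
};

/* Draft 17 rule as quoted: I1 = I0, P1 = (Pf & X) | (If & I0), E1 = Ef & P1.
 * Assumption: X is all-ones, so it drops out. */
static struct capset exec_draft17(struct capset proc, struct capset file)
{
	struct capset n;

	n.i = proc.i;
	n.p = file.p | (file.i & proc.i);
	n.e = file.e & n.p;
	return n;
}

/* Proposed rule: pI' = pI & fI, pP' = fP | (pI' & pP), pE' = fE & pP'. */
static struct capset exec_proposed(struct capset proc, struct capset file)
{
	struct capset n;

	n.i = proc.i & file.i;
	n.p = file.p | (n.i & proc.p);
	n.e = file.e & n.p;
	return n;
}
```

With proc = {0x01, 0xff, 0x01} and file = {0x02, 0xff, 0xff}, exec_draft17() yields an effective set of 0xff, while exec_proposed() yields 0x03. The difference comes entirely from the old permitted set being and'ed into the inherited term.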

In my non-thorough glance through P1003.1e I find no statement that states that the "permitted Inheritable" vector must be a subset of the Permitted vector. I believe the "permitted inheritable" vector is designed to be a mask -- not an equivalency. I may want to allow any bit (0 or 1) in the Pset, to be Inheritable.

But now, as a 2nd exercise, let's add the constraint that I-sets must be a subset of their associated P-set. Using the first formula:

1) I1 = I0(0000 0001)
2) P1 = Pf(0000 0010) | (If(0000 0010) & I1(0000 0001))
3) E1 = Ef(1111 1111) & P1(0000 0010)

E1 (new effective) = 0000 0010. (wrong value: we no longer have CAP_DAC_READ_SEARCH)

Using my proposed formula:

1) pI' = pI(0000 0001) & fI(0000 0010)    (fI must be a subset of fP)
2) pP' = fP(0000 0010) | (pP(0000 0001) & pI'(0000 0000))
3) pE' = fE(1111 1111) & pP'(0000 0010)

Final effective = (0000 0010) (again, wrong value)

So the paradigm of limiting the Inheritable set to be a subset of the Permitted set creates an error in both cases.

The only case that yields the desired effect is the 2nd formula and allowing inheritable sets to be set to any value on processes and files.
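The 2nd exercise can be sketched the same way. The helper clamp_inheritable() below is hypothetical and stands for the "I-set must be a subset of its P-set" constraint; X in the Draft 17 rule is again assumed to be all-ones, and only the resulting effective set is returned.

```c
#include <stdint.h>

/* Hypothetical helper: force an inheritable set to be a subset of the
 * matching permitted set (the constraint being tested in exercise 2). */
static uint8_t clamp_inheritable(uint8_t inh, uint8_t perm)
{
	return inh & perm;
}

/* New effective set under the Draft 17 rule (X assumed all-ones). */
static uint8_t eff_draft17(uint8_t pI, uint8_t fP, uint8_t fI, uint8_t fE)
{
	uint8_t newP = fP | (fI & pI);

	return fE & newP;
}

/* New effective set under the proposed rule. */
static uint8_t eff_proposed(uint8_t pP, uint8_t pI,
			    uint8_t fP, uint8_t fI, uint8_t fE)
{
	uint8_t newI = pI & fI;
	uint8_t newP = fP | (newI & pP);

	return fE & newP;
}
```

Clamping the process I-set against pP = 0x01 gives 0x01, and clamping the file I-set against fP = 0x02 gives 0x02. Feeding those into either function with fE = 0xff returns 0x02, so bit 0 is lost under both rules once the subset constraint is imposed.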

> As far as the wording in section 25.1.1.2 is concerned, it talks about
> the permitted flags depending on the capability *states* (which covers
> all three capmasks, not just the permitted bitmask).

---
The section says the resulting set depends on the cap-states of both. It then gets more specific, saying that each cap in the new _permitted_ (it is italicized in the original) may have been *forced* (my emphasis) 'on' by [1] the program (file) *or* [2] *inherited* *from* *the* *previous* *process* [meaning they had to be in the previous process's permitted and inherited set, otherwise there would be nothing to inherit and...] [3] if the capability attributes of the program allow the inheritance (i.e. the program's 'I' vector).

So we have:

[1] = caps turned on by file Permitted (fP)
[2] = caps of old process that are passed down, i.e. Inherited & Permitted (pI & pP)
[3] = fI is a bound for what can be inherited from [2]

Stated more succinctly, [3] is applied to [2]:

	(fI & (pI & pP))

This is or'ed with [1] to yield the new Permitted set (pP'):

	pP' = (fP) | (fI & pI & pP)

The new effective set will be what the file asked for (fE) bounded by the new permitted set: pE' = fE & pP'

The new 'inheritable set', pI', would, most securely, be the most restrictive combination of the file's and process's I-sets, namely:

pI' = pI & fI

If you re-order this, for efficiency, pI' is a sub-component of the pP' above, so the entire expression becomes:

pE' = fE &(pP' = fP|((pI' = pI&fI) & pP))

or broken down separately:

1) pI' = pI & fI
2) pP' = fP | (pI' & pP)
3) pE' = fE & pP'

> So the Draft 17 model specifies that flags in the inheritable bitmask
> will be inherited across an exec **if** the executable allows it to be
> inherited.

---
Right: the file "Inheritable" vector is a limit as well as the User's previous "Inheritable" vector (pI'=pI & fI).

> The basic idea seems to be that an executable should never
> have permissions shoved down its throat that it's not prepared to
> handle.

---
True, but if the previous process's Permitted set isn't factored in, then neither is it possible for an executable to inherit capabilities that both the previous process and the file have specified as inheritable. Somewhere you have to be "and"ing the previous process's Permitted set with the new Inheritable mask (pP & pI').

> Or put another way, if the system administrator considers a
> particular program to be untrusted because it hasn't been audited yet to
> make sure it doesn't do bone-headed things, it should under no
> circumstances be allowed to any capabilities that the system
> administrator hasn't authorized that capability to have.

---
To provide such a 'default', a file-system default would be set (at mount time) to PIE=(0,0,0), and/or we would treat files that have no capability set the same as PIE=(0,0,0), which would clear any capabilities. Any individual trusted file would have a capset of PIE=(0,ALL,ALL) -- meaning the file isn't enabling any capabilities, but would simply use the capabilities of the caller.

> If you
> really want it, I'd probably do it by defining a new capmask which
> contained additional privileges which are or'ed into the permitted mask
> unconditionally across the exec. This would be a *dangerous* capmask,
> since if it is ever set, it basically violates the guarantee which POSIX
> 1.e gave you --- namely, that programs that had a null inheritable and
> executable bitmasks can never run with raised capabilities, and so
> therefore they don't need to be audited for security problems.

---
Programs that have a PIE set of '0,0,0' could never be run with privilege. I don't understand what you meant by "executable bitmasks". Did you mean "permitted" or "effective" or both? There are different implications/effects for all 3. I *don't* want a new mask -- I just want to clean up the semantics of what is already there.

> (In
> Unix, a **lot** of problems are caused by running programs as root that
> weren't designed to be run as root. POSIX 1.e prevents this, but in
> order to provide the functionality you want, we have to break this
> guarantee.)

---
Naw. Linux already does what I am asking, but does so in an inconsistent and error-prone manner. Currently, the setcap system call improperly constrains pI to be a subset of pP. This is a bug. That's not part of the POSIX spec -- but it is *needed* because of how inheritance in fs/exec.c is broken. So the limit in kernel/capability.c is a kludge to work around the inaccurate inheritance model in fs/exec.c.

This bug limits functionality to a lower level than specified by POSIX.

In order to be consistent, there should be an equivalent kludge in compute_creds(), also in fs/exec.c. Note that it is *missing* -- in fact no new inherited set is computed at all! Yikes! So this automatically implies that the new inherited set is not bounded by the new_permitted set. This is inconsistent with setcap(). Not only that, but this implies that the file-Inheritable set would have no effect on what that program wants propagated to its children. I submit that you want the program to control this as much as the user (i.e. -- the "and" of both for the new inheritable set).

But bounding the inherited set by the permitted set, as shown in my "exercise 2" above, is also a broken semantic. Therefore the proper fix is to remove the bounding in setcap() and apply the proper algorithm in 1 place in fs/exec.c.

Note that there is another 'kludge' -- a check for uid==0 in prepare_binprm(). Rather than:

	if (!issecure(SECURE_NOROOT)) {
		if (bprm->e_uid == 0 || current->uid == 0)
			cap_set_full(bprm->cap_inheritable);
		if (bprm->e_uid == 0)
			cap_set_full(bprm->cap_effective);
	}

We should have:

	/* note that cap->permitted already was zeroed */
	/* FIXME: UID==0 shouldn't mean anything when file-caps
	 * are implemented */
	if (!issecure(SECURE_NOROOT) && bprm->e_uid == 0) {
		cap_set_full(bprm->cap_inheritable);
		cap_set_full(bprm->cap_effective);
	}

/* so default permissions now, in absence of true file caps, would be PIE=(none,all,all) */

Thus, root starts with PIE=(all,all,all), but normal users would start with PIE=(0,all,0). Using the new exec rule:

1) pI' = pI & fI
2) pP' = fP | (pI' & pP)
3) pE' = fE & pP'

For root:

1) pI' = all & all
2) pP' = 0 | (all & all)
3) pE' = all & all

new PIE=(all,all,all)

For non-root users we get:

1) pI' = all & all
2) pP' = 0 | (all & 0)
3) pE' = all & 0

new PIE=(0,all,0)

This mathematically proves that execution of normal files would not change the capability set of root or users.
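The same check can be written as a small C sketch. This is not kernel code: struct caps, exec_update() and exec_is_noop() are made-up names, a 32-bit word stands in for a full capability set, and the "default file" is assumed to carry PIE=(none,all,all) as described above.

```c
#include <stdint.h>

#define CAP_ALL  UINT32_MAX	/* stand-in for a full capability set */
#define CAP_NONE UINT32_C(0)

/* Illustrative type only; not the kernel's representation. */
struct caps {
	uint32_t permitted;
	uint32_t inheritable;
	uint32_t effective;
};

/* Apply the proposed exec rule in place to the process sets. */
static void exec_update(struct caps *proc, const struct caps *file)
{
	/* 1) pI' = pI & fI */
	proc->inheritable &= file->inheritable;
	/* 2) pP' = fP | (pI' & pP) */
	proc->permitted = file->permitted |
			  (proc->inheritable & proc->permitted);
	/* 3) pE' = fE & pP' */
	proc->effective = file->effective & proc->permitted;
}

/* Return 1 if exec of a default file, PIE=(none,all,all), leaves the
 * process sets unchanged, 0 otherwise. */
static int exec_is_noop(struct caps proc)
{
	const struct caps default_file = { CAP_NONE, CAP_ALL, CAP_ALL };
	struct caps before = proc;

	exec_update(&proc, &default_file);
	return proc.permitted == before.permitted &&
	       proc.inheritable == before.inheritable &&
	       proc.effective == before.effective;
}
```

Calling exec_is_noop() with root's PIE=(all,all,all) and with a normal user's PIE=(0,all,0) returns 1 in both cases, matching the two worked calculations.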

Unfortunately, this isn't the way things are as we currently have an inconsistent (and broken) capability mechanism.

:-(

-Linda

-- Linda A Walsh | Trust Technology, Core Linux, SGI law@sgi.com | Voice: (650) 933-5338



