
I've done an analysis of entropy collection and accounting in current
Linux kernels and found some major weaknesses and bugs. As entropy
accounting is only one part of the security of the random number
device, it's unlikely that these flaws are exploitable; nonetheless,
it makes sense to fix them.

- Broken analysis of entropy distribution
- Spoofable delta model
- Interrupt timing independence
- Ignoring time scale of entropy sources
- Confusion of unpredictable and merely complex sources and trusting
  the latter
- Broken pool transfers
- Entropy pool can be overrun with untrusted data

Net effect: a typical box will claim to generate 2-5 _orders of
magnitude_ more entropy than it actually does.

Note that entropy accounting is mostly useful for things like the
generation of large public key pairs, where the number of bits of
entropy in the key is comparable to the size of the PRNG's internal
state. For most purposes, /dev/urandom is still more than strong
enough to make attacking a cipher directly more productive than
attacking the PRNG.

The following patches against 2.5.31 have been tested on x86, but
should compile elsewhere just fine.

I've tried to cover some of the issues in detail below:

Broken analysis of entropy distribution
---------------------------------------

(I know the topic of entropy is rather poorly understood, so here are
a couple of useful pieces of background reading for kernel folks:

"Cryptanalytic Attacks on Pseudorandom Number Generators"
Kelsey, Schneier, Wagner, Hall
www.counterpane.com/pseudorandom_number.pdf

"Cryptographic Randomness from Air Turbulence in Disk Drives"
D. Davis, R. Ihaka, P.R. Fenstermacher
http://world.std.com/~dtd/random/forward.ps)

Mathematically defining entropy

For a probability distribution P of samples K, the entropy is:

    E = sum over K of -P(K) * log2(P(K))

For a uniform distribution of n bits of data, the entropy is n.
Anything other than a uniform distribution has less than n bits of
entropy.
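
A quick sanity check in Python (the distributions here are invented
purely for illustration):

    from math import log2

    def shannon(dist):
        # Shannon entropy in bits of a discrete probability distribution
        return sum(-p * log2(p) for p in dist if p)

    n = 8
    uniform = [1.0 / 2**n] * 2**n   # all 2^n values equally likely
    print(shannon(uniform))          # 8.0 - exactly n bits

    # skew the distribution: one value takes half the probability mass
    skewed = [0.5] + [0.5 / (2**n - 1)] * (2**n - 1)
    print(shannon(skewed))           # ~5.0 - any bias costs entropy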

Non-Uniform Distribution Of Timing

Unfortunately, our sample source is far from uniform. For starters,
each interrupt has a finite time associated with it - the interrupt
latency. Back-to-back interrupts will result in samples that are
spaced by a fixed interval.

A priori, we might expect a typical interrupt source to be a Poisson
process, resulting in a gamma-like distribution of deltas: zero
probability up to some minimum latency, a peak at minimum latency
representing the likelihood of back-to-back interrupts, a smooth hump
around the average interrupt rate, and an infinite tail.

Not surprisingly, this distribution has less entropy in it than a
uniform distribution would. Linux takes the approach of assuming the
distribution is "scale invariant" (which is true for exponential
distributions and approximately true for the tails of gamma
distributions) and that the amount of entropy in a sample is
therefore related to the number of bits in a given interrupt delta.

Assuming the interrupt timing actually has a nice gamma-like
distribution (which is unlikely in practice), this is indeed true.
The trouble is that Linux assumes that if a delta is 13 bits, it
contains 12 bits of actual entropy. A moment of thought will reveal
that binary numbers of the form 1xxxx can contain at most 4 bits of
entropy - it's a tautology that all binary numbers start with 1 once
you take off the leading zeros. This is actually a degenerate case of
Benford's Law (http://mathworld.wolfram.com/BenfordsLaw.html), which
governs the distribution of leading digits in scale-invariant
distributions.

What we're concerned with is the entropy contained in the digits
following the leading 1, which we can derive with a simple extension
of Benford's Law (and some Python):

from math import log, log2

def entropy(l):
    # Shannon entropy (in bits) of the probability distribution l
    s = 0
    for pk in l:
        if pk:
            s = s + (-pk * log2(pk))
    return s

def benford(digit, place=0, base=10):
    # probability of 'digit' appearing in position 'place' under
    # Benford's Law for the given base
    if not place:
        s = log(1 + 1.0/digit)
    else:
        s = 0
        for k in range(base**(place-1), base**place):
            s = s + log(1 + 1.0/(k*base + digit))
    return s/log(base)

# entropy of the digits following the leading 1 of a b-bit delta:
# treat those trailing b-1 bits as one "digit" in base 2**(b-1)
for b in range(3, 16):
    l = []
    for k in range(1, (2**(b-1)) - 1):
        l.append(benford(k, 0, 2**(b-1)))
    print("%2d %6f" % (b, entropy(l)))

Which gives us:

bits  entropy
  3   1.018740
  4   2.314716
  5   3.354736
  6   4.238990
  7   5.032280
  8   5.769212
  9   6.468756
 10   7.141877
 11   7.795288
 12   8.433345
 13   9.059028
 14   9.674477
 15  10.281286

As it turns out, our 13-bit number has at most 9 bits of entropy, and
as we'll see in a bit, probably significantly less.

All that said, this is easily dealt with by a lookup table.
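
A minimal sketch of what that might look like (hypothetical names;
the table values are just whole-bit floors of the figures above, and
the actual patches may differ):

    # credit entropy by delta bit-length from a precomputed table
    benford_credit = [0, 0, 0, 1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 9, 9, 10]

    def credit_entropy(delta):
        bits = delta.bit_length()
        if bits >= len(benford_credit):
            bits = len(benford_credit) - 1
        return benford_credit[bits]

    print(credit_entropy(6844))   # a 13-bit delta earns 9 bits, not 12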

Interrupt Timing Independence
-----------------------------

Linux's entropy estimate also wrongly assumes independence of
different interrupt sources. While SMP complicates the matter, this
is generally not the case: low-priority interrupts must wait on
high-priority ones, and back-to-back interrupts on shared lines will
serialize themselves ABABABAB. Further, system-wide CLI, cache
flushes, and the like will skew -all- the timings and cause them to
bunch up in predictable fashion.

Furthermore, all this is observable from userspace in the same way
that worst-case latency is measured.

To protect against back-to-back measurements and userspace
observation, we insist that at least one context switch has occurred
since we last sampled before we trust a sample.
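
As a rough model of that policy (in Python; the kernel side would
consult the scheduler's context-switch counter, and everything here
is illustrative):

    class SampleGate:
        def __init__(self):
            self.last_ctxt = -1

        def trust(self, nr_context_switches):
            # trust the sample only if the scheduler has run since
            # the last one we took; always remember where we are
            ok = nr_context_switches != self.last_ctxt
            self.last_ctxt = nr_context_switches
            return ok

    gate = SampleGate()
    print(gate.trust(100))   # True  - a switch occurred, credit it
    print(gate.trust(100))   # False - back-to-back, mix but don't credit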

Questionable Sources and Time Scales
------------------------------------

Due to the vagaries of computer architecture, things like keyboard
and mouse interrupts occur on their respective scanning or serial
clock edges, and are clocked relatively slowly. Worse, devices like
USB keyboards, mice, and disks tend to share interrupts and probably
line up on USB clock boundaries. Even PCI interrupts have a
granularity on the order of 33MHz (or worse, depending on the
particular adapter), which, when timed by a fast processor's 2GHz
clock, makes the low six bits of the timing measurement predictable.
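
The arithmetic is easy to check (illustrative clock rates):

    from math import log2

    cpu_hz = 2_000_000_000   # 2GHz timestamp counter
    pci_hz = 33_000_000      # 33MHz PCI clock
    # timestamps land on multiples of ~61 counter ticks, so roughly
    # the low six bits of a PCI-clocked sample carry no information
    print(cpu_hz / pci_hz, log2(cpu_hz / pci_hz))   # 60.6..., ~5.9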

And as far as I can find, no one has tried to make a good model or
estimate of actual keyboard or mouse entropy. Randomness caused by
disk drive platter turbulence has actually been measured: it's on the
order of 100 bits/minute and is well correlated on timescales of
seconds, so we're likely overestimating it substantially.

We can deal with this by having each trusted source declare its clock
resolution and removing extra timing resolution bits when we make
samples.
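
A sketch of that interface (hypothetical helper; the real patches
differ in detail):

    from math import log2

    CPU_HZ = 2_000_000_000   # illustrative 2GHz timestamp clock

    def strip_resolution(timestamp, source_hz):
        # discard timer bits finer than the source's declared clock:
        # they reflect the bus clock, not the event being timed
        junk_bits = int(log2(CPU_HZ / source_hz))
        return timestamp >> junk_bits

    print(strip_resolution(0x12345678, 33_000_000))   # drops low 5 bits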

Trusting Predictable or Measurable Sources
------------------------------------------

What entropy can be measured from disk timings is very often leaked
by immediately relaying data to web, shell, or X clients. Further,
patterns of drive head movement can be remotely controlled by clients
talking to file and web servers. Thus, while disk timing might be an
attractive source of entropy, it can't be used in a typical server
environment without great caution.

Complexity of analyzing timing sources should not be confused with
unpredictability. Disk caching has no entropy; disk head movement has
entropy only to the extent that it creates turbulence. Network
traffic is potentially completely observable.

(Incidentally, tricks like Matt Blaze's truerand clock-drift
technique probably don't work on most PCs these days, as the
"realtime" clock source is often derived directly from the
bus/PCI/memory/CPU clock.)

If we're careful, we can still use these timings to seed our RNG, as
long as we don't account them as entropy.

Batching
--------

Samples to be mixed are batched into a 256-element ring buffer.
Because this ring isn't allowed to wrap, it's dangerous to store
untrusted samples, as they might flood out trusted ones.

We can allow untrusted data to be safely added to the pool by XORing
new samples in rather than copying, and allowing the pool to wrap
around. As non-random data won't be correlated with random data, this
mixing won't destroy any entropy.
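
In miniature, the scheme looks something like this (a simplified
model, not the patch itself):

    RING_SIZE = 256
    ring = [0] * RING_SIZE
    pos = 0

    def batch_sample(word):
        # XOR into the current slot instead of overwriting it, so
        # letting the ring wrap stirs existing samples but can never
        # displace them
        global pos
        ring[pos] ^= word
        pos = (pos + 1) % RING_SIZE

    batch_sample(0xdeadbeef)   # trusted sources also bump the entropy
                               # count; untrusted ones are credited zero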

Broken Pool Transfers
---------------------

Worst of all, the accounting of entropy transfers between the primary
and secondary pools has been broken for quite some time and produces
thousands of bits of entropy out of thin air.

-- "Love the dolphins," she advised him. "Write by W.A.S.T.E.." - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

