Re: x86/random: Speculation to the rescue

From: Alexander E. Patrakov
Date: Sun Sep 29 2019 - 04:05:32 EST


29.09.2019 04:53, Linus Torvalds wrote:
> On Sat, Sep 28, 2019 at 3:24 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
>> Nicholas presented the idea to (ab)use speculative execution for random
>> number generation years ago at the Real-Time Linux Workshop:
>
> What you describe is just a particularly simple version of the jitter
> entropy. Not very reliable.
>
> But hey, here's a made-up patch. It basically does jitter entropy, but
> it uses a more complex load than the fibonacci LFSR folding: it calls
> "schedule()" in a loop, and it sets up a timer to fire.
>
> And then it mixes in the TSC in that loop.
>
> And to be fairly conservative, it then credits one bit of entropy for
> every timer tick. Not because the timer itself would be all that
> unpredictable, but because the interaction between the timer and the
> loop is going to be pretty damn unpredictable.
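
For concreteness, the shape of that loop can be approximated in userspace: spin calling sched_yield() with a periodic timer armed, and fold the TSC into a running pool on every iteration. This is only an illustration of the load, not the actual patch; the __rdtsc() intrinsic, the mixing constant, and the 128-tick stopping point below are all placeholders:

	/*
	 * Userspace illustration of the loop described above.  The
	 * multiply-add "mix" and the tick threshold are placeholders,
	 * not the kernel's input_pool machinery.
	 */
	#include <sched.h>
	#include <signal.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <sys/time.h>
	#include <x86intrin.h>		/* __rdtsc() */

	static volatile sig_atomic_t ticks;

	static void on_tick(int sig)
	{
		(void)sig;
		ticks++;		/* "credit one bit" per timer tick */
	}

	int main(void)
	{
		uint64_t pool = 0;
		struct itimerval it = {
			.it_interval = { 0, 10000 },	/* 10 ms period */
			.it_value    = { 0, 10000 },
		};

		/* Give up if the counter does not move between two reads. */
		if (__rdtsc() == __rdtsc())
			return 1;

		signal(SIGALRM, on_tick);
		setitimer(ITIMER_REAL, &it, NULL);

		while (ticks < 128) {	/* collect ~128 "bits" */
			pool = pool * 6364136223846793005ULL + __rdtsc();
			sched_yield();
		}

		printf("pool: %016llx after %d ticks\n",
		       (unsigned long long)pool, (int)ticks);
		return 0;
	}

The point is the same as in the patch: the pool absorbs a TSC sample both before and after every timer tick, so the timer/loop interaction ends up in the mix.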

This looks quite similar to the refactoring proposed earlier by Stephan Müller in his paper: https://www.chronox.de/lrng/doc/lrng.pdf. Indeed, he makes a good argument that the timing of device interrupts is right now the main actual source of entropy in Linux, at the end of Section 1.1:

"""
The discussion shows that the noise sources of block devices and HIDs are a derivative of the interrupt noise source. All events used as entropy source recorded by the block device and HID noise source are delivered to the Linux kernel via interrupts.
"""

Now your patch adds the timer interrupt (while the schedule() loop is running) to the mix, in essentially the same setup as he proposed.
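
In current trees that interrupt-time sampling happens in add_interrupt_randomness(), which folds the cycle counter captured in the interrupt path into a per-CPU fast pool. A very rough sketch of the idea, with a placeholder pool and mixing step rather than the kernel's fast_pool:

	#include <stdint.h>
	#include <x86intrin.h>		/* __rdtsc() */

	/* Placeholder pool: each interrupt contributes the cycle count at
	 * the moment it fired, tagged with the IRQ number.  The rotate-xor
	 * mix is illustrative only. */
	static uint64_t irq_pool;

	static void note_interrupt_time(int irq)
	{
		irq_pool ^= __rdtsc() ^ ((uint64_t)irq << 32);
		irq_pool = (irq_pool << 7) | (irq_pool >> 57);
	}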


> Ok, I'm handwaving. But I do claim it really is fairly conservative to
> think that a cycle counter would give one bit of entropy when you time
> over a timer actually happening. The way that loop is written, we do
> guarantee that we'll mix in the TSC value both before and after the
> timer actually happened. We never look at the difference of TSC
> values, because the mixing makes that uninteresting, but the code does
> start out with verifying that "yes, the TSC really is changing rapidly
> enough to be meaningful".
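
That opening check amounts to nothing more than two back-to-back counter reads. A userspace sketch using the compiler's __rdtsc() intrinsic (the function name here is made up):

	#include <stdbool.h>
	#include <stdint.h>
	#include <x86intrin.h>

	/* A counter that reads the same value twice in a row is too coarse
	 * (or absent) to be worth sampling for jitter. */
	static bool cycle_counter_is_ticking(void)
	{
		uint64_t before = __rdtsc();
		return __rdtsc() != before;
	}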

> So if we want to do jitter entropy, I'd much rather do something like
> this that actually has a known fairly complex load with timers and
> scheduling.
>
> And even if absolutely no actual other process is running, the timer
> itself is still going to cause perturbations. And the "schedule()"
> call is more complicated than the LFSR is anyway.
>
> It does wait for one second the old way before it starts doing this.
>
> Whatever. I'm entirely convinced this won't make everybody happy
> anyway, but it's _one_ approach to handle the issue.
>
> Ahmed - would you be willing to test this on your problem case (with
> the ext4 optimization re-enabled, of course)?
>
> And Thomas - mind double-checking that I didn't do anything
> questionable with the timer code..
>
> And this goes without saying - this patch is ENTIRELY untested. Apart
> from making people upset for the lack of rigor, it might do
> unspeakable crimes against your pets. You have been warned.
>
> Linus



--
Alexander E. Patrakov