Re: x86/random: Speculation to the rescue

From: Borislav Petkov
Date: Mon Sep 30 2019 - 02:10:26 EST


On Sun, Sep 29, 2019 at 07:59:19PM -0700, Linus Torvalds wrote:
> All my smoke testing looked fine - I disabled trusting the CPU, I
> increased the required entropy a lot, and to actually trigger the
> lockup issue without the broken user space, I made /dev/urandom do
> that "wait for entropy" thing too.

Hohum, seems to get rid of the longish delay during boot on my test
boxes here:

$ grep random /var/log/messages

<--- that's before

Sep 30 07:46:07 cz vmunix: [ 0.000000] random: get_random_bytes called from start_kernel+0x304/0x4ac with crng_init=0
Sep 30 07:46:07 cz vmunix: [ 1.505641] random: fast init done
Sep 30 07:46:07 cz vmunix: [ 7.124808] random: dd: uninitialized urandom read (512 bytes read)
Sep 30 07:46:07 cz vmunix: [ 8.507672] random: dbus-daemon: uninitialized urandom read (12 bytes read)
Sep 30 07:46:07 cz vmunix: [ 8.518621] random: dbus-daemon: uninitialized urandom read (12 bytes read)
Sep 30 07:46:07 cz vmunix: [ 8.565073] random: avahi-daemon: uninitialized urandom read (4 bytes read)
Sep 30 07:46:21 cz vmunix: [ 23.092795] random: crng init done
Sep 30 07:46:21 cz vmunix: [ 23.096419] random: 3 urandom warning(s) missed due to ratelimiting

<--- that's after and we're 15 secs faster:

Sep 30 07:47:53 cz vmunix: [ 0.329599] random: get_random_bytes called from start_kernel+0x304/0x4ac with crng_init=0
Sep 30 07:47:53 cz vmunix: [ 1.949216] random: fast init done
Sep 30 07:47:53 cz vmunix: [ 4.806132] random: dd: uninitialized urandom read (512 bytes read)
Sep 30 07:47:53 cz vmunix: [ 5.954547] random: dbus-daemon: uninitialized urandom read (12 bytes read)
Sep 30 07:47:53 cz vmunix: [ 5.965483] random: dbus-daemon: uninitialized urandom read (12 bytes read)
Sep 30 07:47:53 cz vmunix: [ 6.014102] random: avahi-daemon: uninitialized urandom read (4 bytes read)
Sep 30 07:47:55 cz vmunix: [ 8.238514] random: crng init done
Sep 30 07:47:55 cz vmunix: [ 8.240205] random: 3 urandom warning(s) missed due to ratelimiting

Seeing how those uninitialized urandom read warns still happen, I added a
dump_stack() to see when we do wait for the random bytes first and I got this:

[ 5.522348] random: dbus-daemon: uninitialized urandom read (12 bytes read)
[ 5.532008] random: dbus-daemon: uninitialized urandom read (12 bytes read)
[ 5.579922] random: avahi-daemon: uninitialized urandom read (4 bytes read)
[ 5.751790] elogind-daemon[1730]: New seat seat0.
[ 5.756376] elogind-daemon[1730]: Watching system buttons on /dev/input/event6 (Power Button)
[ 5.777381] elogind-daemon[1730]: Watching system buttons on /dev/input/event3 (Power Button)
[ 5.781485] elogind-daemon[1730]: Watching system buttons on /dev/input/event5 (Lid Switch)
[ 5.783547] elogind-daemon[1730]: Watching system buttons on /dev/input/event4 (Sleep Button)
[ 5.885300] elogind-daemon[1730]: Watching system buttons on /dev/input/event0 (AT Translated Set 2 keyboard)
[ 5.911602] CPU: 2 PID: 1798 Comm: sshd Not tainted 5.3.0+ #1
[ 5.914672] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019
[ 5.917774] Call Trace:
[ 5.920905] dump_stack+0x46/0x60
[ 5.924044] wait_for_random_bytes.part.32+0x21/0x163
[ 5.927256] ? handle_mm_fault+0x50/0xc0
[ 5.930425] ? _raw_spin_unlock_irq+0x17/0x40
[ 5.933604] ? __do_page_fault+0x225/0x500
[ 5.936763] __x64_sys_getrandom+0x70/0xb0
[ 5.939902] do_syscall_64+0x4c/0x180
[ 5.943003] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 5.946152] RIP: 0033:0x7f4417f4d495
[ 5.949225] Code: 74 4c 8d 0c 37 41 ba 3e 01 00 00 66 2e 0f 1f 84 00 00 00 00 00 4d 39 c8 73 27 4c 89 ce 31 d2 4c 89 c7 44 89 d0 4c 29 c6 0f 05 <48> 3d 00 f0 ff ff 77 2b 48 85 c0 78 0e 74 3c 49 01 c0 4d 39 c8 72
[ 5.952902] RSP: 002b:00007ffc69e6e328 EFLAGS: 00000202 ORIG_RAX: 000000000000013e
[ 5.956227] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 00007f4417f4d495
[ 5.959530] RDX: 0000000000000000 RSI: 0000000000000020 RDI: 0000559262c74780
[ 5.962820] RBP: 0000559262c708b0 R08: 0000559262c74780 R09: 0000559262c747a0
[ 5.966104] R10: 000000000000013e R11: 0000000000000202 R12: 00007ffc69e6e470
[ 5.969373] R13: 0000000000000002 R14: 00007f4417f4d460 R15: 000000007fffffff
[ 7.852837] random: crng init done
[ 7.854637] random: 3 urandom warning(s) missed due to ratelimiting
[ 17.767786] elogind-daemon[1730]: New session c1 of user root.

so sshd does getrandom(2) while those other userspace things don't. Oh
well.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette