Re: Linux 5.3-rc8

From: Ahmed S. Darwish
Date: Wed Sep 11 2019 - 17:41:55 EST


On Wed, Sep 11, 2019 at 05:45:38PM +0100, Linus Torvalds wrote:
> On Wed, Sep 11, 2019 at 5:07 PM Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
> > >
> > > Ted, comments? I'd hate to revert the ext4 thing just because it
> > > happens to expose a bad thing in user space.
> >
> > Unfortunately, I very much doubt this is going to work. That's
> > because the add_disk_randomness() path is only used for legacy
> > /dev/random [...]
> >
> > Also because, by default, the vast majority of disks have
> > /sys/block/XXX/queue/add_random set to zero.
>
> Gaah. I was looking at the input randomness, since I thought that was
> where the added randomness that Ahmed got things to work with came
> from.
>
> And that then made me just look at the legacy disk randomness (for the
> obvious disk IO reasons) and I didn't look further.
>

Yup, I confirm that the quick patch kept the situation as-is. I was
going to debug why, but now we know the answer.

> > So the way we get entropy these days for initializing the CRNG is
> > via the add_interrupt_randomness() path, where we do something really
> > fast, and we assume that we get enough uncertainty from 8 interrupts
> > to give us one bit of entropy (64 interrupts to give us a byte of
> > entropy), and that we need 512 bits of entropy to consider the CRNG
> > fully initialized. (Yeah, there's a lot of conservatism in those
> > estimates, and so what we could do is decide to say, cut down the
> > number of bits needed to initialize the CRNG to be 256 bits, since
> > that's the size of the CHACHA20 cipher.)
>
> So that's 4k interrupts if I counted right, and yeah, maybe Ahmed was
> just close enough before, and the merging of the inode table IO then
> took him below that limit.
>
> > Ultimately, though, we need to find *some* way to fix userspace's
> > assumptions that they can always get high quality entropy in early
> > boot, or we need to get over people's distrust of Intel and RDRAND.
>
> Well, even on a PC, sometimes rdrand just isn't there. AMD has screwed
> it up a few times, and older Intel chips just don't have it.
>
> So I'd be inclined to either lower the limit regardless -

ACK :)
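
For reference, the arithmetic behind the numbers quoted above, as a small
userspace sketch. The constant and function names are hypothetical and do
not correspond to identifiers in the kernel's random driver; the only
inputs are Ted's estimate of 8 interrupts per bit and the two thresholds
being discussed (512 and 256 bits).

```c
#include <stdio.h>

/*
 * Assumption taken from the estimate quoted above: 8 interrupts are
 * credited as one bit of entropy. The name is illustrative only.
 */
#define INTERRUPTS_PER_ENTROPY_BIT 8u

/* Number of interrupts needed before `entropy_bits` bits are credited. */
static unsigned int interrupts_needed(unsigned int entropy_bits)
{
	return entropy_bits * INTERRUPTS_PER_ENTROPY_BIT;
}

/* Print the interrupt count for a given CRNG initialization threshold. */
static void print_threshold(unsigned int entropy_bits)
{
	printf("%u bits -> %u interrupts\n",
	       entropy_bits, interrupts_needed(entropy_bits));
}
```

With these assumptions, interrupts_needed(512) is 4096 (the "4k" above),
and lowering the threshold to 256 bits would halve that to 2048.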

> and perhaps make the "user space asked for randomness much too
> early" be a big *warning* instead of being a basically fatal hung
> machine?

Hmmm, regarding "randomness request much too early", how much is time
really a factor here?

I tested leaving the machine even for 15+ minutes, and it still didn't
continue booting: the boot is practically blocked forever...

Or is the theory that hopefully once the machine is un-stuck, more
sources of entropy will be available? If that's the case, then
possibly (rate-limited):

"urandom: process XX asked for YY bytes. CRNG not yet initialized"
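
A rough userspace model of what "rate-limited" could mean here, just to
make the idea concrete. This is not kernel code: in the kernel one would
presumably reach for something like printk_ratelimited() instead. The
struct, the function names, and the interval/burst parameters are all
hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical rate limiter: at most `burst` messages per `interval`. */
struct warn_ratelimit {
	long interval;      /* window length, in seconds */
	int burst;          /* messages allowed per window */
	long window_start;  /* start time of the current window */
	int emitted;        /* messages emitted in the current window */
};

/* Returns true if a message may be emitted at time `now`. */
static bool warn_allowed(struct warn_ratelimit *rl, long now)
{
	/* Open a new window once the current one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->emitted = 0;
	}
	if (rl->emitted >= rl->burst)
		return false;
	rl->emitted++;
	return true;
}

/*
 * Formats the proposed warning into `buf`.
 * Returns false (and leaves `buf` untouched) if the message is suppressed.
 */
static bool warn_crng_not_ready(struct warn_ratelimit *rl, long now,
				int pid, size_t nbytes,
				char *buf, size_t buflen)
{
	if (!warn_allowed(rl, now))
		return false;
	snprintf(buf, buflen,
		 "urandom: process %d asked for %zu bytes. CRNG not yet initialized",
		 pid, nbytes);
	return true;
}
```

The point of the limiter is only that a process spinning on early
randomness requests does not flood the log; the first few requests in
each window are still reported.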

> Linus

thanks,

--
darwi
http://darwish.chasingpointers.com