Re: Failover root devices

From: Austin S Hemmelgarn
Date: Thu Sep 17 2015 - 12:02:28 EST


On 2015-09-16 20:16, Drew DeVault wrote:
I would like to see Linux support multiple root devices, so that it can
attempt one and move on to the next if it is not present. I've reviewed
the relevant code during boot-up and it seems like a good place for me
to submit my first patch, but I want to bring it up for discussion here
on LKML first.

The design I had in mind is something like this:

root=device;device;device;...

Where 'device' follows the current format (/dev/sdX, UUIDs, and so on,
via name_to_dev_t). I would modify prepare_namespace to iterate through
each offered root device until one works.
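
To make the proposed syntax concrete, here is a minimal user-space sketch in Python of splitting a `root=` value on `;` and picking the first candidate that is present. It is not kernel code: `split_root_devices`, `first_present`, and the `is_present` callback are hypothetical names used only for illustration, and each device string is treated as opaque (in the kernel it would be resolved by name_to_dev_t).

```python
def split_root_devices(root_arg):
    """Split a root= value of the proposed form dev;dev;dev into a list.

    Empty entries (for example from a trailing ';') are dropped.
    """
    return [dev.strip() for dev in root_arg.split(";") if dev.strip()]


def first_present(devices, is_present):
    """Return the first candidate for which is_present(dev) is true, else None.

    is_present is a hypothetical callback standing in for the real device lookup.
    """
    for dev in devices:
        if is_present(dev):
            return dev
    return None
```

With a value such as `/dev/sdb1;/dev/sda2`, the flash drive would be tried first and the on-disk partition only if the flash drive is missing.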

My use-case for this feature is that I would like to be able to change
the hardware of my machine and boot up differently based on what's
present. In my case, I would like to install my system normally, with
/boot on its own partition, and keep a separate userspace on a flash
drive. Then, during boot-up, if the flash drive is present, it would be
used as the root device. If it's not present, a partition on disk would
be selected.

I think this is an excellent idea. In addition to the above use-case, it would allow distros to automatically launch a recovery image if the main root device has failed for some reason.

That said, "failover" is probably not the best term for this. Many people associate it almost exclusively with online failover and high-availability setups, and trying to do something like that with the root file system is just asking for trouble (I'll be happy to go into specifics as to why if someone asks).

The only potential roadblock with this feature that comes to mind is
figuring out how to handle time-outs between root devices. I think it
would be wise to choose a sensible default value, and provide another
cmdline parameter to tweak it. The prepare_namespace flow might end up
looking something like this:

1. Wait rootdelay seconds
2. Check 1st device, not present
3. Recheck 1st device until rootfailoverdelay seconds has passed
4. Move on to 2nd device, present -> boot

Or:

1. Wait rootdelay seconds
2. Check 1st device, not present
3. Recheck 1st device until rootfailoverdelay seconds has passed
4. Move on to 2nd device, not present
5. Recheck 2nd device until rootfailoverdelay seconds has passed
6. GOTO 2

And so on.

As for this, I'd say default to the first method and provide an option to switch to the second (both have practical uses).

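The two flows above can be sketched in user space roughly as follows. This is a Python simulation under stated assumptions, not the prepare_namespace implementation: `pick_root`, `is_present`, `cycle`, and `poll` are hypothetical names, and only `rootdelay` and `rootfailoverdelay` come from the proposal. With `cycle=False` the list is walked once (first method); with `cycle=True` it starts over at the first device after the last one times out (second method).

```python
import time


def pick_root(devices, is_present, failover_delay, root_delay=0.0,
              cycle=False, poll=0.1, now=time.monotonic, sleep=time.sleep):
    """Return the first device that shows up, or None.

    failover_delay plays the role of the proposed rootfailoverdelay.
    now and sleep are injectable so the logic can be tested with a fake clock.
    """
    if not devices:
        return None
    if root_delay:
        sleep(root_delay)              # step 1: wait rootdelay seconds
    while True:
        for dev in devices:
            deadline = now() + failover_delay
            while True:
                if is_present(dev):    # check (or recheck) this device
                    return dev
                if now() >= deadline:  # give up on it, move to the next one
                    break
                sleep(poll)
        if not cycle:
            return None                # first method: one pass only
        # second method: fall through and start again at the first device
```

Note that with `cycle=True` the function only returns once some device appears, which matches the "GOTO 2" step in the second flow.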
I also need to research how the various init systems interact with this
part of the boot process. I suspect systemd probably does something
silly wrt waiting for the root device. Since this feature would (of
course) be backwards compatible, it might be wise to just implement it
here and let the init systems add support for the feature themselves.

If you're using an initramfs (which, from what I understand, is a requirement for using systemd), then this could be done entirely in the initramfs. The issue with that is that there is no standard syntax for doing it, and no way to do it without an initramfs (both of which would be nice to have).

Advice? Who should I send my patches to when they're ready? Please CC
me, I do not subscribe to LKML.

Use scripts/get_maintainer.pl (or just check the MAINTAINERS file directly) to determine this, but make sure to Cc at least LKML for the changes as well.
