Re: [PATCH v3 0/5] RISC-V: Add kexec/kdump support

From: Nick Kossifidis
Date: Fri Apr 09 2021 - 06:10:32 EST


On 2021-04-07 19:29, Rob Herring wrote:
> On Mon, Apr 05, 2021 at 11:57:07AM +0300, Nick Kossifidis wrote:
> > This patch series adds kexec/kdump and crash kernel
> > support on RISC-V. For testing the patches a patched
> > version of kexec-tools is needed (still a work in
> > progress) which can be found at:
> >
> > https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
> >
> > v3:
> > * Rebase on newer kernel tree
> > * Minor cleanups
> > * Split UAPI changes to a separate patch
> > * Improve / cleanup init_resources
> > * Resolve Palmer's comments
> >
> > v2:
> > * Rebase on newer kernel tree
> > * Minor cleanups
> > * Properly populate the ioresources tree, so that it
> > can be used later on for implementing strict /dev/mem
> > * Use linux,usable-memory on /memory instead of a new binding
>
> Where? In any case, that's not going to work well with EFI support
> assuming like arm64, 'memory' is passed in UEFI structures instead of
> DT. That's why there's now a /chosen linux,usable-memory-ranges
> property.


Here:
https://elixir.bootlin.com/linux/v5.12-rc5/source/drivers/of/fdt.c#L1001

The "linux,usable-memory" binding is already defined and is handled by
early_init_dt_scan_memory(), which we call from mm/init.c to determine
the system's memory layout. It's simple and clean, and I don't see a
reason to use another binding on /chosen and add extra code for it when
early_init_dt_scan_memory() already handles this anyway. As for EFI,
even when it's enabled we still use the DT to determine the system's
memory layout, not EFI structures. I also don't see how EFI is relevant
here: the bootloader in kexec's case is Linux, not EFI. BTW the /memory
node is mandatory in any case and should exist in the DT regardless of
EFI. The /chosen node, on the other hand, is in general optional, and
we can still boot a riscv system without it (we only require it for the
built-in cmdline to work).
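
To illustrate the lookup order I'm referring to, here is a minimal sketch
in Python that models how the scan picks a property on a memory node. It is
only a model: the dict-based node, the pick_memory_ranges() name and the
addresses are made up for illustration, and the real code in
drivers/of/fdt.c works on the flattened device tree, not on dicts.

```python
# Simplified model of the property selection done for memory nodes in
# early_init_dt_scan_memory(). The dict-based "node" is an illustration
# only, not a kernel data structure.

def pick_memory_ranges(node):
    """Return (property_name, ranges) for a memory node, or None.

    Models the fallback order: prefer "linux,usable-memory" and fall
    back to "reg" only if the former is absent.
    """
    if node.get("device_type") != "memory":
        return None  # not a memory node, so the scan skips it
    for prop in ("linux,usable-memory", "reg"):
        ranges = node.get(prop)
        if ranges is not None:
            return prop, ranges
    return None  # memory node without any range property


# Normal boot: only "reg" is present, so the whole bank is used.
normal = {"device_type": "memory", "reg": [(0x80000000, 0x40000000)]}

# Crash kernel: the userspace tools add "linux,usable-memory" to confine
# the new kernel to the reserved region (hypothetical addresses).
crash = dict(normal)
crash["linux,usable-memory"] = [(0x90000000, 0x08000000)]

print(pick_memory_ranges(normal))
print(pick_memory_ranges(crash))
```

The point is that no extra parsing is needed on the kernel side: when the
property is present on /memory it simply takes precedence over "reg".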

Also, a simple grep for "linux,usable-memory-ranges" on the latest kernel
sources didn't return anything, and there is nothing in chosen.txt either.
Where is that binding documented/implemented?

> Isn't the preferred kexec interface the file based interface? I'd
> expect a new arch to only support that. And there's common kexec DT
> handling for that pending for 5.13.


Both approaches have their pros and cons, which is why both are available.
In no way is CONFIG_KEXEC deprecated in favor of CONFIG_KEXEC_FILE, at least
not as far as I know. The main point of the file-based syscall is to support
secure boot: the image is loaded by the kernel directly, without any
processing by the userspace tools, so it can be pre-signed by the kernel's
"vendor". On the other hand, the kernel part is more complicated and you
can't pass a new device tree. The kernel needs to re-use the existing one
(or modify it in-kernel), and you can only override the cmdline.

This doesn't work for our use cases at FORTH, where we use kexec not only to
re-boot our systems but also to boot into a system with a different hw layout
(e.g. FPGA prototypes or systems with FPGAs on the side). Device tree overlays
don't cover our use cases either. To give you an idea, we can add/remove/modify
devices, move them to another region etc, and still use kexec to avoid going
through the full boot cycle. We just unload their drivers, perform a full or
partial re-programming of the FPGA from within Linux, and kexec to the new
system with the new device tree. The file-based syscall can't cover this
scenario. In general it's less flexible, and it's only there for secure boot,
not for using custom-built kernels or custom device tree images.

Security-wise, the file load syscall provides guarantees for integrity and
authenticity, but depending on the kernel "vendor"'s infrastructure and
signing process it may still allow e.g. loading an older/vulnerable kernel
through kexec and getting away with it. As far as I know there is no check
to make sure the loaded kernel is at least as new as the running kernel. The
assumption is that the "vendor" will use a different signing key/cert for
each kernel and that you'll kexec to a kernel/crash kernel that's the same
version as the running one. Until we have clear guidelines on how this is
meant to be used and have a discussion on secure boot within RISC-V (we have
something on the TEE TG but we'll probably switch to a SIG committee for
this), I don't see how this feature is a priority compared to the more
generic CONFIG_KEXEC.

Regards,
Nick