Re: [PATCH] do not clean dummy variable in kexec path

From: Dave Young
Date: Thu Aug 08 2019 - 03:49:13 EST


On 08/05/19 at 06:55pm, Ard Biesheuvel wrote:
> On Mon, 5 Aug 2019 at 11:36, Dave Young <dyoung@xxxxxxxxxx> wrote:
> >
> > kexec reboot fails randomly in a UEFI-based KVM guest. The firmware
> > simply resets while calling efi_delete_dummy_variable(). Unfortunately
> > I don't know how to debug the firmware. It is potentially a problem
> > on real hardware as well, although nobody has reproduced it.
> >
> > The intention of efi_delete_dummy_variable is to trigger garbage collection
> > when entering virtual mode. But SetVirtualAddressMap can only run once
> > for each physical reboot, thus kexec_enter_virtual_mode is not necessarily
> > a good place to clean dummy object.
> >
>
> I would argue that this means it is not a good place to *create* the
> dummy variable, and if we don't create it, we don't have to delete it
> either.
>
> > Drop efi_delete_dummy_variable so that kexec reboot can work.
> >
>
> Creating it and not deleting it is bad, so please try and see if we
> can omit the creation on this code path instead.
>

Looking at the code for the dummy variable, it is created only in the chunk below:
arch/x86/platform/efi/quirks.c:
efi_query_variable_store():
[snip]
	/*
	 * We account for that by refusing the write if permitting it would
	 * reduce the available space to under 5KB. This figure was provided by
	 * Samsung, so should be safe.
	 */
	if ((remaining_size - size < EFI_MIN_RESERVE) &&
		!efi_no_storage_paranoia) {

		/*
		 * Triggering garbage collection may require that the firmware
		 * generate a real EFI_OUT_OF_RESOURCES error. We can force
		 * that by attempting to use more space than is available.
		 */
		unsigned long dummy_size = remaining_size + 1024;
		void *dummy = kzalloc(dummy_size, GFP_KERNEL);

		if (!dummy)
			return EFI_OUT_OF_RESOURCES;

		status = efi.set_variable((efi_char16_t *)efi_dummy_name,
					  &EFI_DUMMY_GUID,
					  EFI_VARIABLE_NON_VOLATILE |
					  EFI_VARIABLE_BOOTSERVICE_ACCESS |
					  EFI_VARIABLE_RUNTIME_ACCESS,
					  dummy_size, dummy);

		if (status == EFI_SUCCESS) {
			/*
			 * This should have failed, so if it didn't make sure
			 * that we delete it...
			 */
			efi_delete_dummy_variable();
		}

[snip]

So the dummy variable is only created when the if condition matches, and
if the creation succeeds it is deleted right away. The delete on entering
virtual mode is therefore always deleting a non-existent EFI variable.
Please correct me if I am missing something.

If the above is true, then at least the call in the kexec path can be
dropped, because we have a real bug there which resets the machine.
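
To make the reasoning concrete, here is a rough userspace sketch of the
flow as I read it. It is not kernel code: every mock_* name, the struct
and the accepts_oversize flag are made up for illustration, and it only
models the branch quoted above, nothing else in quirks.c. The reserve
check is rearranged to avoid the unsigned subtraction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for EFI_MIN_RESERVE: the 5KB from the quoted comment. */
#define MOCK_MIN_RESERVE 5120UL

enum mock_status { MOCK_SUCCESS, MOCK_OUT_OF_RESOURCES, MOCK_NOT_FOUND };

/* Toy model of a firmware variable store (an assumption, not a real API). */
struct mock_store {
	unsigned long remaining;	/* free bytes left in the store */
	bool dummy_present;		/* does the dummy variable exist? */
	bool accepts_oversize;		/* firmware that wrongly accepts the write */
};

/* Models the oversized SetVariable call: it is expected to fail. */
static enum mock_status mock_set_dummy(struct mock_store *s, unsigned long size)
{
	if (size > s->remaining && !s->accepts_oversize)
		return MOCK_OUT_OF_RESOURCES;
	s->dummy_present = true;
	return MOCK_SUCCESS;
}

/* Models efi_delete_dummy_variable(): deleting a missing variable is a no-op. */
static enum mock_status mock_delete_dummy(struct mock_store *s)
{
	if (!s->dummy_present)
		return MOCK_NOT_FOUND;
	s->dummy_present = false;
	return MOCK_SUCCESS;
}

/* Models only the quoted branch: create on low space, delete if it succeeded. */
static void mock_reserve_check(struct mock_store *s, unsigned long size)
{
	assert(s != NULL);

	/* Same test as remaining - size < reserve, without unsigned underflow. */
	if (s->remaining < size + MOCK_MIN_RESERVE) {
		unsigned long dummy_size = s->remaining + 1024;

		if (mock_set_dummy(s, dummy_size) == MOCK_SUCCESS)
			mock_delete_dummy(s);	/* should have failed, clean up */
	}
}
```

In this model, whichever way the oversized write goes, no dummy variable
is left behind once mock_reserve_check() returns, so a later
mock_delete_dummy() only ever reports MOCK_NOT_FOUND. That is the case
the kexec path hits.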

Thanks
Dave