Re: [RFC PATCH 00/56] Dynamic mitigations

From: Alexander Graf

Date: Wed Oct 15 2025 - 05:14:28 EST

On 14.10.25 20:06, Kaplan, David wrote:


-----Original Message-----
From: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Sent: Tuesday, October 14, 2025 11:29 AM
To: Kaplan, David <David.Kaplan@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Borislav Petkov <bp@xxxxxxxxx>; Peter
Zijlstra <peterz@xxxxxxxxxxxxx>; Pawan Gupta
<pawan.kumar.gupta@xxxxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; Dave
Hansen <dave.hansen@xxxxxxxxxxxxxxx>; x86@xxxxxxxxxx; H . Peter Anvin
<hpa@xxxxxxxxx>; Alexander Graf <graf@xxxxxxxxxx>; Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: [RFC PATCH 00/56] Dynamic mitigations


On Mon, Oct 13, 2025 at 09:33:48AM -0500, David Kaplan wrote:

Dynamic mitigations enables changing the kernel CPU security mitigations at
runtime without a reboot/kexec.

Previously, mitigation choices had to be made on the kernel cmdline. With
this feature an administrator can select new mitigation choices by writing
a sysfs file, after which the kernel will re-patch itself based on the new
mitigations.

As the performance cost of CPU mitigations can be significant, selecting
the right set of mitigations is important to achieve the correct balance of
performance/security.

Use
---
As described in the supplied documentation file, new mitigations are
selected by writing cmdline options to a new sysfs file. Only cmdline
options related to mitigations are recognized via this interface. All
previous mitigation-related cmdline options are ignored and selections are
done based on the new options.

Examples:
echo "mitigations=off" > /sys/devices/system/cpu/mitigations
echo "spectre_v2=retpoline tsa=off" > /sys/devices/system/cpu/mitigations
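To see what such a write changed, one option is to dump the kernel's existing per-vulnerability status files under /sys/devices/system/cpu/vulnerabilities before and after. A minimal sketch follows; the helper name show_mitigation_status is hypothetical, and it is an assumption (not stated in this thread) that those status files are updated after a runtime re-patch.

```shell
#!/bin/sh
# show_mitigation_status [DIR]
# Print "name: status" for every regular file in DIR.
# DIR defaults to the kernel's existing per-vulnerability sysfs directory.
# Hypothetical helper for illustration; not part of the patch series.
show_mitigation_status() {
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (also covers an empty dir,
        # where the glob stays unexpanded).
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Running show_mitigation_status once before and once after one of the echo commands above, and diffing the two outputs, would show which mitigations the write affected.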

There are several use cases that will benefit from dynamic mitigations:

Use Cases
---------
1. Runtime Policy

Some workflows rely on booting a generic kernel before customizing the system.
cloud-init is a popular example: a VM is typically started with default
settings and is then customized based on a customer-provided configuration
file.

I'm not really a fan of this. It adds complexity to some areas that are
already struggling with too much complexity.

IMO this would need some REALLY strong justification, more than just
"hey, this makes things more convenient."

The mitigations should be a "set it and forget it" thing. I don't see
anything here which justifies the considerable maintenance burden this
would add for all existing and future mitigations.

The problem is that there are environments, like the one outlined, where you can't just 'set it and forget it': the kernel needs the mitigations set at boot time, but in these environments you don't know how to configure the system until much later in boot. So you end up running with the default settings all the time, even if you don't need them, and the default settings can have a significant performance impact in many cases.

The cloud guys on this thread may be able to offer some additional color here since I believe that's where you're most likely to have this situation.

The crux of the problem here is that the kernel command line is difficult to influence in most cloud environments.

In the cloud, you typically start from a generic base image which boots and then talks to a configuration mechanism (IMDS in EC2) that supplies all of the actual customization. For most customization, that is perfectly fine: you can install packages, run scripts, launch services, etc. But there is no simple way to modify the kernel command line. The story gets even worse when you try to abstract the cloud environment itself by using configuration layers on top like Puppet, Ansible, Salt, etc., because you could not do a pre-boot environment hack even if you wanted to.

Users could in theory have a bootup script which checks the current command line, modifies the boot loader configuration, regenerates the boot loader config files (if they even can; signed UKIs make that difficult), and then reboots or kexecs into the new environment. So we would punt a *lot* of complexity onto users and still degrade their experience by prolonging the launch phase.

I'm all ears for alternatives, but runtime setting seems like the most natural way to allow bootup / configuration scripts to actually instill policy.
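For illustration, a configuration hook using the runtime interface could be as small as the sketch below. It assumes the sysfs file proposed in this RFC (/sys/devices/system/cpu/mitigations); the function name apply_mitigations and its return codes are hypothetical, and how the kernel reports a rejected option string is an assumption here.

```shell
#!/bin/sh
# apply_mitigations OPTIONS [CONTROL_FILE]
# Write cmdline-style mitigation options to the control file proposed in
# this RFC. Hypothetical helper; return codes are this sketch's own choice:
#   0 = written, 1 = control file missing or not writable, 2 = write failed.
apply_mitigations() {
    opts="$1"
    ctl="${2:-/sys/devices/system/cpu/mitigations}"
    # On a kernel without the series the file does not exist, so fall back
    # to whatever was selected at boot.
    if [ ! -w "$ctl" ]; then
        echo "dynamic mitigations not available at $ctl" >&2
        return 1
    fi
    # Assumption: a rejected option string surfaces as a failed write.
    if ! printf '%s\n' "$opts" > "$ctl"; then
        echo "write failed for: $opts" >&2
        return 2
    fi
    return 0
}
```

A cloud-init or Ansible step would then call something like apply_mitigations "spectre_v2=retpoline tsa=off" once the instance's policy is known, with no boot loader changes and no reboot.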

Alex
