Re: [PATCH v3 3/6] x86/kexec: Disable kexec/kdump on platforms with TDX partial write erratum
From: Huang, Kai
Date: Wed Jul 02 2025 - 04:44:55 EST
On Wed, 2025-07-02 at 16:25 +0800, Gao, Chao wrote:
> On Thu, Jun 26, 2025 at 10:48:49PM +1200, Kai Huang wrote:
> > Some early TDX-capable platforms have an erratum: A kernel partial
> > write (a write transaction of less than cacheline lands at memory
> > controller) to TDX private memory poisons that memory, and a subsequent
> > read triggers a machine check.
> >
> > On those platforms, the old kernel must reset TDX private memory before
> > jumping to the new kernel, otherwise the new kernel may see unexpected
> > machine check. Currently the kernel doesn't track which page is a TDX
> > private page. For simplicity just fail kexec/kdump for those platforms.
>
> My understanding is that the kdump kernel uses a small amount of memory
> reserved at boot, which the crashed kernel never accesses. And the kdump
> kernel reads the memory of the crashed kernel and doesn't overwrite it.
> So it should be safe to allow kdump (i.e., no partial write to private
> memory). Anything I missed?
>
> (I am not asking to enable kdump in *this* series; I'm just trying to
> understand the rationale behind disabling kdump)
As you said, it *should* be safe: the kdump kernel should only read TDX
private memory, never write to it. But I cannot say I am 100% sure (many
things are involved in generating the kdump file, such as memory
compression), so in internal discussion we thought we should just disable
it.