Re: [PATCH v8 2/5] x86/tdx: Add TDX Guest event notify interrupt support

From: Kai Huang
Date: Thu Jul 14 2022 - 06:43:07 EST


On Wed, 2022-07-13 at 17:46 -0700, Sathyanarayanan Kuppuswamy wrote:
> Hi Kai/Dave,
>
> On 6/27/22 4:21 AM, Kai Huang wrote:
> > On Sat, 2022-06-25 at 15:35 +1200, Yao, Jiewen wrote:
> > > Thank you, Jun.
> > >
> > > Yes. I confirmed that we will include below change to GHCI.next spec.
> > >
> > > ================
> > > 3.5 TDG.VP.VMCALL<SetupEventNotifyInterrupt>
> > >
> > > From: "The host VMM should use SEAMCALL [TDWRVPS] leaf to inject an interrupt at the requested-interrupt vector into the TD via the posted-interrupt descriptor. "
> > >
> > > To: "The host VMM should use SEAMCALL [TDWRVPS] leaf to inject an interrupt at the requested-interrupt vector into the TD VCPU that executed TDG.VP.VMCALL <SetupEventNotifyInterrupt> via the posted-interrupt descriptor. "
> > >
> >
> > Hi Sathy,
> >
> > With this change, I don't think we should use system vector anymore. Instead,
> > we just need one non-migratable IRQ which has a fixed vector on a fixed cpu.
> >
>
> Thanks. As suggested, I have attempted to allocate IRQ vector at runtime
> using irq_domain_alloc_irqs() call. Vector is allocated from
> "x86_vector_domain" as Kai suggested.

I am not an expert either. I said "idea only" in my first reply :)

>
> Since I am not well versed in this area, I would like expert comments on it.
> Mainly for IRQ allocation logic in tdx_late_init(). I have tested this version using
> QEMU and it works fine.
>
>
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 928dcf7a20d9..dcc878546574 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -5,12 +5,16 @@
> #define pr_fmt(fmt) "tdx: " fmt
>
> #include <linux/cpufeature.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/numa.h>
> #include <asm/coco.h>
> #include <asm/tdx.h>
> #include <asm/vmx.h>
> #include <asm/insn.h>
> #include <asm/insn-eval.h>
> #include <asm/pgtable.h>
> +#include <asm/irqdomain.h>
>
> /* TDX module Call Leaf IDs */
> #define TDX_GET_INFO 1
> @@ -19,6 +23,7 @@
>
> /* TDX hypercall Leaf IDs */
> #define TDVMCALL_MAP_GPA 0x10001
> +#define TDVMCALL_SETUP_NOTIFY_INTR 0x10004
>
> /* MMIO direction */
> #define EPT_READ 0
> @@ -34,6 +39,26 @@
> #define VE_GET_PORT_NUM(e) ((e) >> 16)
> #define VE_IS_IO_STRING(e) ((e) & BIT(4))
>
> +/*
> + * Handler used to report notifications about
> + * TDX_GUEST_EVENT_NOTIFY_VECTOR IRQ. Currently it will be
> + * used only by the attestation driver. So, race condition
> + * with read/write operation is not considered.
> + */
> +static void (*tdx_event_notify_handler)(void);
> +
> +/* Helper function to register tdx_event_notify_handler */
> +void tdx_setup_ev_notify_handler(void (*handler)(void))
> +{
> + tdx_event_notify_handler = handler;
> +}
> +
> +/* Helper function to unregister tdx_event_notify_handler */
> +void tdx_remove_ev_notify_handler(void)
> +{
> + tdx_event_notify_handler = NULL;
> +}
> +

It looks odd that you still need this. Couldn't we pass the function that
handles GetQuote directly to request_irq()?
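
To illustrate the shape I mean, here is a userspace toy only, not kernel code.
All toy_* names are made up for this sketch, and the GetQuote handler is a
hypothetical placeholder that just counts events. The point is that the
consumer hands its handler to the request call directly, so there is no global
function pointer that could be NULL when the interrupt fires:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's irqreturn_t values. */
enum toy_irqreturn { TOY_IRQ_NONE = 0, TOY_IRQ_HANDLED = 1 };

typedef enum toy_irqreturn (*toy_handler_t)(int irq, void *dev_id);

#define TOY_NR_IRQS 16

struct toy_irqaction {
	toy_handler_t handler;
	void *dev_id;
};

static struct toy_irqaction toy_actions[TOY_NR_IRQS];

/* Models request_irq(): the consumer passes its own handler directly. */
int toy_request_irq(int irq, toy_handler_t handler, void *dev_id)
{
	if (irq < 0 || irq >= TOY_NR_IRQS || !handler)
		return -1;
	if (toy_actions[irq].handler)
		return -1;	/* line already claimed */
	toy_actions[irq].handler = handler;
	toy_actions[irq].dev_id = dev_id;
	return 0;
}

/* Models delivery: an unclaimed line is reported as not handled. */
enum toy_irqreturn toy_dispatch(int irq)
{
	if (irq < 0 || irq >= TOY_NR_IRQS || !toy_actions[irq].handler)
		return TOY_IRQ_NONE;
	return toy_actions[irq].handler(irq, toy_actions[irq].dev_id);
}

/* Hypothetical GetQuote completion handler: counts events via dev_id. */
enum toy_irqreturn toy_quote_handler(int irq, void *dev_id)
{
	(void)irq;
	assert(dev_id != NULL);
	*(int *)dev_id += 1;
	return TOY_IRQ_HANDLED;
}
```

In the real patch this would presumably mean the attestation driver calls
request_irq() itself with its own handler, and tdx.c only exports the IRQ
number, but that split is my assumption.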

> /*
> * Wrapper for standard use of __tdx_hypercall with no output aside from
> * return code.
> @@ -98,6 +123,31 @@ static inline void tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
> panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
> }
>
> +/*
> + * tdx_hcall_set_notify_intr() - Setup Event Notify Interrupt Vector.
> + *
> + * @vector: Vector address to be used for notification.
> + *
> + * return 0 on success or failure error number.
> + */
> +static long tdx_hcall_set_notify_intr(u8 vector)
> +{
> + /* Minimum vector value allowed is 32 */
> + if (vector < 32)
> + return -EINVAL;
> +
> + /*
> + * Register callback vector address with VMM. More details
> + * about the ABI can be found in TDX Guest-Host-Communication
> + * Interface (GHCI), sec titled
> + * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
> + */
> + if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, vector, 0, 0, 0))
> + return -EIO;
> +
> + return 0;
> +}
> +
> static u64 get_cc_mask(void)
> {
> struct tdx_module_output out;
> @@ -775,3 +825,52 @@ void __init tdx_early_init(void)
>
> pr_info("Guest detected\n");
> }
> +
> +static irqreturn_t tdx_ev_handler(int irq, void *dev_id)
> +{
> + tdx_event_notify_handler();
> + return IRQ_HANDLED;
> +}
> +
> +static int __init tdx_late_init(void)
> +{
> + struct irq_alloc_info info;
> + struct irq_cfg *cfg;
> + int evirq, cpu;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> + return 0;
> +
> + if (!x86_vector_domain) {
> + pr_err("x86 vector domain is NULL\n");
> + return 0;
> + }
> +
> + init_irq_alloc_info(&info, NULL);
> +
> + evirq = irq_domain_alloc_irqs(x86_vector_domain, 1, NUMA_NO_NODE, &info);

If I read the code correctly, if you set info.mask to cpumask_of(cpu) and pass
&info to irq_domain_alloc_irqs(), the x86_vector_domain alloc callback will
immediately allocate a vector on the given cpu. Then you don't need to call
irq_set_affinity() and wait until request_irq() for the vector to be allocated
on _this_ cpu.
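
A userspace toy model of the behaviour I am describing (all toy_* names and
sizes are invented for illustration, and this is my reading of the vector
domain, not a copy of it): with a target cpu given at allocation time the
vector is assigned right away, without one the assignment is deferred.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS      4
#define TOY_FIRST_VECTOR 32	/* vectors below 32 are not usable here */
#define TOY_NR_VECTORS   256

/* Per-cpu "vector in use" table, one byte per vector for simplicity. */
static unsigned char toy_used[TOY_NR_CPUS][TOY_NR_VECTORS];

struct toy_irq {
	int cpu;	/* -1 until a vector has been assigned */
	int vector;	/* -1 until a vector has been assigned */
};

/* Take the lowest free vector on one cpu; returns -1 if none is left. */
static int toy_assign_vector(struct toy_irq *irq, int cpu)
{
	int v;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	for (v = TOY_FIRST_VECTOR; v < TOY_NR_VECTORS; v++) {
		if (!toy_used[cpu][v]) {
			toy_used[cpu][v] = 1;
			irq->cpu = cpu;
			irq->vector = v;
			return 0;
		}
	}
	return -1;
}

/*
 * Models the allocation step: mask_cpu plays the role of info.mask.
 * With a mask the vector is assigned immediately on that cpu; with
 * NULL the irq is only set up and the vector is left for later.
 */
int toy_alloc_irq(struct toy_irq *irq, const int *mask_cpu)
{
	irq->cpu = -1;
	irq->vector = -1;
	if (mask_cpu)
		return toy_assign_vector(irq, *mask_cpu);
	return 0;
}
```

With the vector known straight after allocation, the hypercall could be issued
with it on that same cpu without going through irq_set_affinity() first.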

> +
> + cpu = get_cpu();
> +
> + irq_set_handler(evirq, handle_edge_irq);
> +
> + /*
> + * Event notification vector will be delivered to the CPU
> + * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
> + * So set the IRQ affinity to the current CPU.
> + */
> + irq_set_affinity(evirq, cpumask_of(cpu));
> +
> + if (request_irq(evirq, tdx_ev_handler, 0, "tdx_evirq", NULL))
> + pr_err("Request event IRQ failed\n");
> +
> + cfg = irq_cfg(evirq);
> +
> + if (tdx_hcall_set_notify_intr(cfg->vector))
> + pr_err("Setting event notification interrupt failed\n");
> +
> + put_cpu();

So the IRQ's affinity isn't kernel managed. It also doesn't look like it has
anything such as IRQF_NOBALANCING to prevent it from being migrated. If I
understand correctly, userspace can still change the affinity, which could
cause the vector to be reallocated on another cpu?

Perhaps you can pass IRQF_NOBALANCING to request_irq()?
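
A userspace toy model of the concern (toy_* names and the flag value are
invented; as far as I understand, the real IRQF_NOBALANCING makes the kernel
refuse affinity changes requested from userspace, but please double check):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IRQF_NOBALANCING; the value is arbitrary. */
#define TOY_IRQF_NOBALANCING 0x1u

struct toy_irq_desc {
	unsigned int flags;
	int cpu;	/* cpu the vector currently lives on */
};

/* Models requesting the IRQ on a given cpu with a set of flags. */
void toy_setup_irq(struct toy_irq_desc *desc, int cpu, unsigned int flags)
{
	assert(desc != NULL);
	desc->cpu = cpu;
	desc->flags = flags;
}

/*
 * Models a userspace affinity change. Returns 0 if the IRQ moved,
 * -1 if the request was refused because balancing is disabled.
 */
int toy_set_affinity_usr(struct toy_irq_desc *desc, int new_cpu)
{
	assert(desc != NULL);
	if (desc->flags & TOY_IRQF_NOBALANCING)
		return -1;
	desc->cpu = new_cpu;
	return 0;
}
```

Without the flag the IRQ ends up on a cpu other than the one that issued
TDVMCALL_SETUP_NOTIFY_INTR, which is exactly the case the GHCI change above
does not cover.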


--
Thanks,
-Kai