RE: [PATCH] PCI: hv: use effective affinity mask

From: Jake Oshins
Date: Wed Nov 01 2017 - 16:53:07 EST


> -----Original Message-----
> From: Dexuan Cui
> Sent: Wednesday, November 1, 2017 1:31 PM
> To: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; linux-pci@xxxxxxxxxxxxxxx; Jake
> Oshins <jakeo@xxxxxxxxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
> Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Haiyang
> Zhang <haiyangz@xxxxxxxxxxxxx>; Jork Loeser
> <Jork.Loeser@xxxxxxxxxxxxx>; Chris Valean (Cloudbase Solutions SRL) <v-
> chvale@xxxxxxxxxxxxx>; Adrian Suhov (Cloudbase Solutions SRL) <v-
> adsuho@xxxxxxxxxxxxx>; Simon Xiao <sixiao@xxxxxxxxxxxxx>; 'Eyal
> Mizrachi' <eyalmi@xxxxxxxxxxxx>; Jack Morgenstein
> <jackm@xxxxxxxxxxxx>; Armen Guezalian <armeng@xxxxxxxxxxxx>; Firas
> Mahameed <firas@xxxxxxxxxxxx>; Tziporet Koren
> <tziporet@xxxxxxxxxxxx>; Daniel Jurgens <danielj@xxxxxxxxxxxx>
> Subject: [PATCH] PCI: hv: use effective affinity mask
>
>
> The effective_affinity_mask is always set when an interrupt is assigned in
> __assign_irq_vector() -> apic->cpu_mask_to_apicid(), e.g. for struct apic
> apic_physflat: -> default_cpu_mask_to_apicid() ->
> irq_data_update_effective_affinity(), but it looks like d->common->affinity
> remains all-1's until user space or the kernel changes it later.
>
> In the early allocation/initialization phase of an irq, we should use the
> effective_affinity_mask; otherwise Hyper-V may not deliver the interrupt to
> the expected CPU. Without the patch, if we assign 7 Mellanox ConnectX-3
> VFs to a 32-vCPU VM, one of the VFs may fail to receive interrupts.
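
To illustrate the mismatch described above, here is a minimal userspace sketch. It is not kernel code: the toy_* names, the 64-bit mask type, and the "pick the first CPU in the mask" consumer are all assumptions made for illustration. It only models the idea that the requested affinity can still be all-1's while the effective affinity already names the single CPU where the vector was set up.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is allowed. */
typedef uint64_t toy_cpumask;

/* Hypothetical model of the two masks kept for one interrupt. */
struct toy_irq {
	toy_cpumask affinity;           /* requested: where the irq may go */
	toy_cpumask effective_affinity; /* actual: where the vector was set up */
};

/* Index of the lowest set bit, or -1 for an empty mask. */
int toy_first_cpu(toy_cpumask mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (mask & ((toy_cpumask)1 << cpu))
			return cpu;
	return -1;
}

/*
 * Model the early allocation phase: the requested affinity covers every
 * CPU, while the vector itself was assigned on exactly one CPU.
 */
struct toy_irq toy_irq_alloc(unsigned int nr_cpus, int vector_cpu)
{
	struct toy_irq irq;

	assert(nr_cpus >= 1 && nr_cpus <= 64);
	assert(vector_cpu >= 0 && (unsigned int)vector_cpu < nr_cpus);

	/* Avoid shifting by 64, which is undefined for a 64-bit type. */
	irq.affinity = (nr_cpus == 64) ? ~(toy_cpumask)0
				       : (((toy_cpumask)1 << nr_cpus) - 1);
	irq.effective_affinity = (toy_cpumask)1 << vector_cpu;
	return irq;
}
```

For a 32-CPU guest whose vector landed on CPU 5, i.e. toy_irq_alloc(32, 5), toy_first_cpu() on the affinity field returns 0, while on the effective_affinity field it returns 5. A consumer that derives its target from the requested mask would therefore aim at a CPU where no vector was set up.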
>
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Cc: Jake Oshins <jakeo@xxxxxxxxxxxxx>
> Cc: Jork Loeser <jloeser@xxxxxxxxxxxxx>
> Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
> Cc: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
> ---
>
> Please consider this for v4.14, if it's not too late.
>
> drivers/pci/host/pci-hyperv.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 5ccb47d..8b5f66d 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -879,7 +879,7 @@ static void hv_irq_unmask(struct irq_data *data)
> int cpu;
> u64 res;
>
> - dest = irq_data_get_affinity_mask(data);
> + dest = irq_data_get_effective_affinity_mask(data);
> pdev = msi_desc_to_pci_dev(msi_desc);
> pbus = pdev->bus;
> hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
> @@ -1042,6 +1042,7 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> struct hv_pci_dev *hpdev;
> struct pci_bus *pbus;
> struct pci_dev *pdev;
> + struct cpumask *dest;
> struct compose_comp_ctxt comp;
> struct tran_int_desc *int_desc;
> struct {
> @@ -1056,6 +1057,7 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> int ret;
>
> pdev = msi_desc_to_pci_dev(irq_data_get_msi_desc(data));
> + dest = irq_data_get_effective_affinity_mask(data);
> pbus = pdev->bus;
> hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
> hpdev = get_pcichild_wslot(hbus, devfn_to_wslot(pdev->devfn));
> @@ -1081,14 +1083,14 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> switch (pci_protocol_version) {
> case PCI_PROTOCOL_VERSION_1_1:
> size = hv_compose_msi_req_v1(&ctxt.int_pkts.v1,
> - irq_data_get_affinity_mask(data),
> + dest,
> hpdev->desc.win_slot.slot,
> cfg->vector);
> break;
>
> case PCI_PROTOCOL_VERSION_1_2:
> size = hv_compose_msi_req_v2(&ctxt.int_pkts.v2,
> - irq_data_get_affinity_mask(data),
> + dest,
> hpdev->desc.win_slot.slot,
> cfg->vector);
> break;
> --
> 2.7.4

Signed-off-by: Jake Oshins <jakeo@xxxxxxxxxxxxx>