Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time

From: Konrad Rzeszutek Wilk
Date: Wed May 22 2013 - 12:41:53 EST

On Wed, May 22, 2013 at 04:25:10PM +0100, Jan Beulich wrote:
> >>> On 22.05.13 at 17:14, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> > The physdev_unmap_pirq (from PHYSDEVOP_unmap_pirq), only has this
> > check:
> > if (domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND)
> >
> > and since the arch.hvm.emuirq is IRQ_UNBOUND (-1), it does not
> > call unmap_domain_pirq_emuirq. It probably shouldn't, but it should
> > at least remove the info->arch.pirq = PIRQ_ALLOCATED as we are
> > telling the hypervisor: "hey, I am done with this, return to the
> > pool." But since that is not cleared, the PHYSDEVOP_get_free_pirq
> > will skip this pirq as arch.pirq is still set to PIRQ_ALLOCATED.
> Okay, that clarifies it quite a bit. For one, I'll leave any of the
> emuirq stuff to Stefano, who wrote this originally. And then, from
> the beginning of this thread, I'm not convinced that freeing a pirq
> is really the right thing here: unmap_pirq() is the counterpart of
> map_pirq(), not get_free_pirq(). I would think that if a guest
> allocates a pirq and then unmaps it without first mapping it, it's
> the guest's fault that it now lost one pirq resource. It should not
> have allocated one in the first place if it didn't mean to use it for
> anything.

It does use it, but if you do run this in a loop:
rmmod e1000e;modprobe e1000e

it ends up doing these three hypercalls: PHYSDEVOP_get_free_pirq,
PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq and so on. The reason is that
drivers/xen/events.c keeps track of the Linux IRQ <-> PIRQ just as long
as needed - if the driver does a free_irq, well, then the mapping is
de-allocated and lost.
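
To make the failure mode concrete, here is a toy model of that loop. It is
only a sketch, not Xen code: the array, the toy_* names, the pool size of 4
and the numeric values of the two state constants are all assumptions made
for illustration. The only behaviour taken from the discussion above is that
the unmap leaves the slot marked PIRQ_ALLOCATED, so get_free_pirq skips it.

```c
#include <assert.h>

#define TOY_NR_PIRQS       4      /* deliberately tiny so exhaustion shows up fast */
#define TOY_PIRQ_FREE      0      /* assumed "free" marker for this toy */
#define TOY_PIRQ_ALLOCATED (-1)   /* assumed "handed out, not mapped" marker */

/* Toy stand-in for the per-pirq state the hypervisor keeps. */
static int toy_arch_pirq[TOY_NR_PIRQS];

/* Hand out the first slot that is still marked free, or -1 if none is. */
static int toy_get_free_pirq(void)
{
    for (int i = 0; i < TOY_NR_PIRQS; i++) {
        if (toy_arch_pirq[i] == TOY_PIRQ_FREE) {
            toy_arch_pirq[i] = TOY_PIRQ_ALLOCATED;
            return i;
        }
    }
    return -1;
}

/* Record a (made-up, positive) IRQ number for the slot. */
static void toy_map_pirq(int pirq, int irq)
{
    toy_arch_pirq[pirq] = irq;
}

/*
 * Drop the mapping. With release == 0 the slot stays marked allocated,
 * which is the behaviour described above; release == 1 shows what would
 * happen if the unmap returned the slot to the pool.
 */
static void toy_unmap_pirq(int pirq, int release)
{
    toy_arch_pirq[pirq] = release ? TOY_PIRQ_FREE : TOY_PIRQ_ALLOCATED;
}

/* Run `cycles` rmmod/modprobe rounds and count how many got a pirq. */
static int toy_count_successful_cycles(int cycles, int release)
{
    int ok = 0;

    for (int i = 0; i < TOY_NR_PIRQS; i++)
        toy_arch_pirq[i] = TOY_PIRQ_FREE;

    for (int i = 0; i < cycles; i++) {
        int pirq = toy_get_free_pirq();
        if (pirq < 0)
            continue;                   /* pool exhausted */
        toy_map_pirq(pirq, 100 + pirq);
        toy_unmap_pirq(pirq, release);
        ok++;
    }
    return ok;
}
```

In this model, ten cycles without release succeed only as many times as
there are slots, while ten cycles with release all succeed.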

One patch I posted (for Linux) keeps track of the PIRQ so that if
free_irq is called and we remove the Linux IRQ <-> PIRQ association,
we still have the PIRQ saved away and can re-use it.

In other words, with that patch the loop ends up doing only:
PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq
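
The idea of keeping the PIRQ around can be sketched roughly as below. This
is not the posted patch and not drivers/xen/events.c code: struct drv_dev,
the drv_* functions, DRV_NO_PIRQ and the starting pirq number are all
hypothetical, and the hypercalls are replaced by a counter so the effect is
visible.

```c
#include <assert.h>

#define DRV_NO_PIRQ (-1)            /* hypothetical "nothing cached yet" marker */

static int drv_get_free_calls;      /* how often the toy get_free_pirq ran */
static int drv_next_pirq = 16;      /* arbitrary first number for the toy */

/* Toy stand-in for PHYSDEVOP_get_free_pirq: always hands out a new number. */
static int drv_get_free_pirq(void)
{
    drv_get_free_calls++;
    return drv_next_pirq++;
}

struct drv_dev {
    int cached_pirq;   /* kept across free_irq so it can be re-used */
    int irq_active;    /* nonzero while the Linux IRQ <-> PIRQ link exists */
};

/* Driver load: allocate a pirq only if none was saved from an earlier load. */
static int drv_request_irq(struct drv_dev *dev)
{
    if (dev->cached_pirq == DRV_NO_PIRQ)
        dev->cached_pirq = drv_get_free_pirq();
    dev->irq_active = 1;            /* the map_pirq step would go here */
    return dev->cached_pirq;
}

/* Driver unload: drop the association but keep the pirq saved away. */
static void drv_free_irq(struct drv_dev *dev)
{
    dev->irq_active = 0;            /* the unmap_pirq step would go here */
}
```

Calling drv_request_irq, drv_free_irq and drv_request_irq again on the same
device returns the same pirq both times and bumps drv_get_free_calls once.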

> >> I see none at all, unmap_domain_pirq() has a <= 0 check, and
> >> unmap_domain_pirq_emuirq() again doesn't appear to have any.
> >
> > The 'unmap_domain_pirq' path would be if dom0 (so QEMU) did the
> > unmap for the guest. That is via the PHYSDEVOP_unmap_pirq. And
> > I think if that path was taken (as Stefano suggests QEMU should
> > do when a MSI or MSI-X driver is unloaded and zero is written as
> > a PIRQ), we would end up calling clear_domain_irq_pirq, which
> > would set arch.pirq = 0.
> >
> > Or to a negative value as you pointed out later. Which then
> > means we won't be ever able to re-use the PIRQ (as
> > PHYSDEVOP_get_free_pirq or rather get_free_pirq would skip over it
> > as arch.pirq != 0).
> That setting to a negative value is not causing the slot to be
> permanently lost, it merely defers its freeing. It was the only
> way I could find back then to reasonably handle an unmap
> being done before the matched unbind.

Ah, so pt_irq_destroy_bind (XEN_DOMCTL_unbind_pt_irq) is the counterpart
to PHYSDEVOP_get_free_pirq in some form. On the QEMU side it looks to be
called only when the PCI device is put to sleep or pulled out of the guest.

It probably shouldn't be called when the device is merely de-activated.
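
For my own understanding, the deferred freeing can be modelled like this.
It is a toy state machine, not the hypervisor implementation: struct slot,
the slot_* helpers and the encoding (0 free, positive mapped, negative
"unmapped, free pending") are assumptions that only mirror the description
of the negative value above.

```c
#include <assert.h>

struct slot {
    int state;   /* 0 = free, >0 = mapped IRQ, <0 = unmapped, freeing deferred */
    int bound;   /* nonzero while a binding to the guest still exists */
};

/* Map an IRQ into the slot and bind it to the guest in one step. */
static void slot_map_and_bind(struct slot *s, int irq)
{
    s->state = irq;
    s->bound = 1;
}

/* Unmap: free right away if unbound, otherwise only mark it as pending. */
static void slot_unmap(struct slot *s)
{
    if (s->state <= 0)
        return;                     /* nothing mapped, nothing to do */
    if (s->bound)
        s->state = -s->state;       /* defer the free until the unbind */
    else
        s->state = 0;
}

/* Unbind: drop the binding and complete a free that unmap had deferred. */
static void slot_unbind(struct slot *s)
{
    s->bound = 0;
    if (s->state < 0)
        s->state = 0;
}

static int slot_is_free(const struct slot *s)
{
    return s->state == 0;
}
```

With this model, an unmap that arrives before the unbind leaves the slot
busy until the unbind happens, and the opposite order frees it at the unmap.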
